New SMP build

  • New SMP build

    Hi all

    I'm in the process of replacing my Dual SMP AMD MPX system at the moment. I plan on staying with SMP, but with multi-core now, of course!

    My plan is to build a dual Quad Xeon system with either 4 or 8GB of RAM. I run 50/50 Win XP Pro and a custom Linux distro based loosely on LFS. As the days go by I spend more time with Linux than I do with Windows.

    I have the following on my shopping list so far:

    Tyan S5396A2NRF i5400, S771 x 2, PCI-E (x16), DDR2 ECC 533/667 MHz, SATA II, SATA RAID, E-ATX/SSI
    Intel Xeon E5420A Quad Core, S771, Harpertown Core, 2.5GHz, FSB 1333MHz, 12MB Cache
    Crucial CT2KIT25672AF667 4GB kit (2GBx2), 240-pin FB-DIMM DDR2 PC2-5300
    Gainward Bliss 8800 GT 1024MB GS NVIDIA 8800 GT, 1024MB TV, DVI-DVI GS PCI-E 2.0, Mem 1900MHz GDDR3, GPU 650MHz
    Enermax EG1000EWLDXX 1000W V2 Galaxy Modular PSU 85% Efficiency EPS12v Triple Quad +24 Rails Silent x2 Fan
    Silverstone SST-TJ10S Temjin Aluminium Tower Chassis in silver, RoHS


    I'd be pleased to read your comments, good and bad. I'm particularly interested in anything glaringly obvious that I've missed. For example, it's only through reading the Phoronix reviews that I realised I had to get an SSI-compatible case to support the weighty Xeon heatsinks (thank you Phoronix!).

  • #2
    Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…
    Last edited by apaige; 05-16-2008, 05:54 PM.


    • #3
      Originally posted by apaige
      Don't Barcelona-class AMD processors scale better, as far as SMP goes? They have more memory bandwidth, too. My Phenom's pbzip2 results are much better than those of Intel CPUs. The latter fare better in other areas, notably where L2 cache size matters, but since you're building a dual-quad rig…
      Those are some rather interesting results...
      http://global.phoronix-test-suite.co...636-9602-14335
      http://global.phoronix-test-suite.co...51-11474-15615
      http://global.phoronix-test-suite.co...383-27337-1173


      • #4
        Same test (pts 0.5.0), Phenom 9600 (2.3GHz), 4 cores, gcc 4.3.0:
        Code:
        Parallel BZIP2 v1.0.2 - by: Jeff Gilchrist [http://compression.ca]
        [July 25, 2007]             (uses libbzip2 by Julian Seward)
        
                 # CPUs: 4
         BWT Block Size: 500k
        File Block Size: 900k
        -------------------------------------------
                 File #: 1 of 1
             Input Name: bigfile
            Output Name: bigfile.bz2
        
             Input Size: 691505952 bytes
        Compressing data...
            Output Size: 425971060 bytes
        -------------------------------------------
        
             Wall Clock: 38.688440 seconds
        38.7s versus 57s for kte's Phenom 9850 @2.7GHz and 69.8s for khurios' Phenom 9500 @2.2GHz. Maybe his hard drive was the bottleneck, maybe his RAM was set in ganged mode, maybe the TLB bug patch was enabled, I don't know. GCC 4.3.x also provides performance gains with recent processors such as the Phenom and the Core 2 Duo/Quad CPUs.
        But while 20.6s for your C2Q @3.2GHz is nothing to sneeze at, I've seen very different results for more comparable CPUs such as the Q6600. It's hard to find comparable data because the benchmark file changed so often - it'll be easier once pts 1.0 comes out with a definitive file.
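To put the numbers quoted above side by side, here is a minimal Python sketch. It derives throughput and compression ratio from the pbzip2 log, then applies a crude clock normalisation (GHz × seconds) to the four wall-clock times mentioned in this post. The function names are made up for illustration, and the normalisation assumes every run compressed the same file, which may not hold given that the benchmark file changed between pts versions.

```python
# Figures copied from the pbzip2 log in this post.
INPUT_BYTES = 691_505_952
OUTPUT_BYTES = 425_971_060
WALL_CLOCK_S = 38.688440


def throughput_mib_s(n_bytes: int, seconds: float) -> float:
    """Input bytes processed per second of wall-clock time, in MiB/s."""
    return n_bytes / seconds / (1024 * 1024)


def compression_ratio(out_bytes: int, in_bytes: int) -> float:
    """Compressed size as a fraction of the original size."""
    return out_bytes / in_bytes


def ghz_seconds(clock_ghz: float, seconds: float) -> float:
    """Crude clock-normalised cost: lower means more work done per clock."""
    return clock_ghz * seconds


# (label, clock in GHz, wall-clock seconds) as reported in the thread.
# ASSUMPTION: all four runs used the same input file.
RUNS = [
    ("Phenom 9600", 2.3, 38.7),
    ("Phenom 9850", 2.7, 57.0),
    ("Phenom 9500", 2.2, 69.8),
    ("C2Q", 3.2, 20.6),
]

if __name__ == "__main__":
    print(f"throughput: {throughput_mib_s(INPUT_BYTES, WALL_CLOCK_S):.1f} MiB/s")
    print(f"ratio:      {compression_ratio(OUTPUT_BYTES, INPUT_BYTES):.3f}")
    for label, ghz, secs in RUNS:
        print(f"{label:12s} {ghz_seconds(ghz, secs):6.1f} GHz*s")
```

Clock normalisation ignores memory bandwidth, cache size and disk speed, so it is only a first sanity check on whether two results are in the same ballpark.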

        Anyway, I don't have any experience with SMP systems with more than 4 cores. I just read that while AMD has had very little success on the consumer PC front, it currently has an edge in the server and HPC markets. http://www.anandtech.com/weblog/showpost.aspx?i=443
        I guess what's important is to clearly identify the target usage, and determine which offering would perform better in the relevant scenarios. The OP hasn't stated what those would be.


        • #5
          AMD does have more bandwidth available, yes.


          • #6
            Once more, there are very odd results in the PTS database.
            PTS 0.7.0 (latest), multicore benchmarks: Phenom 9850 @2.50GHz vs. Core 2 Quad Q6600 @3.38GHz. The much higher-clocked Intel CPU is slower than the Phenom in ALL benchmarks. That's got to be wrong. And while the p7zip bench result is only mildly lower, the other ones are MUCH lower. There's gotta be a bottleneck somewhere (the hard drive perhaps?). EDIT: then again, the p7zip benchmark doesn't involve the hard drive at all.

            Even the Phenom results are somewhat surprising. My lower-clocked Phenom (200MHz per core slower) scores about 6000 (vs. 4653) with the benchmark settings (i.e. p7zip compiled without optimizations), while the OpenSSL result somewhat matches my expectations (162 on mine vs. 167). With optimizations (gcc 4.3.0, -O3 -march=amdfam10 and the special amd64 makefile present in the package), p7zip scores 6891. Which brings me to another issue I'd like to mention: the lack of optimization in pts builds (but I guess that's for another thread).
            Last edited by apaige; 05-17-2008, 09:53 AM.


            • #7
              Many benchmarks use tempfiles. That's one of the biggest drawbacks when you compare CPU speed.


              • #8
                Many of those could be modified to output to /dev/null (that's the case now for the audio encoding profiles). Still, how do you explain the p7zip discrepancies? That benchmark doesn't use tempfiles, as far as I can tell.
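As a rough illustration of the /dev/null idea, the sketch below times a bzip2 compression in Python while discarding the output, so the destination disk does not influence the measurement. This is not how the PTS profiles are implemented. It uses only the standard library (`bz2`, `os.devnull`), and `compress_to` is a hypothetical helper name.

```python
import bz2
import os
import time


def compress_to(path: str, data: bytes, chunk: int = 1 << 20) -> tuple[int, float]:
    """Compress data with bzip2, write it to path, return (bytes written, seconds)."""
    compressor = bz2.BZ2Compressor(9)
    written = 0
    start = time.perf_counter()
    with open(path, "wb") as sink:
        # Feed the input in chunks, as a file-based benchmark would.
        for offset in range(0, len(data), chunk):
            out = compressor.compress(data[offset:offset + chunk])
            written += len(out)
            sink.write(out)
        # Flush whatever the compressor is still buffering.
        tail = compressor.flush()
        written += len(tail)
        sink.write(tail)
    return written, time.perf_counter() - start


if __name__ == "__main__":
    payload = os.urandom(1 << 20) + b"phoronix" * 500_000
    size, seconds = compress_to(os.devnull, payload)
    print(f"{size} compressed bytes discarded in {seconds:.2f}s")
```

The input is still read from memory or disk, so this removes only the write side of the I/O from the timing.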


                • #9
                  p7zip is not a good multicore benchmark; a dual-core E6600 @ 3.2GHz already reaches 3800 MIPS. Maybe the C2Q got too hot and throttled down.


                  • #10
                    There are a few things I notice about the p7zip benchmark that can dramatically affect its performance:

                    - scheduler choice
                    - optimized builds
                    - TLB workarounds for Phenoms
                    - assembly or non-assembly compiles
                    - motherboard chipsets

                    Also, the Phenom system in the link posted by apaige is using an RT kernel.
                    Last edited by deanjo; 05-17-2008, 12:45 PM.


                    • #11
                      That's all well and good, but the OP is considering building a dual quad-core system, and since Intel has its FSB and AMD has HT, HT wins hands down as far as bandwidth is concerned. That advantage could change depending on the intended workload, but until Intel releases Nehalem with QuickPath, they can't compete on multiprocessor systems where lots of data needs to be shared.


                      • #12
                        So here is a nice benchmark with an OC Q6600:

                        http://global.phoronix-test-suite.co...63-22205-28735

                        Compared against Phenom - this time the values are more logical...


                        • #13
                          Unfortunately, none of the PTS tests are of much relevance when we start talking about server use that would exploit SMP systems and show the bottlenecks that the FSB has. Start getting some VM / SQL / Apache and similar tests in there, and then you could potentially start seeing the difference.


                          • #14
                            Thanks for the many replies.

                            Hmm, I'm wondering now if I should hold out for Nehalem - according to this Wiki http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture) it's as big a change of architecture as the PPro was. If that's true, it's very significant.

                            I'd totally forgotten that Intel still doesn't use point to point communication - yet.

                            I may have to go back and investigate an AMD option based on the comments here


                            • #15
                              Originally posted by fluffy_bunny
                              Thanks for the many replies.

                              Hmm, I'm wondering now if I should hold out for Nehalem - according to this Wiki http://en.wikipedia.org/wiki/Nehalem_(CPU_architecture) it's as big a change of architecture as the PPro was. If that's true, it's very significant.

                              I'd totally forgotten that Intel still doesn't use point to point communication - yet.

                              I may have to go back and investigate an AMD option based on the comments here
                              Until AMD has its answer to Nehalem, you can be guaranteed that Intel will price those at insane dollars.
