quad channel memory not working as quad channel

  • quad channel memory not working as quad channel

    I hope someone can help me with the following:
    I assembled a machine with the following parts:

    ASUS KGPE-D16 dual-socket 1944 (G34) motherboard
    One 8-core Opteron 6128 CPU, 2.0GHz. It runs at 800MHz when not loaded because of power-saving features.
    8 sticks of DDR3, 1333MHz, quad channel, non-ECC, non-registered memory. Brand is GeIL.

    The problem is that the benchmarks of this machine are not very good. I think the memory performance is the problem. I think that the quad-channel memory is actually functioning as single channel only.
    I calculate the theoretical speed of this memory as follows:

    4 channels * 64 bits * 1333MHz = 341.248 Gbit/s = 42.656 GB/s

    Is this a correct calculation?
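
The arithmetic above can be checked with a short sketch. The function name and parameters are illustrative and not taken from any tool mentioned in this thread. It assumes a 64-bit data path per channel, treats the "1333MHz" rating as 1333 million transfers per second, and uses decimal units (1 GB = 10^9 bytes).

```python
def peak_bandwidth_gb_s(channels, bus_width_bits, mega_transfers_per_s):
    """Theoretical peak memory bandwidth in GB/s (decimal units)."""
    # bits moved per second across all channels
    bits_per_second = channels * bus_width_bits * mega_transfers_per_s * 1e6
    # bits -> bytes -> gigabytes
    return bits_per_second / 8 / 1e9

quad = peak_bandwidth_gb_s(4, 64, 1333)
single = peak_bandwidth_gb_s(1, 64, 1333)
print(f"quad-channel peak:   {quad:.3f} GB/s")
print(f"single-channel peak: {single:.3f} GB/s")
```

Under those assumptions the sketch reproduces the 42.656 GB/s figure for four channels and gives 10.664 GB/s for one. Both are upper bounds; a real benchmark will land below them.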

    I only get about 1/8 of that number: 5100 MB/s or so. I ran the phoronix pts/ramspeed benchmark; results follow below the line.
    So what is going on here? Am I maybe seeing the speed _per core_ only, so I should actually multiply by 8?
    I don't think it works that way but it would explain the factor of 8.
    Or is it that the BIOS just does not recognize this brand of GeIL memory as quad channel?
    Then it should still give me 1/4 of 42.656GB/s = 10.664GB/s. Or would that be because the CPUs clock themselves down to 800 MHz, thereby cutting memory speed to less than half as well? Again I don't know if it works that way.
    Should I have bought ECC, registered memory for this server/workstation board? I wanted lots of memory and speed was critical but integrity not _that_ much to justify the significantly higher cost...

    If someone can give me some guidance here please, I would be most obliged...

    ----------------------------------------------------------------------
    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Copy - Benchmark: Integer]
    Test 1 of 10
    Expected Trial Run Count: 1
    Started Run 1 @ 08:46:48

    Test Results:
    4517.8

    Average: 4517.80 MB/s


    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Copy - Benchmark: Floating Point]
    Test 2 of 10
    Estimated Time Remaining: 59 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 08:53:31

    Test Results:
    4353.48

    Average: 4353.48 MB/s


    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Scale - Benchmark: Integer]
    Test 3 of 10
    Estimated Time Remaining: 53 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:00:13

    Test Results:
    5222.69

    Average: 5222.69 MB/s

    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Scale - Benchmark: Floating Point]
    Test 4 of 10
    Estimated Time Remaining: 44 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:05:56

    Test Results:
    4836.58

    Average: 4836.58 MB/s

    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Add - Benchmark: Integer]
    Test 5 of 10
    Estimated Time Remaining: 37 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:12:04

    Test Results:
    5337.97

    Average: 5337.97 MB/s


    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Add - Benchmark: Floating Point]
    Test 6 of 10
    Estimated Time Remaining: 31 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:17:46

    Test Results:
    5595.58

    Average: 5595.58 MB/s

    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Triad - Benchmark: Integer]
    Test 7 of 10
    Estimated Time Remaining: 24 Minutes
    Estimated Test Run-Time: 6 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:23:18

    Test Results:
    5161.8

    Average: 5161.80 MB/s


    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Triad - Benchmark: Floating Point]
    Test 8 of 10
    Estimated Time Remaining: 18 Minutes
    Estimated Test Run-Time: 6 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:29:06

    Test Results:
    4475.21

    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Average - Benchmark: Integer]
    Test 9 of 10
    Estimated Time Remaining: 13 Minutes
    Estimated Test Run-Time: 7 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:35:52

    Test Results:
    5235.23

    Average: 5235.23 MB/s

    RAMspeed SMP:
    pts/ramspeed-1.4.0 [Type: Average - Benchmark: Floating Point]
    Test 10 of 10
    Estimated Time Remaining: 6 Minutes
    Expected Trial Run Count: 1
    Started Run 1 @ 09:41:40

    Test Results:
    4905.06

    Average: 4905.06 MB/s
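
For comparison, here is a minimal sketch that sets the eight individual results above against the theoretical figures from the opening post. The two "Average" runs are left out because they are themselves summaries. The dictionary keys and variable names are made up for illustration, and the 42.656 GB/s peak is the poster's own estimate, not a measured or vendor-confirmed value.

```python
from statistics import mean

# MB/s, copied from the pts/ramspeed output above (tests 1-8).
results_mb_s = {
    "copy_int": 4517.80, "copy_fp": 4353.48,
    "scale_int": 5222.69, "scale_fp": 4836.58,
    "add_int": 5337.97, "add_fp": 5595.58,
    "triad_int": 5161.80, "triad_fp": 4475.21,
}

QUAD_PEAK_MB_S = 42656.0               # poster's 4-channel estimate
SINGLE_PEAK_MB_S = QUAD_PEAK_MB_S / 4  # one channel's share of that estimate

measured = mean(results_mb_s.values())
share_of_quad = measured / QUAD_PEAK_MB_S
share_of_single = measured / SINGLE_PEAK_MB_S

print(f"mean of tests 1-8:       {measured:.1f} MB/s")
print(f"share of 4-channel peak: {share_of_quad:.1%}")
print(f"share of 1-channel peak: {share_of_single:.1%}")
```

The mean comes out a little under 5000 MB/s, which is a bit under 12 percent of the four-channel estimate and under half of what a single channel could deliver in theory.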

  • #2
    Originally posted by perpetualrabbit View Post
    One 8-core Opteron 6128 CPU, 2.0GHz.
    I believe you only get quad-channel when both CPU sockets are populated.



    • #3
      Originally posted by deanjo View Post
      I believe you only get quad-channel when both CPU sockets are populated.
      Aha, thanks. Can you point me to some site or documentation on this? Or explain the logic behind your thinking? Something to do with NUMA perhaps?
      I read the motherboard manual front to back and back again but I could not find anything about it.
      I have the second CPU lying around, but the cooler is too tall; it does not fit under the hard disk rack above it. So I'll have to take a hacksaw to my case to make it fit somehow. That's why I have not gotten around to installing the second CPU and the second 16GB yet.



      • #4
        My impression was that the 6128 brought all 4 memory channels (2 from each die) out to the socket so as long as you had memory hooked up to all 4 channels you would get 4-channel performance. Haven't looked really closely at it though...

        I guess the obvious question is whether your mobo has a separate set of memory sockets for the second CPU. I expect that it would, but...

        EDIT - yeah, looks like your mobo has 8 sockets per CPU so as long as you have the RAM in the right sockets (and maybe the right BIOS setup?) you should get 4 channels. I think. If you had memory in the wrong sockets then I guess you wouldn't see all the memory.
        Last edited by bridgman; 25 August 2011, 09:57 AM.


        • #5
          Please post a proper, complete technical spec for your system (BIOS revision, RAM timings, RAM model number).

          Also run the latest Memtest86+ (http://www.memtest.org/) and report the results.

          IIRC there is no such thing as AMD 'Quad' Channel RAM access.

          The memory controller is DUAL CHANNEL ONLY.

          However both CPU cores could combine their NUMA HyperTransport channels when reading - I'd doubt the RAM is paged out in this fashion.

          It's dual channel with 4 CPUs basically.



          • #6
            Hey guys,

            Has anyone found a solution to this by now?
            I am just asking because I have the same problem with a different board. I am running a Supermicro H8SGL (single socket) with an Opteron 6128 and eight G.Skill PC3-10666 modules (4GB each, unregistered, no ECC). With a quad-channel memory link, I should get 10666 MB/s*4 = approx. 40GB/s of bandwidth, am I right? However, using Ramspeed/SMP 3.5.0, I get an average performance of 16378.76 MB/s with the Intmem test. Using SiSoftware Sandra on Win7, I get something around 22GB/s, which is the bandwidth I would expect from a dual-channel setup. Supermicro support finally answered after one month without response, but there is still no solution.
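
One way to read those two measurements is to express each as a number of "channels' worth" of bandwidth. This is only a rough sketch: it takes the PC3-10666 rating as a per-channel peak of 10666 MB/s, assumes decimal units, uses 22000 MB/s for the "around 22GB/s" Sandra figure, and the helper name is invented for illustration.

```python
PER_CHANNEL_PEAK_MB_S = 10666.0  # PC3-10666 rating, taken as one channel's peak

def equivalent_channels(measured_mb_s, per_channel=PER_CHANNEL_PEAK_MB_S):
    """How many fully saturated channels the measurement corresponds to."""
    return measured_mb_s / per_channel

four_channel_peak = 4 * PER_CHANNEL_PEAK_MB_S
ramspeed = equivalent_channels(16378.76)  # Ramspeed/SMP Intmem average
sandra = equivalent_channels(22000.0)     # approximate Sandra result

print(f"4-channel peak: {four_channel_peak / 1000:.1f} GB/s")
print(f"Ramspeed/SMP:   {ramspeed:.2f} channels")
print(f"Sandra:         {sandra:.2f} channels")
```

With these inputs the four-channel product is closer to 42.7 GB/s than 40 GB/s. The Sandra figure works out to slightly more than two channels' theoretical peak, and the Ramspeed figure to about one and a half.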

            What bridgman pointed out sounds logical, but can you use the second block of memory slots if there is no second CPU installed?

            @perpetualrabbit: I also recommend checking your BIOS settings again. I turned on Channel Interleaving, Bank Interleaving and Bank Swizzle, while I left Node Interleaving turned off.
            Moreover, it looks like you are either using the single-core version of ramspeed (not the SMP version) or forgot the -p option, which specifies the number of processes to spawn. I get the same results if I use "-p 1".

            @Melchior: Do you mean that the RAM itself has to support "double dual-channel"?
