Intel Continues Optimizing Linux Memory Placement For Optane DC Persistent Memory

  • #11
    Originally posted by atomsymbol
    Yes, but if someone was serious about the distinction between bandwidth and latency he/she could devise a programming language with a type system that can distinguish between the two and thus can allocate either high-bandwidth memory or low-latency memory from the operating system. Maybe this will ultimately happen to programming languages as a natural/necessary step as technology evolves over time.
    Afaik this is already dealt with by kernel schedulers, which migrate memory pages to the node closest to the threads that use them. Also, isn't this just NUMA-aware programming? Doesn't that exist already?

    Think of multi-CPU servers where each CPU has direct access to only half, or even a quarter, of the total server RAM. Threads running on a given CPU very much want their memory allocated in that CPU's own RAM banks, so they can use the local memory controller directly instead of sending every read and write to another CPU over the interconnect bus.
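
    If anyone wants to see what explicit NUMA-aware allocation looks like from user space, here is a minimal sketch using libnuma (link with -lnuma); the node number and buffer size are just example values I picked. It binds the allocation to a chosen node instead of letting the kernel decide:

    #include <numa.h>      /* libnuma: numa_available(), numa_alloc_onnode(), numa_free() */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        if (numa_available() < 0) {
            fprintf(stderr, "no NUMA support on this system\n");
            return 1;
        }

        size_t size = 64UL * 1024 * 1024;   /* 64 MiB example buffer */
        int node = 0;                       /* example: allocate on node 0 */

        /* The memory is bound to the RAM banks attached to 'node', so threads
         * pinned to that CPU never have to cross the interconnect to reach it. */
        void *buf = numa_alloc_onnode(size, node);
        if (!buf)
            return 1;

        memset(buf, 0, size);               /* touch the pages so they actually get placed */
        printf("64 MiB allocated on NUMA node %d (max node: %d)\n",
               node, numa_max_node());

        numa_free(buf, size);
        return 0;
    }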

    Comment


    • #12
      Originally posted by starshipeleven View Post
      For the same reason that a Pentium D running at 3.6 GHz (it's a dual-core) does not have anywhere near the same performance as a random modern dual-core CPU with the same clock speed (probably still called Pentium, I guess).

      Hz is Hertz; it measures frequency, i.e. how many times "something" happens per second. Frequency alone is meaningless, because it doesn't tell you how much work gets done each time that "something" happens.

      The modern CPU in the example can process MUCH more information per cycle than the Pentium D. Therefore it is faster in practice.

      RAM frequency says how many times RAM chips do a full cycle per second, but the actual speed is how much information they can move around each cycle, multiplied by the cycles per second.

      I don't feel like looking up the actual numbers, but in practice the bandwidth (GB/s) that RAM can provide is still nowhere near the speed of an on-die high-speed SRAM cache sitting a few microns from the CPU core, or (in the case of eDRAM, i.e. cache on a separate die) even a few millimeters away.

      There is also latency to take into account. Calling up information from a chip that is electrically a few centimeters away from the core incurs a longer wait for the request to reach the chip, be processed and be sent back to the CPU than the same round trip to a much closer cache. At the speeds we are talking about, even a few centimeters matter. Notice how RAM modules are always placed as close as possible to the CPU.

      Plus, the cache inside the CPU uses a different memory technology, SRAM https://en.wikipedia.org/wiki/Static...-access_memory , which is much faster than the DRAM used in RAM modules simply because it is a different kind of electrical circuit.
      Thanks, Starshipeleven!
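
      To see the latency part with your own eyes, here is a rough toy benchmark I threw together (the buffer sizes and the "this one fits in cache" assumption are mine, nothing scientific). It pointer-chases through a buffer small enough to stay in cache and then through one that has to come from DRAM:

      #include <stdio.h>
      #include <stdlib.h>
      #include <time.h>

      /* Walk a single-cycle permutation so every load depends on the previous
       * one and the prefetcher can't hide the latency. */
      static double chase_ns(size_t n_elems, size_t steps)
      {
          size_t *next = malloc(n_elems * sizeof *next);
          if (!next)
              return -1.0;
          for (size_t i = 0; i < n_elems; i++)
              next[i] = i;
          for (size_t i = n_elems - 1; i > 0; i--) {   /* Sattolo shuffle: one big cycle */
              size_t j = (size_t)rand() % i;
              size_t t = next[i]; next[i] = next[j]; next[j] = t;
          }

          struct timespec t0, t1;
          size_t idx = 0;
          clock_gettime(CLOCK_MONOTONIC, &t0);
          for (size_t i = 0; i < steps; i++)
              idx = next[idx];                         /* dependent load chain */
          clock_gettime(CLOCK_MONOTONIC, &t1);

          volatile size_t sink = idx; (void)sink;      /* keep the loop from being optimized away */
          free(next);
          return ((t1.tv_sec - t0.tv_sec) * 1e9 + (t1.tv_nsec - t0.tv_nsec)) / steps;
      }

      int main(void)
      {
          size_t steps = 20 * 1000 * 1000;
          /* ~32 KiB of indices fits in L1/L2; ~512 MiB has to come from DRAM. */
          printf("cache-resident: %.1f ns per load\n", chase_ns(4 * 1024, steps));
          printf("DRAM-resident:  %.1f ns per load\n", chase_ns(64 * 1024 * 1024, steps));
          return 0;
      }

      The second number should come out well over an order of magnitude worse than the first, which is exactly the latency gap described above.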

      Comment


      • #13
        Originally posted by hoohoo View Post
        DRAM currently can be clocked at nearly the same frequency (transfer rate, I suppose) as the CPU itself. By which I mean I am typing this on a PC with a 3800 MHz CPU and DDR4/3600 DRAM. The days of a double-pumped 100 MHz interface between the CPU and the chipset, and thus the DRAM, are long, long gone.
        first letter in dram means double-data-rate. i.e. 3800 is a marketing frequency, real frequency is 1900
        Originally posted by hoohoo View Post
        The path between registers and DRAM is shorter and faster now than it used to be. So why do we need even more levels of cache?
        dram has huge latency. recall timings, those are numbers of cycles (quick conversion to nanoseconds below)
        Originally posted by hoohoo View Post
        Optane qua persistent memory is great. As a replacement for flash SSD storage it is an absolute killer product and it cannot replace flash fast enough IMO*.
        flash is much cheaper than optane. i'm actually waiting for samsung flash ssd with real pcie4 controller to put it in system which already has optane and hard drives.
        Originally posted by hoohoo View Post
        Inserting it as a cache between DRAM and CPU seems like adding unneeded complexity.
        and optane is cheaper than dram. that's how caches work. i mean, it's cache but on other side - between ram and other storage
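
        To put numbers on the "timings are cycles" point, a quick back-of-the-envelope sketch (the DDR4-3600 CL16 module below is just an assumed example):

        #include <stdio.h>

        int main(void)
        {
            /* Assumed example module: DDR4-3600 with CL16. */
            double transfer_rate_mts = 3600.0;             /* MT/s: transfers per second (millions) */
            double io_clock_mhz = transfer_rate_mts / 2.0; /* DDR moves data twice per I/O clock -> 1800 MHz */
            double cas_cycles = 16.0;                      /* CL: cycles from column command to first data */

            double cycle_ns = 1000.0 / io_clock_mhz;       /* nanoseconds per clock cycle */
            double cas_ns = cas_cycles * cycle_ns;         /* CAS latency in wall-clock time */

            printf("I/O clock:   %.0f MHz\n", io_clock_mhz);
            printf("Cycle time:  %.3f ns\n", cycle_ns);
            printf("CAS latency: %.1f ns\n", cas_ns);
            /* ~8.9 ns just for CAS, before any row activation (tRCD) or precharge
             * (tRP) on a row miss; an L1 cache hit is on the order of 1 ns. */
            return 0;
        }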

        Comment


        • #14
          Originally posted by pal666 View Post
          first letter in dram means double-data-rate. i.e. 3800 is a marketing frequency, real frequency is 1900
          dram has huge latency. recall timings, those are numbers of cycles
          No:



          DDR stands for Double Data Rate, DRAM means something else.

          And "3800 is a marketing frequency, real frequency is 1900" is wrong, at least in the way you and several others have implied it's some sort of scam:



          An easy way of thinking about it is that, since more data is transferred per clock cycle than with SDR, the net effect is as if DDR were running at twice the clock speed, similar to how a SIMD instruction packs more data into a single operation than a regular scalar x86 instruction.
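
          To put numbers on that, a small sketch (I'm assuming a single channel of DDR4-3600 on the standard 64-bit bus; these are theoretical peaks, not measured figures):

          #include <stdio.h>

          int main(void)
          {
              /* Assumed example: one channel of DDR4-3600, standard 64-bit (8-byte) bus. */
              double io_clock_mhz = 1800.0;        /* the actual I/O bus clock */
              double transfers_per_clock = 2.0;    /* DDR: data on both rising and falling edges */
              double bus_width_bytes = 8.0;        /* 64-bit channel */

              double mega_transfers = io_clock_mhz * transfers_per_clock;        /* 3600 MT/s */
              double peak_gb_s = mega_transfers * 1e6 * bus_width_bytes / 1e9;   /* ~28.8 GB/s */

              printf("%.0f MHz x %.0f transfers/clock = %.0f MT/s\n",
                     io_clock_mhz, transfers_per_clock, mega_transfers);
              printf("theoretical peak per channel: %.1f GB/s\n", peak_gb_s);
              /* The "3600" counts transfers per second, not Hz: the clock really is
               * 1800 MHz, but twice the data moves per clock, so the data rate doubles. */
              return 0;
          }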

          Comment


          • #15
            Originally posted by Spooktra View Post
            DDR stands for Double Data Rate, DRAM means something else.
            sure. i probably shortened ddr dram to dram while writing it
            Originally posted by Spooktra View Post
            And "3800 is a marketing frequency, real frequency is 1900" is wrong, at least in the way you and several others have implied it's some sort of scam:

            well, it's not a frequency. i mean at all. it can transfer twice per cycle, but (since the comparison was vs cpus) modern cpus can do several instructions per cycle, yet nobody is smart enough to market them as 16 ghz. other examples are 1200 bullshit hz tvsets, or chinese wifi adding together the speeds of all antennas. and chinese socks adding together frequencies of all cores
            Last edited by pal666; 19 February 2020, 03:12 PM.

            Comment


            • #16
              Originally posted by pal666 View Post
              and chinese socks adding together frequencies of all cores
              Woah, even their socks are multi-core now, the Chinese are far more advanced than I realized.

              Comment


              • #17
                Originally posted by pal666 View Post
                ... and optane is cheaper than dram. that's how caches work. i mean, it's cache but on other side - between ram and other storage
                Thanks, man. I misunderstood the hierarchy. I thought they were putting the Optane between the CPU and RAM, rather than between RAM and other storage.
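
                For anyone curious what it looks like when software talks to the persistent memory directly (App Direct style) rather than through the transparent Memory Mode cache, here is a minimal sketch. The path /mnt/pmem/log is a made-up example and assumes a file on a filesystem mounted with -o dax on top of a pmem namespace; the point is that it gets mapped and accessed like RAM, with no block I/O in the data path:

                #include <fcntl.h>
                #include <stdio.h>
                #include <string.h>
                #include <sys/mman.h>
                #include <unistd.h>

                int main(void)
                {
                    /* Hypothetical file on a DAX-mounted filesystem backed by pmem. */
                    const char *path = "/mnt/pmem/log";
                    size_t size = 4096;

                    int fd = open(path, O_CREAT | O_RDWR, 0600);
                    if (fd < 0 || ftruncate(fd, (off_t)size) < 0) {
                        perror("open/ftruncate");
                        return 1;
                    }

                    /* Map the persistent memory straight into the address space. */
                    char *pmem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
                    if (pmem == MAP_FAILED) {
                        perror("mmap");
                        return 1;
                    }

                    strcpy(pmem, "hello, persistent world");   /* plain stores, no write() calls */
                    msync(pmem, size, MS_SYNC);                /* make the update durable */

                    munmap(pmem, size);
                    close(fd);
                    return 0;
                }

                Real pmem code usually flushes cache lines with something like PMDK's libpmem instead of calling msync, but the overall shape, loads and stores plus an explicit flush for durability, is the same.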

                Comment
