Radeon ROCm 4.0.1 Released For AMD Open-Source GPU Compute

  • Radeon ROCm 4.0.1 Released For AMD Open-Source GPU Compute

    Phoronix: Radeon ROCm 4.0.1 Released For AMD Open-Source GPU Compute

    Last month marked the release of the big Radeon Open eCosystem 4.0 update (ROCm 4.0) while today that has been replaced by a v4.0.1 point release...

    http://www.phoronix.com/scan.php?pag...eon-ROCm-4.0.1

  • #2
    Well, still waiting on Navi.

    I do wonder what AMD's approach to Threadripper will be. I'd love to see a dual-socket board, with one socket for traditional Threadripper cores and the other for an Arcturus-derived chip.



    • #3
      I wonder if they even care anymore.



      • #4
        I remember the GitHub thread where they said they are working on Navi support.



        • #5
          We're working from the bottom up - we just finished replacing the PAL back end for OpenCL with a ROCm back end in the 20.45 drivers and fixing a bunch of OpenCL bugs in the process. HIP is next, then libraries.

          Originally posted by wizard69
          I do wonder what AMD's approach to Threadripper will be. I'd love to see a dual-socket board, with one socket for traditional Threadripper cores and the other for an Arcturus-derived chip.
          Yep, I always liked that 4x4 / Terrazo initiative we talked about. The Arcturus/MI100 chip would probably have to be soldered down though... sockets for something with that many pins cost $10-20K and are about the size of a basketball
          Last edited by bridgman; 25 January 2021, 10:32 PM.



          • #6
            Originally posted by bridgman
            We're working from the bottom up - we just finished replacing the PAL back end for OpenCL with a ROCm back end in the 20.45 drivers and fixing a bunch of OpenCL bugs in the process. HIP is next, then libraries.
            Is it likely we could see Navi support this year?



            • #7
              Originally posted by bridgman

              Yep, I always liked that 4x4 / Terrazo initiative we talked about. The Arcturus/MI100 chip would probably have to be soldered down though... sockets for something with that many pins cost $10-20K and are about the size of a basketball
              I'm not sure the market would go for it, but a soldered-in chip isn’t all that bad of an idea. It isn’t like Threadrippers get sold to upgrade-crazy users. What I would hope for is a rendition of AMD's supercomputing solutions cut down for the desktop: air cooled and wall-socket powered, but otherwise “compatible”. Something that starts at $6000 would likely be a hot seller.



              • #8
                Originally posted by bridgman
                Yep, I always liked that 4x4 / Terrazo initiative we talked about. The Arcturus/MI100 chip would probably have to be soldered down though... sockets for something with that many pins cost $10-20K and are about the size of a basketball
                Is there no way to reduce the pin count? I remember when AMD went from the FSB (front-side bus) to "Hyperlink" (is this the right name?), and this resulted in a lower pin count.
                I think there are many ways to reduce the pin count; many have already happened and others have not been tried yet.
                For example, if you put RAM and SSD inside the CPU, the CPU may get bigger but the pin count goes down. I think a bigger CPU is not the main problem; the large number of pins is.
                I also think that if you put an Arcturus/MI100 chip into a CPU, then in theory the pin count should also go down.
                Yes, in total it would go up, but not compared to two separate chips.

                If AMD built a "CPU" with RAM+SSD+GPU+FPGA inside, it should reduce the total number of pins...

                Yes, right now this sounds futuristic, but AMD bought Xilinx, the FPGA maker, so this idea is not so crazy in the end.

                Parallax did the same with the Propeller: the 8-core Parallax Propeller chips use a very old package style with a very low pin count, and they solved it the same way, by putting RAM and SSD/flash on the chip.
                https://en.wikipedia.org/wiki/Parallax_Propeller
                This way the pin count goes down.

                Without these tricks it would be impossible to put an 8-core Parallax Propeller in a 40-pin design.
                Phantom circuit Sequence Reducer Dyslexia



                • #9
                  Number of pins isn't much of a problem these days, but certainly the number of long traces between chips is still a challenge. It's easy to make them reliable but you have to slow down the data rates and/or increase the power.

                  We would certainly use Infinity Fabric (GMI / XGMI) to connect the chips, and Infinity Fabric is essentially an enhanced HyperTransport protocol.



                  • #10
                    Originally posted by bridgman
                    Number of pins isn't much of a problem these days, but certainly the number of long traces between chips is still a challenge. It's easy to make them reliable but you have to slow down the data rates and/or increase the power.
                    We would certainly use Infinity Fabric (GMI / XGMI) to connect the chips, and Infinity Fabric is essentially an enhanced HyperTransport protocol.
                    Well yes, HyperTransport was the start and xGMI is today's standard.

                    To reduce the long traces between chips, CPU+RAM+SSD+GPU+FPGA would need to be placed very close together. Right now we only have APUs, which put CPU+GPU together, or in RDNA2 a GPU+Infinity Cache.

                    I think that, together with a water cooling solution, putting CPU+RAM+SSD+GPU+Infinity Cache+FPGA very close together on a mainboard could result in great performance potential.

                    Doing all of this (CPU+RAM+SSD+GPU+Infinity Cache+FPGA) at one time is maybe too much,

                    but CPU+GPU+HBM3 RAM+Infinity Cache+FPGA would be a good start.
                    Phantom circuit Sequence Reducer Dyslexia
