Rusticl OpenCL Driver Nearing Cross-Vendor Shared Virtual Memory Support

  • oiaohm
    replied
    Originally posted by ultimA View Post
    Indeed it does. I said in this case there's no reason to make it cross-vendor, not that there's no reason for SVM at all in this case.
    Cross-vendor does make sense; you do run into cases where it can be important. It's not like Intel, AMD and Nvidia share the same silicon design. There are workloads that run faster on an AMD GPU than on Intel or Nvidia, and so on.

    Yes, OpenCL performance depends on the OpenCL implementation and the limitations of the hardware. There will be particular cases where P2P memory transfer becomes a cost problem because of differences in operational speeds.

    ultimA, think of PRIME, where you render on one GPU and output on another: that is a P2P memory setup that avoids going through CPU memory. We have had DMABUF for this for a long time, so it makes sense to allow compute workloads running on different GPUs to share data the same way, in a GPU-neutral way.



  • coder
    replied
    Originally posted by ultimA View Post
    I said they are instead interested in being the most widespread and outperforming others. Are you saying Nvidia cares more about performance than dominating the market?
    This is getting off track, so all I'm going to say about this is that there are limitations to what they can do without upending the CUDA programming model and CUDA compatibility. Within those constraints, I think they build the fastest hardware & software systems they think they can get enough people to buy.

    For a different perspective on how Nvidia sees AI hardware, check out their purpose-built DLA engines. Also, those are inference-oriented, so that's another big difference.

    Originally posted by ultimA View Post
    So stop mentioning asses and maybe pay closer attention to what you are replying to.
    You take yourself too seriously. Maybe stop thinking/acting like you're the only one here who knows anything. If you stopped being so defensive, you might learn a thing or two.



  • ultimA
    replied
    Originally posted by coder View Post
    That's just hilarious. You should walk up to their engineers and try saying that. I'd recommend wearing an extra cushion on your ass, if you ever do.
    I said they are instead interested in being the most widespread and outperforming others. Are you saying Nvidia cares more about performance than dominating the market? And before you say it, no, one does not automatically imply the other. Also, no, this does not mean their GPUs don't need to be faster than competitors'. They ARE interested in outperforming them, and I wrote exactly that (I guess you missed that part), but outperforming others and being the fastest possible are not the same thing. So stop mentioning asses and maybe pay closer attention to what you are replying to.
    Last edited by ultimA; 06 January 2025, 05:32 AM.



  • coder
    replied
    Originally posted by ultimA View Post
    You are forgetting that Nvidia is not interested in providing the highest performance.
    That's just hilarious. You should walk up to their engineers and try saying that. I'd recommend wearing an extra cushion on your ass, if you ever do.



  • coder
    replied
    Originally posted by ultimA View Post
    Intel itself of course could contribute to open-source efforts, but there's no technical or commercial reason to support making it cross-vendor for a data-center-only solution.
    It's weird how you seem to be fixated on the cross-vendor aspect, when that was basically just a footnote and something that seemed to come for free. The main story here is Rusticl's support for SVM.
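    To make concrete what SVM buys an application, here is an untested host-side sketch of coarse-grained SVM as specified in OpenCL 2.0. It assumes a context, queue, and kernel already exist on a device that reports CL_DEVICE_SVM_COARSE_GRAIN_BUFFER; the point is that one allocation is visible to host and device at the same pointer, so there is no clEnqueueWriteBuffer copy step.

    ```c
    /* Untested sketch: coarse-grained SVM (OpenCL 2.0).
     * Assumes ctx/q/k were created elsewhere on an SVM-capable device. */
    #include <CL/cl.h>
    #include <string.h>

    void svm_example(cl_context ctx, cl_command_queue q, cl_kernel k)
    {
        const size_t n = 1024 * sizeof(float);

        /* One allocation, same address on host and device. */
        float *buf = (float *)clSVMAlloc(ctx, CL_MEM_READ_WRITE, n, 0);

        /* Coarse-grained SVM: map before touching it on the host. */
        clEnqueueSVMMap(q, CL_TRUE, CL_MAP_WRITE, buf, n, 0, NULL, NULL);
        memset(buf, 0, n);
        clEnqueueSVMUnmap(q, buf, 0, NULL, NULL);

        /* Hand the raw pointer to the kernel -- no explicit copy. */
        clSetKernelArgSVMPointer(k, 0, buf);
        size_t gws = 1024;
        clEnqueueNDRangeKernel(q, k, 1, NULL, &gws, NULL, 0, NULL, NULL);
        clFinish(q);

        clSVMFree(ctx, buf);
    }
    ```

    The same flow with regular cl_mem buffers would need a host staging copy in each direction; with SVM the map/unmap calls are the only synchronization points.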



  • ultimA
    replied
    Originally posted by coder View Post
    If that's your answer, then it was a bad decision by your logic. So, you're saying Nvidia made a mistake by enabling a shared memory model with NVLink because you'd rather the host try to orchestrate all the data movement in the system, instead of letting the nodes pull what they need from wherever they need it, on-the-fly.
    What? What I said is that Nvidia recognized that using these features would make their product uncompetitive performance-wise, which is my whole point. NVLink is needed to counter this somewhat: being lower-latency and higher-bandwidth wins back some of the drawbacks of doing automatic memory and transfer management. You are forgetting that Nvidia is not interested in providing the highest performance. The interest lies in outperforming others, or at least being competitive, and being the most widespread.
    Last edited by ultimA; 06 January 2025, 04:52 AM.



  • ultimA
    replied
    Originally posted by coder View Post
    I recall reading about someone getting SVM working on Nvidia GPUs without going through CUDA, but I don't remember the details. I wouldn't presume it's impossible for Rusticl to do, but perhaps it's either a lower priority or takes more effort, due to the state of their Mesa driver.

    Intel supports multi-GPU Ponte Vecchio configurations. Indeed, this is what they have in the Aurora supercomputer.
    No open-source dev is going to be interested in developing for a purely datacenter solution. Intel itself of course could contribute to open-source efforts, but there's no technical or commercial reason to support making it cross-vendor for a data-center-only solution.
    Last edited by ultimA; 06 January 2025, 04:53 AM.



  • coder
    replied
    Originally posted by ultimA View Post
    they did it to ease development on their platform,
    If that's your answer, then it was a bad decision by your logic. So, you're saying Nvidia made a mistake by enabling a shared memory model with NVLink because you'd rather the host try to orchestrate all the data movement in the system, instead of letting the nodes pull what they need from wherever they need it, on-the-fly.



  • ultimA
    replied
    Originally posted by coder View Post
    Making it cache-coherent is not a natural consequence of having a unified address space. You basically just dodged the question. They went out of their way to make it cache-coherent, which cannot be explained without acknowledging that they intend programmers to use it like shared memory.
    I didn't dodge the question: they did it to ease development on their platform, and they needed NVLink to also make it performant. This goes for cache coherency just as it goes for a unified address space.
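    For what the cache-coherency distinction means in API terms, here is an untested sketch using OpenCL's optional fine-grained SVM capability. Only when the device reports fine-grained buffers (and optionally SVM atomics) can host and device touch the same allocation concurrently with no map/unmap calls, which is only safe because the hardware keeps the caches coherent.

    ```c
    /* Untested sketch: probing for fine-grained, coherent SVM (OpenCL 2.0). */
    #include <CL/cl.h>
    #include <stddef.h>

    float *alloc_fine_grained(cl_context ctx, cl_device_id dev, size_t bytes)
    {
        cl_device_svm_capabilities caps = 0;
        clGetDeviceInfo(dev, CL_DEVICE_SVM_CAPABILITIES,
                        sizeof(caps), &caps, NULL);

        if (!(caps & CL_DEVICE_SVM_FINE_GRAIN_BUFFER))
            return NULL;  /* caller falls back to coarse-grained map/unmap */

        cl_svm_mem_flags flags = CL_MEM_READ_WRITE | CL_MEM_SVM_FINE_GRAIN_BUFFER;
        if (caps & CL_DEVICE_SVM_ATOMICS)
            flags |= CL_MEM_SVM_ATOMICS;  /* host/device atomics on same data */

        /* Host may read and write this pointer while a kernel runs;
         * without hardware cache coherence this model would be unsafe. */
        return (float *)clSVMAlloc(ctx, flags, bytes, 0);
    }
    ```

    Coarse-grained SVM gives the unified address space alone; the fine-grained tiers are where coherence enters the programming model, which is the analogous split on the CUDA/NVLink side.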

    Originally posted by coder View Post
    SVM abso-fucking-lutely makes sense, when you're sharing data between a CPU and iGPU!!
    Indeed it does. I said in this case there's no reason to make it cross-vendor, not that there's no reason for SVM at all in this case.



  • Jabberwocky
    replied
    Originally posted by ultimA View Post
    It makes developing with OpenCL easier, but actively using it comes at a great performance cost. So demand isn't that great, which is why other stacks prioritized other more useful features. But it makes a nice marketing headline "Hey, we are the first to implement this thing that most people do not want to use."
    Meanwhile, everyone running large-scale multi-model workloads on consumer hardware relies on this to prevent their batches from crashing.

    This even helped me figure out the limits of my cards when running monolithic models one at a time in chains. Offloading to RAM is faster than reloading the entire thing from disk.

    Recovering from running out of VRAM usually required manual intervention at best and a system restart at worst.

    The big difference here IMO is OpenCL vs CUDA, not SVM (Shared Virtual Memory) vs VM (Virtual Memory).

