PyTorch Foundation Formed By Meta, AMD, NVIDIA, & Others To Advance AI


  • coder
    replied
    Originally posted by brucethemoose View Post
    Yeah, the narrow RDNA2 bus (compared to Vega's very wide HBM2 bus) is part of it, and cache doesn't really compensate for that in ML.
    According to the Hot Chips presentations I've looked at, the trend in AI chips seems to be towards very large amounts of on-die SRAM (up to ~1 GB). Infinity Cache is absolutely in line with that concept, even if it currently tops out at just 128 MB.

    The reason it matters is that cutting-edge deep learning models are huge, with the consequence that you burn a lot more bandwidth reading the model weights than you do reading/writing the data propagating through it. So, the optimization people do is called "batching", where they read & apply a portion of the model to a batch of data samples (e.g. video frames or images), before fetching and applying the next part of the model.
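The batching idea above can be sketched in a few lines of plain Python. This is an illustrative toy, not any framework's API: the point is simply that each layer's weights are fetched once per batch rather than once per sample, so weight-read bandwidth is amortized across the batch.

```python
def apply(weights, x):
    # Stand-in for a real matrix multiply / convolution:
    # one output per weight row (dot product of row and input).
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]

def run_batched(layers, batch):
    """Apply each layer to every sample in the batch before
    fetching the next layer's weights."""
    activations = batch
    for weights in layers:  # weights read once per *batch*...
        # ...then reused for every sample in it:
        activations = [apply(weights, x) for x in activations]
    return activations
```

With a batch of N samples, the weights cross the memory bus once instead of N times, which is exactly why large on-die SRAM (or Infinity Cache) that can hold the working set pays off.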

    This optimization does work with Infinity Cache, to some degree, and would deliver a benefit since the Infinity Cache is connected through a higher-bandwidth link and costs less power to access. It's not as good as on-die SRAM, but still better than external DRAM.

    Originally posted by brucethemoose View Post
    Vega's desktop AI competition was Pascal, but RDNA2's theoretical ML performance compared to Ampere or even Turing is rather poor.
    For training, Vega was actually facing Volta, which had not only HBM2, but also Tensor cores. That was not an even match up. For inferencing, it was facing Pascal GPUs that had dp4a (dot product + accumulate of 4-element int8), which Vega also couldn't answer.
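For reference, the dp4a operation mentioned above can be modeled in Python. This is a sketch of the instruction's semantics (4-way int8 dot product with a 32-bit accumulator), not GPU code:

```python
def dp4a(a, b, acc):
    """Model of dp4a: dot product of two 4-element int8 vectors,
    added to a 32-bit integer accumulator."""
    assert len(a) == len(b) == 4, "dp4a operates on 4-element vectors"
    for v in (*a, *b):
        assert -128 <= v <= 127, "operands must fit in int8"
    return acc + sum(x * y for x, y in zip(a, b))
```

Doing four int8 multiply-accumulates per instruction is what gave Pascal its inferencing edge over Vega, which had no equivalent.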

    So, AMD was never truly competitive with Nvidia in AI. The CDNA chips are a different story, but they're still getting leap-frogged on AI and only truly competing at HPC.

    Leave a comment:


  • Jabberwocky
    replied
    Originally posted by Vistaus View Post

    Alphabet is confusing too as it reminds me of the alphabet. And Google is confusing too as it reminds me of googly eye toys. And Azure is confusing too as it reminds me of the Côte d'Azur. And Microsoft is confusing too as it sounds like a small (micro) company making software. And AWS is confusing too as it can mean many things, like American Welding Society (yes, that exists!) for example.
    funny guy

    Leave a comment:


  • brucethemoose
    replied
    Originally posted by coder View Post
    Why do you say that? Vega did introduce packed arithmetic, but RDNA also has it. I guess the biggest problem with RDNA is that it lacks the matrix extensions found in CDNA. You could also cite memory bandwidth, but Infinity Cache can be useful for that.
    Yeah, the narrow RDNA2 bus (compared to Vega's very wide HBM2 bus) is part of it, and cache doesn't really compensate for that in ML.

    But these architectures don't exist in a vacuum. Vega's desktop AI competition was Pascal, but RDNA2's theoretical ML performance compared to Ampere or even Turing is rather poor.


    AMD bifurcated their compute and gaming designs, and even if performance didn't technically regress, Nvidia continued to push desktop AI performance hard.
    Last edited by brucethemoose; 14 September 2022, 03:50 PM.

    Leave a comment:


  • Vistaus
    replied
    Originally posted by Jabberwocky View Post

    It's the other way around: Google (Alphabet, Inc.)

    People should use Meta/Facebook, Meta (Facebook), or Meta Platforms... as long as people just don't use "Meta" by itself. It's confusing by design.
    Alphabet is confusing too as it reminds me of the alphabet. And Google is confusing too as it reminds me of googly eye toys. And Azure is confusing too as it reminds me of the Côte d'Azur. And Microsoft is confusing too as it sounds like a small (micro) company making software. And AWS is confusing too as it can mean many things, like American Welding Society (yes, that exists!) for example.

    Leave a comment:


  • Jabberwocky
    replied
    Originally posted by Setif View Post

    You missed Alphabet (Google), AWS (Amazon) and Azure (Microsoft)
    It's the other way around: Google (Alphabet, Inc.)

    People should use Meta/Facebook, Meta (Facebook), or Meta Platforms... as long as people just don't use "Meta" by itself. It's confusing by design.

    Leave a comment:


  • ET3D
    replied
    The upcoming RDNA 3 is said to have matrix extensions, so it seems that AMD does intend its consumer products to run AI. Hopefully this and being part of PyTorch means that AMD is trying to be serious about AI.

    Still, based on its track record, I don't expect AMD to provide anything easy to use with wide support in the near future. I hope that AMD proves me wrong.

    Leave a comment:


  • coder
    replied
    Originally posted by brucethemoose View Post
    Users have been asking for AMD PyTorch support for ages... especially in the Vega days, where the hardware was relatively well suited to it (unlike the RX 6000+ series).
    Why do you say that? Vega did introduce packed arithmetic, but RDNA also has it. I guess the biggest problem with RDNA is that it lacks the matrix extensions found in CDNA. You could also cite memory bandwidth, but Infinity Cache can be useful for that.
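For anyone unfamiliar, "packed arithmetic" here means doing two fp16 operations per 32-bit register lane. A rough model of the idea in Python, using `struct`'s half-float format (purely illustrative, not how you'd actually drive the hardware):

```python
import struct

def pack_fp16_pair(a, b):
    """Pack two half-precision floats into one 32-bit word,
    as packed-math hardware stores them."""
    return struct.unpack("<I", struct.pack("<2e", a, b))[0]

def packed_add(x, y):
    """Lane-wise add of two packed fp16 pairs: two fp16 additions
    executed per 32-bit 'instruction'."""
    xa, xb = struct.unpack("<2e", struct.pack("<I", x))
    ya, yb = struct.unpack("<2e", struct.pack("<I", y))
    return pack_fp16_pair(xa + ya, xb + yb)
```

Doubling fp16 throughput this way is cheap to build, but it's no match for dedicated matrix units, which is the gap the CDNA matrix extensions address.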

    Leave a comment:


  • coder
    replied
    Originally posted by JellyBrain View Post
    Tensorflow is still very much alive (it got a new commit on GitHub 3 minutes ago).
    Yep, they are still kind of competing libraries.
    Since Google's TensorFlow couldn't kill off PyTorch, Google's interest is probably to try preventing their platforms and devices from being significantly disadvantaged by any developments in PyTorch.

    Leave a comment:


  • brucethemoose
    replied
    Users have been asking for AMD PyTorch support for ages... especially in the Vega days, where the hardware was relatively well suited to it (unlike the RX 6000+ series).

    I tried to get the ROCm PyTorch branch working locally, and eventually just gave up.

    Hence I don't believe AMD is focused on bringing PyTorch support to consumer hardware. Which is a mistake, as Nvidia quickly learned that students playing with CUDA on their laptops/desktops go on to work with CUDA as adults.
    Last edited by brucethemoose; 12 September 2022, 06:52 PM.

    Leave a comment:


  • JellyBrain
    replied
    Originally posted by peterdk View Post
    I am familiar with Tensorflow, and did hear about Pytorch. Are these related, or 'competitors'? Is Pytorch now becoming the standard? Or is still just one of the options?
    Tensorflow is still very much alive (it got a new commit on GitHub 3 minutes ago).
    Yep, they are still kind of competing libraries.

    And it seems this foundation will have more focus on deep learning in the cloud, but I am sure their upstream improvements will still help all of us.
    Last edited by JellyBrain; 12 September 2022, 03:44 PM.

    Leave a comment:
