Thread: AMD Releases New Radeon Code: A-Sync DMA Engines

  1. #1
    Join Date
    Jan 2007
    Posts
    15,133

    AMD Releases New Radeon Code: A-Sync DMA Engines

    Phoronix: AMD Releases New Radeon Code: A-Sync DMA Engines

    A second update to the Radeon DRM driver has been released that will be pulled into the Linux 3.8 kernel. This second Direct Rendering Manager update for the Radeon kernel driver provides new code from AMD that was kept internal for months but is now cleared for open-sourcing...

    http://www.phoronix.com/vr.php?view=MTI0ODE

  2. #2
    Join Date
    Oct 2011
    Location
    Rural Alberta, Canada
    Posts
    1,030

    Pardon my ignorance, but what does this do exactly? Still, it is good stuff in any event.

  3. #3
    Join Date
    Nov 2007
    Posts
    1,024

    Quote: Originally posted by Hamish Wilson
    Pardon my ignorance, but what does this do exactly? Still, it is good stuff in any event.
    A quick guess would be that they allow asynchronous DMA transfers.

    The practical consequence would be that a memory transfer to or from the GPU can be initiated without the driver needing to wait for it to complete, which reduces per-transfer latency and improves overall transfer bandwidth. That in turn should reduce draw call overhead, buffer update overhead, and texture upload overhead, and hence improve performance for certain applications. For example, given the "GL is faster than D3D" benchmarks from Valve, one can presume that their renderer is draw-call bound, so this could offer a nice performance boost for Source games.

    All a quick guess, of course.
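
    To make the "initiate without waiting" idea concrete, here is a minimal sketch in plain Python. It is only an analogy, not the Radeon driver code: the names copy_buffer, upload_sync and upload_async are made up for illustration, and a single worker thread stands in for the DMA engine while the returned futures play the role of fences.

```python
from concurrent.futures import ThreadPoolExecutor


def copy_buffer(src: bytes) -> bytearray:
    """Hypothetical stand-in for one DMA transfer: copy src into a new buffer."""
    return bytearray(src)


def upload_sync(buffers):
    """Synchronous path: each transfer completes before the next one starts."""
    return [copy_buffer(b) for b in buffers]


def upload_async(buffers, engine):
    """Asynchronous path: queue every transfer and return immediately.

    The returned futures act like fences; the caller waits on them only
    when it actually needs the data.
    """
    return [engine.submit(copy_buffer, b) for b in buffers]


if __name__ == "__main__":
    data = [bytes([i]) * 4 for i in range(3)]
    # One worker thread models a single copy engine (an assumption).
    with ThreadPoolExecutor(max_workers=1) as engine:
        fences = upload_async(data, engine)
        # The caller is free to do other work here before waiting.
        results = [f.result() for f in fences]
    # Both paths produce the same buffers; only the waiting differs.
    assert results == upload_sync(data)
```

    Both paths end up with the same data. The difference is only where the caller blocks, which is the part that would matter for per-transfer latency.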

  4. #4
    Join Date
    Sep 2010
    Posts
    701

    Quote: Originally posted by elanthis
    A quick guess would be that they allow asynchronous DMA transfers.

    The practical consequence would be that a memory transfer to or from the GPU can be initiated without the driver needing to wait for it to complete, which reduces per-transfer latency and improves overall transfer bandwidth. That in turn should reduce draw call overhead, buffer update overhead, and texture upload overhead, and hence improve performance for certain applications. For example, given the "GL is faster than D3D" benchmarks from Valve, one can presume that their renderer is draw-call bound, so this could offer a nice performance boost for Source games.

    All a quick guess, of course.
    Yeah. I would also like to know what this "a" stands for. I thought that DMA is as "a" as you can get :P Or maybe it's not about CPU/GPU chitchat but about the CPU DMA module talking to the GPU DMA module?

  5. #5
    Join Date
    Oct 2007
    Posts
    51

    Quote: Originally posted by przemoli
    Yeah. I would also like to know what this "a" stands for. I thought that DMA is as "a" as you can get :P Or maybe it's not about CPU/GPU chitchat but about the CPU DMA module talking to the GPU DMA module?
    Looking at the code, the "A" stands for asynchronous.

    Or in other words: they have a DMA hardware engine on the GPU. This engine can transfer memory in parallel with other work on the GPU. Hence they call it asynchronous. Logical, right? ;-)

  6. #6
    Join Date
    Feb 2008
    Location
    Linuxland
    Posts
    5,187

    Does this alone have any performance impact? I recall Marek saying here that TTM still does a sync after each transfer.

  7. #7
    Join Date
    May 2011
    Posts
    55

    WOW! Really clean and understandable code with many comments! Thanks to the whole AMD team (and especially to Alex).
    When I get home I'll apply this patch to 3.7 ^_^ Need to test the OilRush fps.

  8. #8
    Join Date
    Oct 2007
    Posts
    51

    Quote: Originally posted by curaga
    Does this alone have any performance impact? I recall Marek saying here that TTM still does a sync after each transfer.
    It should have a very noticeable impact when buffer objects are moved and when VM mappings are changed.
    The former case especially should see a rather big impact: the buffer move is performed by hardware, which is very fast at it. And if the buffer memory is not in coherent (cacheable) memory, then the move is even faster.

  9. #9
    Join Date
    Oct 2008
    Location
    Germany
    Posts
    74

    The async DMA can do copies/moves independently of the shader engine. So while the shader part of the GPU is busy with rendering, we can still upload new data with the DMA engine at the same time.

    In addition, it is quite a bit more efficient than the shader engine when you just want to copy some data from A to B, or just clear a specific region of memory (memcpy/memset).
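
    As a rough illustration of why the overlap helps, here is a small timing model in Python. It is only a sketch under stated assumptions: the per-frame copy and draw times are made-up numbers, serial_time and overlapped_time are hypothetical helper names, and the model assumes one copy engine and one shader engine where each draw only needs its own frame's upload to be finished.

```python
def serial_time(copy_ms, draw_ms):
    """One engine does everything: every copy and every draw runs back to back."""
    return sum(copy_ms) + sum(draw_ms)


def overlapped_time(copy_ms, draw_ms):
    """Separate copy engine: the upload for frame i+1 runs while frame i is drawn.

    A draw starts once the previous draw has finished and its own upload
    has completed, whichever happens later.
    """
    copy_done = 0.0
    draw_done = 0.0
    for c, d in zip(copy_ms, draw_ms):
        copy_done += c  # the copy engine processes uploads back to back
        draw_done = max(draw_done, copy_done) + d
    return draw_done


if __name__ == "__main__":
    copies = [2, 2, 2]  # made-up upload times per frame, in ms
    draws = [5, 5, 5]   # made-up render times per frame, in ms
    print(serial_time(copies, draws))      # 21
    print(overlapped_time(copies, draws))  # 17.0
```

    In this toy case only the first upload stays on the critical path; the later ones are hidden behind rendering. Real gains would depend on the workload and on how the driver and Mesa actually schedule the transfers.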

    So using it should result in some very nice performance improvements for certain use cases, but what Alex has released is just the kernel part of the implementation, and even that is missing the CS checker. It will probably take some more time until Mesa really picks that up.

    Cheers,
    Christian.

  10. #10
    Join Date
    Jul 2007
    Posts
    448

    But on the positive side...

    Quote: Originally posted by Deathsimple
    So using it should result in some very nice performance improvements for certain use cases, but what Alex has released is just the kernel part of the implementation, and even that is missing the CS checker. It will probably take some more time until Mesa really picks that up.
    This code is applicable to R600+ hardware! It's nice to see improvements being made for a very broad range of existing cards, rather than just the most recent generation or two.

    I wonder if there are any more such features waiting to be unlocked? (Apart from the UVD of course - we already know about that).
