AMD Linux Graphics: The Latest Open-Source RadeonSI Driver Moves On To Smacking Catalyst


  • spstarr
    replied
    Originally posted by Michael_S View Post

    What distribution are you using? I have an R9 270, I don't mind riding the bleeding edge but only if installing it and staying up to date isn't too much work.

    Fedora, but I build from the RPM .spec files with a few adjustments, and I compile LLVM with rpmbuild using the flags --without crt --without lldb --without ocaml.



  • haagch
    replied
    Didn't think of that. Makes sense. Thanks for the explanation.



  • bridgman
    replied
    It is 100% utilization, just not for an entire sample period.

    Typical driver operation is "100% CPU for a while, then 100% GPU for a while, then 100% CPU for a while...". You overlap as much as possible to try and avoid CPU processing getting in the way but that's a black art in its own right.
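To see why neither side reads 100% over a whole sample period, here is a minimal sketch, assuming each frame is one CPU phase and one GPU phase with an optional overlap. The function name and the millisecond figures are made up for illustration and are not measurements from this thread.

```python
def average_utilization(cpu_ms, gpu_ms, overlap_ms=0.0):
    """Average CPU and GPU utilization (in percent) over many identical frames.

    Assumes each frame has one CPU phase and one GPU phase, and that
    overlap_ms of the two phases run concurrently (hypothetical model).
    """
    if not 0.0 <= overlap_ms <= min(cpu_ms, gpu_ms):
        raise ValueError("overlap must be between 0 and the shorter phase")
    frame_ms = cpu_ms + gpu_ms - overlap_ms  # wall-clock time per frame
    return 100.0 * cpu_ms / frame_ms, 100.0 * gpu_ms / frame_ms


# No overlap: each unit is flat out while active, yet averages under 100%.
print(average_utilization(6.0, 4.0))
# Full overlap of the shorter phase: the CPU phase now fills the whole frame.
print(average_utilization(6.0, 4.0, overlap_ms=4.0))
```

In this toy model, more overlap shortens the frame and pushes the longer phase toward a 100% reading, which is the overlap the post calls a black art.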



  • haagch
    replied
    Originally posted by dungeon View Post
    That is not the way to tell single-threaded from multithreaded rendering... One core, or any core, does not have to reach 100% for rendering to be capped.
    What are you saying? I am CPU limited, even when no CPU core is used 100%?
    Sure, "synchronization/blocking limited" or "memory waiting limited" perhaps, but when saying CPU limited I think of raw processing speed being the limit and that is kinda by definition 100% utilization...



  • haplo602
    replied
    Originally posted by dungeon View Post

    Then there might be also some r600 regression somewhere, in that case bisecting can help. I even tried to go down to 10.2.8 mesa and game still works there and still does not crash on Argus... well on a radeonsi.

    Well make sure that you have s3tc and if you use high textures then at least 1GB VRAM but also 1GB GTT.
    Found the issue. The SB backend crashes on the shaders. Once I set R600_DEBUG=nosb,llvm things work. I must have made a mistake when switching backends in my earlier testing. I'll also try the GL 4.0 path with the LLVM backend and see how things work.
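For anyone who wants to script the same test, a minimal sketch, assuming the game is started from a Python wrapper. The helper name and the launch command are placeholders; only the `R600_DEBUG=nosb,llvm` value comes from the post above.

```python
import os


def env_with_r600_debug(flags, base_env=None):
    """Return a copy of the environment with R600_DEBUG set to the given flags."""
    env = dict(os.environ if base_env is None else base_env)
    env["R600_DEBUG"] = ",".join(flags)
    return env


# Flags from the post above: skip the SB backend and use the LLVM backend.
env = env_with_r600_debug(["nosb", "llvm"])
print(env["R600_DEBUG"])

# To launch a program with it (the command is a placeholder):
# import subprocess
# subprocess.run(["./game"], env=env)
```

Working on a copy keeps the calling shell's environment untouched, so switching backends between runs only means changing the flag list.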



  • dungeon
    replied
    Yeah, better to post some pictures of CPU usage in Borderlands 2 with Catalyst; guess which one is awfully capped and which one uses threaded GL.



    Last edited by dungeon; 03 September 2015, 10:16 PM.



  • profoundWHALE
    replied
    Originally posted by dungeon View Post

    That is not the way to tell single-threaded from multithreaded rendering... One core, or any core, does not have to reach 100% for rendering to be capped.

    To check that, it is best not to move at all: just stand still and look at the ground or at a wall from very close, etc. With a well-threaded renderer, *all* cores should show nearly the same load, with only rare, small spikes. With a single-threaded or poorly threaded renderer, all cores can still be used, of course, but some of them will often show high or low spikes.
    I remember those poorly coded (or very physics-heavy) rooms where I had to look up at the ceiling in order to get through them. Like when you forgot that you dropped 2000 bottles of potions somewhere in Skyrim.



  • dungeon
    replied
    Originally posted by haagch View Post
    Htop shows that csgo uses about 136-175% cpu with either no core being at 100% or the threads are shuffled to different cores faster than the htop sample period.
    That is not the way to tell single-threaded from multithreaded rendering... One core, or any core, does not have to reach 100% for rendering to be capped.

    To check that, it is best not to move at all: just stand still and look at the ground or at a wall from very close, etc. With a well-threaded renderer, *all* cores should show nearly the same load, with only rare, small spikes. With a single-threaded or poorly threaded renderer, all cores can still be used, of course, but some of them will often show high or low spikes.
    Last edited by dungeon; 03 September 2015, 03:11 PM.



  • haagch
    replied
    Originally posted by marek View Post

    If you enable "cpu,GPU-load,num-bytes-moved,buffer-wait-time,num-compilations" in the gallium HUD for CS:GO:
    - Is the CPU load lower than 100 divided by # of CPU cores? Is the GPU load lower than 90%? Which one is higher? (with respect to # of CPU cores)
    - Do any of the last three graphs show any activity when the performance is bad?
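One way to read the first question as code: a rough sketch, assuming the HUD "cpu" graph is an average over all cores, so that a single saturated thread shows up as about 100 divided by the core count. The function name, the return labels and the way the two readings are compared are my own interpretation, not something stated in the quoted post.

```python
def likely_bottleneck(cpu_load_pct, gpu_load_pct, num_cores):
    """Guess the limiting side from gallium HUD style readings.

    cpu_load_pct is assumed to be the average over all cores, so one fully
    busy thread shows up as roughly 100 / num_cores.
    """
    one_thread_pct = 100.0 / num_cores
    cpu_saturated = cpu_load_pct >= one_thread_pct
    gpu_saturated = gpu_load_pct >= 90.0  # threshold from the quoted question
    if not cpu_saturated and not gpu_saturated:
        return "neither"  # then check buffer wait time, compilations, ...
    # Scale the CPU reading to "percent of one core" before comparing.
    cpu_scaled = min(100.0, cpu_load_pct * num_cores)
    return "cpu" if cpu_scaled >= gpu_load_pct else "gpu"


# Hypothetical readings on a 4-core machine.
print(likely_bottleneck(10.0, 50.0, 4))
print(likely_bottleneck(30.0, 60.0, 4))
print(likely_bottleneck(10.0, 95.0, 4))
```

A "neither" result is the case the second question is aimed at: if both loads are low, the last three graphs are where to look next.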
    Here is a video with the current state: https://www.youtube.com/watch?v=9dERLwSVJS4. Ignore the weird pauses and jumps, that's just the recording.

    Htop shows that csgo uses about 136-175% cpu with either no core being at 100% or the threads are shuffled to different cores faster than the htop sample period.
    num bytes moved and num compilations are mostly zero, so there's something that works well.
    buffer wait time is the only one that is relatively high...

    I have even experimented with making the buffer a lot smaller
    Code:
    diff --git a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
    index 6ceb8e9..6675d41 100644
    --- a/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
    +++ b/src/gallium/winsys/radeon/drm/radeon_drm_cs.h
    @@ -30,7 +30,7 @@
     #include "radeon_drm_bo.h"
    
     struct radeon_cs_context {
    -    uint32_t                    buf[16 * 1024];
    +    uint32_t                    buf[1024];
    
         int                         fd;
         struct drm_radeon_cs        cs;
    just to see what would happen - not much did happen. Performance was about the same. Maybe the buffer wait time was a bit lower and it was a bit smoother, but if it was, it was in the range where I could have easily imagined it, so not really noticeable.
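For scale, the arithmetic behind that one-line change, as a small sketch. It only assumes that `uint32_t` is 4 bytes; the helper name is made up.

```python
DWORD_BYTES = 4  # sizeof(uint32_t)


def buf_size_bytes(num_dwords):
    """Size in bytes of a uint32_t array with num_dwords entries."""
    return num_dwords * DWORD_BYTES


original = buf_size_bytes(16 * 1024)  # buf[16 * 1024] before the patch
patched = buf_size_bytes(1024)        # buf[1024] after the patch
print(original // 1024, "KiB ->", patched // 1024, "KiB")
```

So the experiment shrinks the `buf` array in `struct radeon_cs_context` from 64 KiB to 4 KiB, a factor of 16.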



  • dungeon
    replied
    Originally posted by haagch View Post
    No csgo benchmarks. :/

    Like the R7 370 my HD 7970M struggles to get ~60 fps in csgo. The unigine valley results looked so good here, I tried it too (on the "Extreme HD" preset) with latest mesa git.
    Code:
    FPS: 20.3
    Score: 848
    Min FPS: 12.1
    Max FPS: 37.9
    Wow. Is the R7 370 3x faster than the HD 7970M? Sure, old laptop vs new discrete, but they are both still Pitcairn...
    OpenGL implementations often have big CPU overhead and end up CPU-capped in the main thread - watch your CPU *always*!!! So when you compare benchmark results, keep in mind the single-thread IPC of the CPUs involved. Your i7-3632qm is 25% slower than the i5-6600K on that, which means the i5 has more CPU power to feed the GPU. In some cases, when CPU power is not enough, a bigger GPU also means bigger CPU overhead, so a user with a bigger GPU can see slower perf. So yes, your CPU's IPC is roughly what a Kaveri APU has, and perf should be comparable to that.

    CS:GO and Borderlands 2, both native OpenGL games, are very good examples of driver overhead. Performance in both depends more or less on a threaded GL driver implementation, which the mesa drivers do not have.

    So either someone implements threaded GL in mesa to fix this, at least partially (most hiccups and performance issues you see in those games should go away with it), or you can use Nine or different drivers, or wait for the complete solution called Vulkan.

    Last edited by dungeon; 03 September 2015, 09:10 AM.

