AMD + Valve Working On New Linux CPU Performance Scaling Design

RedEyed replied

02 August 2021, 12:59 PM
Firefox went to updating the screen with less than 1 fps (disabling GPU acceleration helps though) and the rest of the desktop also feels sluggish.

I have never seen Firefox drop below 20 FPS (maybe it depends on the content), and I have a lot of machines with different Nvidia GPUs.

The memory allocator also has trouble allocating larger chunks of VRAM when there still should be enough memory left

There is a well-known problem called "memory fragmentation".
Your experience is really interesting, but I don't believe that AMD has a better memory allocator than Nvidia.
I think you just used cudnn_benchmark=True, which drastically increases memory usage in order to find the best algorithm. Try disabling it, and most likely you will be able to use a batch size of 8, as on your AMD card.
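To illustrate the fragmentation point, here is a minimal toy sketch in Python. It is not how any real GPU driver or allocator works; the free-list sizes and the helper name `can_allocate` are made up for illustration. It only shows how a request can fail even though the total free memory is larger than the request.

```python
# Toy model of memory fragmentation. All sizes are hypothetical (MiB).
# A contiguous allocation succeeds only if a single free block is big enough.

def can_allocate(free_blocks, request):
    """Return True if any single free block can hold the request."""
    return any(size >= request for size in free_blocks)

# Hypothetical free list: plenty of memory in total, but split into pieces.
free_blocks_mib = [512, 768, 256, 640, 384]

total_free = sum(free_blocks_mib)      # total free memory across all blocks
largest_block = max(free_blocks_mib)   # largest contiguous free block
request_mib = 1024                     # one large contiguous request

print(f"total free: {total_free} MiB, largest block: {largest_block} MiB")
print(f"can allocate {request_mib} MiB contiguously: "
      f"{can_allocate(free_blocks_mib, request_mib)}")
```

In this toy model the request fails because no single block reaches 1024 MiB, even though the blocks add up to more than that.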

As a result, there are some networks that I can only train with a batch size of 3 when I trained them with a batch size of 8 on my previous Polaris GPU (both cards have 8 GB VRAM).

Using an odd batch size drastically decreases performance, so my advice is to never use an odd batch size; use even sizes.
Volta replied

02 August 2021, 12:46 PM
Originally posted by intelfx

About time.

So, once again something only happens in Linux when there is a commercial interest. Not before, not after.

Interesting. So, it seems there's no commercial interest in M$ Windows catching up to Linux.
Likes 1
Nille_kungen replied

02 August 2021, 11:58 AM
I hope this will help AMD APU-based laptops get longer battery life as well.
Likes 7
skeevy420 replied

02 August 2021, 11:34 AM
Originally posted by JPFSanders

As for commercial interest, what does Paragon gain from mainlining their NTFS driver?

Their main source of income doesn't necessarily come from businesses needing both NTFS and Linux support. Mainlining the driver takes some of the general maintenance and upkeep burden off them and lets their developers focus more on the software solutions they can provide for their customers.
Likes 6
JPFSanders replied

02 August 2021, 11:04 AM
Originally posted by intelfx

About time.

So, once again something only happens in Linux when there is a commercial interest. Not before, not after.

Three obvious points:

1) Developer time is not free; unless they are "hobbying", developers need to get paid.

2) And you (and everybody else) will get the benefit of this work for free.

3) The better things get in Linux land, the more commercial interests converge on Linux development and improvements.

I see all of this as a positive.

As for commercial interest, what does Paragon gain from mainlining their NTFS driver?

Last edited by JPFSanders; 02 August 2021, 11:06 AM.
Likes 13
MadCatX replied

02 August 2021, 10:42 AM
Originally posted by aufkrawall

Avg fps and even the 1% low percentile don't capture the bad influence on frame time variance well enough. You will see MangoHUD's frame time graph look bad with schedutil/powersave under certain load conditions, causing stutter, missed vblanks, etc. While schedutil is way better than intel_pstate powersave (why is this crap the default setting...), it is still not good enough.

Fair enough, that sounds plausible and difficult to catch in benchmarks unless you look for that specifically. I guess that fixing this is what this AMD + Valve collab is about.
Likes 1
onlyLinuxLuvUBack replied

02 August 2021, 10:35 AM
Hello phoronix,
This page has overlapping numbers on some graphs (and I can't see the max). Maybe you could add a checkbox to clip numbers to 2 decimal places:

Cfs-vs-cacule Benchmarks - OpenBenchmarking.org

https://openbenchmarking.org/result/2107219-IB-CFSVSCACU49&hni=1&hlc=1&ftr=1&hgv=CacULE+5.4-r2%2C+full+tickless%2C+1000HZ%2C+low-latency+PREEMPT%2CUbuntu+default%2C+5.4.0-81-generic%2C+CFS%2C+idle+tickless%2C+250HZ&ppt=D&hni=1&ftr=1&ppt=D

aufkrawall replied

02 August 2021, 10:03 AM
Originally posted by MadCatX

Are they? The last round of benchmarks showed that schedutil has closed the performance gap and performs just as well in most workloads. Unless Windows' balanced plan performs better than Linux's performance governor, I wouldn't call schedutil bad.

Avg fps and even the 1% low percentile don't capture the bad influence on frame time variance well enough. You will see MangoHUD's frame time graph look bad with schedutil/powersave under certain load conditions, causing stutter, missed vblanks, etc. While schedutil is way better than intel_pstate powersave (why is this crap the default setting...), it is still not good enough.
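A small Python sketch of why average fps can hide stutter. The two frame time traces are invented numbers, not measurements from MangoHUD or any benchmark, and `summarize` is a hypothetical helper. Both traces have the same average fps, but only one of them has frame time spikes.

```python
import statistics

def summarize(frame_times_ms):
    """Return (average fps, frame time std deviation in ms, worst frame in ms)."""
    avg_fps = 1000.0 / statistics.mean(frame_times_ms)
    stdev_ms = statistics.pstdev(frame_times_ms)
    worst_ms = max(frame_times_ms)
    return avg_fps, stdev_ms, worst_ms

# Invented traces: both have a mean frame time of 16 ms.
smooth = [16.0] * 10                    # every frame takes 16 ms
stutter = [10.0] * 8 + [40.0] * 2       # mostly fast frames plus two 40 ms spikes

for name, trace in (("smooth", smooth), ("stutter", stutter)):
    avg_fps, stdev_ms, worst_ms = summarize(trace)
    print(f"{name}: avg {avg_fps:.1f} fps, "
          f"stdev {stdev_ms:.1f} ms, worst frame {worst_ms:.1f} ms")
```

Both traces report the same average fps, while the standard deviation and the worst frame time differ sharply. That difference is what a frame time graph makes visible.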
Likes 2
V1tol replied

02 August 2021, 09:55 AM
I am considering buying a Steam Deck (if/when it arrives in my hole, of course) just to sponsor AMD and Valve in their Linux efforts. I hope that when many AMD+Radeon machines are running Linux with top performance, Intel will maybe start improving their scheduler to benefit my notebook (I funded Intel), and maybe Novideo will open-source their driver (I funded them too). Just stupid dreams here.
Likes 10
david-nk replied

02 August 2021, 09:28 AM
Originally posted by perpetually high

When I had temporarily switched to the GTX 1080, that was no longer true. The desktop/OS lagged and lagged, was super sluggish. I'm pretty sure it was the GPU scheduler and nVidia had an inferior one than AMD on Linux, and absolutely no one can tell me different because the machine was exactly the same, just GPU swapped out, and all the necessary "were the right drivers/settings configured?"

This was my experience as well. I had to switch to an RTX GPU because ROCm just wasn't there yet, but the Nvidia scheduling and memory management are really bad. Firefox went to updating the screen with less than 1 fps (disabling GPU acceleration helps though) and the rest of the desktop also feels sluggish. The memory allocator also has trouble allocating larger chunks of VRAM when there still should be enough memory left. As a result, there are some networks that I can only train with a batch size of 3 when I trained them with a batch size of 8 on my previous Polaris GPU (both cards have 8 GB VRAM). I also noticed that when an Nvidia GPU is under load, a fullscreen XPutImage/XShmPutImage takes about 1 second to complete instead of the normal 1 ms. That seems to be part of the problem.

But CPU frequency scheduling remains a problem. Even old Linux veterans are surprised by how much faster a project sometimes compiles when switching to the performance governor. All workloads with frequent sleeps or I/O pauses (compiling, games, video playback) are a nightmare for the current schedulers.
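For anyone who wants to check which governor is active before comparing compile times, here is a rough Python sketch. It assumes the usual Linux cpufreq sysfs layout (cpuN/cpufreq/scaling_governor under /sys/devices/system/cpu); the function name `read_governors` is made up, and on systems without cpufreq it simply reports nothing.

```python
from pathlib import Path

def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Map each CPU directory name (e.g. 'cpu0') to its current governor.

    Assumes the standard cpufreq sysfs layout; returns an empty dict
    if no scaling_governor files are found under cpu_root.
    """
    governors = {}
    pattern = "cpu[0-9]*/cpufreq/scaling_governor"
    for path in sorted(Path(cpu_root).glob(pattern)):
        cpu_name = path.parent.parent.name
        try:
            governors[cpu_name] = path.read_text().strip()
        except OSError:
            # Skip CPUs whose governor file cannot be read.
            continue
    return governors

current = read_governors()
if not current:
    print("no cpufreq governors found")
for cpu_name, governor in current.items():
    print(f"{cpu_name}: {governor}")
```

This only reads the setting. Changing the governor needs root and is usually done with a distribution tool or by writing to the same sysfs file.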
Likes 6