Windows 11 Better Than Linux Right Now For Intel Alder Lake Performance


  • avem
    replied
    Originally posted by yump View Post

    It's just the "sy" field in top?
    Might be, but I'm not so sure.


  • yump
    replied
    Originally posted by avem View Post
    Oh, another pain point with Linux: in Windows there's just a single process called "System" whose CPU time is easy to assess, while in Linux you've got a ton of kernel threads, and estimating their load is quite difficult, if not impossible, with the naked eye.
    It's just the "sy" field in top?
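
    For reference, top derives that figure from the aggregate "cpu" line in /proc/stat, which is also where kernel-mode time ends up. A rough sketch of sampling it directly over one second (it ignores the iowait/irq/steal fields for simplicity, so it only approximates top's "sy" value):
    Code:
    # Sample the system-time share over a 1 s window from the first line of /proc/stat.
    # Fields on that line: cpu user nice system idle iowait irq softirq steal ...
    read -r cpu u1 n1 s1 i1 rest1 < /proc/stat
    sleep 1
    read -r cpu u2 n2 s2 i2 rest2 < /proc/stat
    total=$(( (u2 + n2 + s2 + i2) - (u1 + n1 + s1 + i1) ))
    echo "sy ~ $(( 100 * (s2 - s1) / total ))%"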


  • avem
    replied
    Of course, in top you can filter by the root user and sort by time, but summing up all those entries by hand is near impossible.

    Code:
    top - 04:53:18 up  2:10,  0 users,  load average: 0.04, 0.22, 0.40
    Tasks: 316 total,   1 running, 315 sleeping,   0 stopped,   0 zombie
    %Cpu(s):  0.5 us,  0.2 sy,  0.0 ni, 99.2 id,  0.0 wa,  0.0 hi,  0.0 si,  0.0 st
    MiB Mem :  64242.0 total,  58883.4 free,   3145.6 used,   2213.0 buff/cache
    MiB Swap:      0.0 total,      0.0 free,      0.0 used.  59810.8 avail Mem
    
        PID USER      PR  NI    VIRT    RES  %CPU  %MEM     TIME+ nTH  P COMMAND                            
       2359 root      20       24.2g  78.7m   7.9   0.1  14:06.53   2  3 /usr/libexec/Xorg -background none+
       2377 root     -51                      0.7         2:59.42   1  9 [irq/118-nvidia]                    
       1798 root       0 -20                              1:08.02   1 15 [kworker/u33:0-hci0]                
       1802 root       0 -20                              1:07.89   1  7 [kworker/u33:1-hci0]                
       1933 root      20      309.9m   6.3m         0.0   0:19.45   5  9 /usr/sbin/rngd -f -x pkcs11 -x nist
       1757 root      20                                  0:06.14   1  1 [nvidia-modeset/]                  
         11 root      20                                  0:02.34   1  1 [rcu_preempt]                      
      13224 root      20                                  0:01.83   1  3 [kworker/3:1-events]                
      11174 root      20                                  0:01.34   1  0 [kworker/0:1-events]                
      35883 root      20                                  0:01.28   1 12 [kworker/12:0-events]              
      14163 root      20                                  0:01.21   1  8 [kworker/8:0-events]                
      45805 root      20                                  0:01.12   1  9 [kworker/9:2-events]                
      31295 root      20                                  0:01.09   1 14 [kworker/14:1-events]              
       2379 root      20                                  0:01.06   1  1 [nv_queue]                          
      26279 root      20                                  0:00.98   1  5 [kworker/5:1-events]                
      40491 root      20                                  0:00.95   1 10 [kworker/10:0-events]              
       1945 root      20      386.2m  16.0m         0.0   0:00.82   6  6 /usr/libexec/udisks2/udisksd        
      24286 root      20                                  0:00.79   1  6 [kworker/6:0-events]                
          1 root      20      167.5m  15.8m         0.0   0:00.69   1  5 /sbin/init
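
    That said, since all kernel threads are children of kthreadd (PID 2), their cumulative CPU time can be totalled in one go. A rough sketch, assuming a recent procps-ng ps that supports the cputimes output field:
    Code:
    # Sum cumulative CPU time (in seconds) of every kernel thread, i.e. every child of kthreadd (PID 2).
    ps --ppid 2 -o cputimes= | awk '{ total += $1 } END { print total " s of CPU time spent in kernel threads" }'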


  • avem
    replied
    Originally posted by yump View Post

    Doesn't do anything about kernel threads, though. If I had one of these chips, I'd rather:
    Code:
    for i in {16..23}; do echo 0 | sudo tee /sys/devices/system/cpu/cpu${i}/online; done
    At least until Intel gets their butts in gear.
    Normally the kernel itself does very little work, so that shouldn't be a big deal.

    Oh, another pain point with Linux: in Windows there's just a single process called "System" whose CPU time is easy to assess, while in Linux you've got a ton of kernel threads, and estimating their load is quite difficult, if not impossible, with the naked eye.


  • yump
    replied
    Originally posted by atomsymbol

    Code:
    taskset --cpu-list 0-15 command arguments...
    Doesn't do anything about kernel threads, though. If I had one of these chips, I'd rather:
    Code:
    for i in {16..23}; do echo 0 | sudo tee /sys/devices/system/cpu/cpu${i}/online; done
    At least until Intel gets their butts in gear.
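
    For completeness, the same loop with a 1 brings the cores back online afterwards (assuming the same CPU numbering):
    Code:
    for i in {16..23}; do echo 1 | sudo tee /sys/devices/system/cpu/cpu${i}/online; done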


  • yump
    replied
    Originally posted by agd5f View Post

    I'm not sure what you are asking. Linux and Windows take different approaches to CPU clock management, in my understanding. Both OSes still give hints to the hardware. CPPC (the underlying interface to control the CPU clocks) was designed for Windows and seems to work well there. Linux seems to be more hands-on, while Windows is less so. I suspect Windows gives target hints at a pretty coarse level and then lets the hardware go (here's my target performance; with that in mind, get it done as fast as possible), while Linux seems to constantly be setting new targets for performance (more work coming online, let's try a slightly faster target; now there's less work, let's try to slow things down). With the old ACPI P-state interface, there were only 3 states, so even if the OS was constantly setting new targets, you'd just end up snapping to the nearest state. CPPC gives you a continuum of performance states, so every time you set a new hint, you could potentially end up walking through a long continuum of frequencies. I'm certainly not an expert in this area; it just seems that way from what I've seen.

    As stated by others in the thread, the desktop is not necessarily the main use case for Linux at this point. I suspect the current schedulers work well for embedded and server platforms. In those cases you might favor something more deterministic, which it seems like the Linux schedulers strive for.
    The cpufreq conservative governor works like that, but I think its hysteresis band is pretty wide by default. If you move the thresholds close together, you can make it act as an integral-only controller, which actually works decently well in some workloads. The schedutil and ondemand governors, on the other hand, are proportional with some heuristics bolted on: "we have X% load on this core, so set the frequency such that X would be 80%, assuming constant IPC."

    The advantage of schedutil, in theory, is that it can preemptively change the frequency when a thread migrates between cores, without waiting for the hardware to figure it out.

    intel_pstate in HWP mode (only available on Skylake and later, I think) lets the hardware decide.

    In general, my belief is that hardware can do a better job when you're trying to maximize performance constrained by power/temperature/current/voltage drop, but that software is better at minimizing the total task energy. It takes software to know whether the executing thread is a real-time game that wants 5 ms latency, a video decoder that has plenty of buffer and can get away with 200 ms, or a backup job that gets there when it gets there. Uclamp can do that, but it needs per-application tuning, and unfortunately only Android seems to have the resources to do that.
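
    For anyone who wants to poke at this, a minimal sketch using the standard cpufreq sysfs interface (the exact files depend on the active driver, and the conservative tunables only exist while that governor is in use):
    Code:
    # Governors offered by the active cpufreq driver, then switch every policy to schedutil.
    cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors
    echo schedutil | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
    
    # The conservative governor's hysteresis band is set via these percentage thresholds:
    cat /sys/devices/system/cpu/cpufreq/conservative/up_threshold
    cat /sys/devices/system/cpu/cpufreq/conservative/down_threshold
    
    # Per-task uclamp hints can be set with uclampset from recent util-linux, where available.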


  • agd5f
    replied
    Originally posted by Linuxxx View Post

    Honestly, comments like this make me wonder why there apparently isn't a proper communication channel between the different divisions inside AMD.

    You say that the decision-making logic should be left up to the hardware, while schedutil proponents argue that the hardware can't possibly have a clue about the OS run-time queues of all the different threads interacting with each other.
    I'm not sure what you are asking. Linux and Windows take different approaches to CPU clock management, in my understanding. Both OSes still give hints to the hardware. CPPC (the underlying interface to control the CPU clocks) was designed for Windows and seems to work well there. Linux seems to be more hands-on, while Windows is less so. I suspect Windows gives target hints at a pretty coarse level and then lets the hardware go (here's my target performance; with that in mind, get it done as fast as possible), while Linux seems to constantly be setting new targets for performance (more work coming online, let's try a slightly faster target; now there's less work, let's try to slow things down). With the old ACPI P-state interface, there were only 3 states, so even if the OS was constantly setting new targets, you'd just end up snapping to the nearest state. CPPC gives you a continuum of performance states, so every time you set a new hint, you could potentially end up walking through a long continuum of frequencies. I'm certainly not an expert in this area; it just seems that way from what I've seen.

    As stated by others in the thread, the desktop is not necessarily the main use case for Linux at this point. I suspect the current schedulers work well for embedded and server platforms. In those cases you might favor something more deterministic, which it seems like the Linux schedulers strive for.
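
    As a side note, on kernels that expose the ACPI CPPC tables, the performance range the firmware advertises can be inspected per CPU; a sketch (these sysfs files are not present on every system):
    Code:
    # Firmware-advertised CPPC performance levels for CPU 0, if the kernel exposes them.
    grep . /sys/devices/system/cpu/cpu0/acpi_cppc/highest_perf \
           /sys/devices/system/cpu/cpu0/acpi_cppc/nominal_perf \
           /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_nonlinear_perf \
           /sys/devices/system/cpu/cpu0/acpi_cppc/lowest_perf 2>/dev/null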
    Last edited by agd5f; 15 November 2021, 12:17 PM.


  • torsionbar28
    replied
    Originally posted by HEL88 View Post
    Intel has shown that the linux desktop is completely irrelevant.
    Nailed it. Linux is optimized for commercially viable platforms, i.e. servers and embedded devices. We will not see Alder Lake optimization until early 2022 at the soonest, when Sapphire Rapids Xeon launches. Desktop users only get to reap the benefit if/when the desktop chip shares a uarch with the server chip. If you're that committed to Linux on the desktop, use Ryzen instead (it shares a uarch with EPYC) or use the Xeon E series. Personally I run a Xeon E-2276G on Fedora for serious work, and a Ryzen 3600 for Linux gaming. Alder Lake is stuck in an odd spot at the moment.


  • lucrus
    replied
    Originally posted by fractalmess View Post

    Insecure? I can't remember the last time I had a virus on my PC in 28 years. No one knows more about security than Microsoft, given their history and the constant threats against Windows. Windows is built like an armored tank.
    Sure. And I can't remember ever having had AIDS since I was born, so I can deduce that AIDS no longer exists for anyone, right?

    Besides, you can't tell anything about what the MS Windows team actually knows about security, because their code is secret, their coding practices are secret, and the only public information we have is the fact that their product (Windows) is constantly a target for viruses and the like. Maybe there is a solid reason why virus writers mainly target Windows...


  • microcode
    replied
    Heterogeneous scheduling is a problem like modern fighter-jet control software: the thing is inherently more wacky and unstable, but it can perform amazing feats if tuned correctly.
