Intel Core Ultra 7 155H Meteor Lake vs. AMD Ryzen 7 7840U On Linux In 300+ CPU Benchmarks


  • #51
    Originally posted by hansdegoede View Post

    Michael, there actually is a solution for this problem which is intended exactly for getting memory details as non root. During boot a udev script runs which extracts this info from DMI and stores it in the udev database. If I run "udevadm info -e > udevdb" and then look for "DDR" in the udevdb file I find the following there:

    Code:
    P: /devices/virtual/dmi/id
    ...
    E: MEMORY_DEVICE_1_DATA_WIDTH=64
    E: MEMORY_DEVICE_1_SIZE=8589934592
    E: MEMORY_DEVICE_1_FORM_FACTOR=DIMM
    E: MEMORY_DEVICE_1_LOCATOR=DIMM 1
    E: MEMORY_DEVICE_1_BANK_LOCATOR=P0 CHANNEL A
    E: MEMORY_DEVICE_1_TYPE=DDR4
    E: MEMORY_DEVICE_1_TYPE_DETAIL=Synchronous Unbuffered (Unregistered)
    E: MEMORY_DEVICE_1_SPEED_MTS=3600
    E: MEMORY_DEVICE_1_MANUFACTURER=Unknown
    E: MEMORY_DEVICE_1_SERIAL_NUMBER=00000000
    E: MEMORY_DEVICE_1_ASSET_TAG=Not Specified
    E: MEMORY_DEVICE_1_PART_NUMBER=CMK16GX4M2Z3600C18
    E: MEMORY_DEVICE_1_RANK=1
    E: MEMORY_DEVICE_1_CONFIGURED_SPEED_MTS=3600
    E: MEMORY_DEVICE_1_MINIMUM_VOLTAGE=1
    E: MEMORY_DEVICE_1_MAXIMUM_VOLTAGE=1
    E: MEMORY_DEVICE_1_CONFIGURED_VOLTAGE=1
    I hope this helps to add memory info when running the benchmarks as non root.
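    As an aside (not part of the original post): a minimal Python sketch of consuming that udev data, assuming the property format shown above. The helper name and trimmed sample are illustrative only; feed it the real output of `udevadm info -e`.

```python
# Parse "E: MEMORY_DEVICE_<n>_<KEY>=<value>" lines from the output of
# `udevadm info -e`, grouping the properties by memory-device index.
def parse_memory_devices(udev_export: str) -> list[dict]:
    prefix = "E: MEMORY_DEVICE_"
    devices: dict[int, dict] = {}
    for line in udev_export.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue
        key, _, value = line[len(prefix):].partition("=")
        index, _, field = key.partition("_")  # "1_SPEED_MTS" -> ("1", "SPEED_MTS")
        devices.setdefault(int(index), {})[field] = value
    return [devices[i] for i in sorted(devices)]

# Sample trimmed from the udev database dump quoted above.
sample = """\
P: /devices/virtual/dmi/id
E: MEMORY_DEVICE_1_SIZE=8589934592
E: MEMORY_DEVICE_1_TYPE=DDR4
E: MEMORY_DEVICE_1_SPEED_MTS=3600
"""
for dev in parse_memory_devices(sample):
    print(dev["TYPE"], int(dev["SIZE"]) // 2**30, "GiB @", dev["SPEED_MTS"], "MT/s")
# -> DDR4 8 GiB @ 3600 MT/s
```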
    That's wonderful to know, thanks! Finally! Will work on implementing that support.
    Michael Larabel
    https://www.michaellarabel.com/



    • #52
      Originally posted by milkylainen View Post
      100% this. People constantly underestimate dimensional complexity.
      Instead of fixing their "p"-cores, they added the e-cores, which look nice on paper but are an absolute nightmare to optimize for.
      Are you talking about OS scheduling or the compiler's code generation? Because, at the OS level, we already had similar scheduling complexities introduced by SMT. And those affect the AMD CPU as well.

      In terms of compiler code generation, have you seen any benchmarks quantifying the impact of targeting Golden Cove vs. Gracemont, on either core? I'd be surprised if the performance varied by more than a couple %, either way.

      Originally posted by milkylainen View Post
      Being instruction-set different really does not help either.
      Not 100% sure what you're referring to, but all of Meteor Lake's cores support exactly the same instructions.



      • #53
        Feedback by 3dcenter.org (translated from German):

        [...]
        There is also a problem with this test, which can also be found in a similar form in many other tests of mobile devices:
        The hardware testers often do not pay attention to the best possible comparability in terms of power limits and memory clocking, nor is there at least sufficient documentation of this data. In this specific case, an Intel processor with 45/55W power limits including LPDDR5/6400 and an AMD processor with 35/51W power limits including DDR5/5600 were pitted against each other.
        Unfortunately, these specifications could not be found in the test report itself, but had to be laboriously gathered from other websites. Instead, the Phoronix test report only mentioned the official TDP specifications from AMD & Intel, which are even the same at 28 watts TDP - and thus sent the reader on the wrong track in the question of whether these processors really run with the same power limits in the specific notebooks.
        [Note by me: This is of course offset by the fact that one can look up all the power consumption metrics in the database. A+-class nerdporn galore imo. I personally still need to figure out how to add the power consumption to the "Comparison" graph.]

        The result of the CPU tests looks even more impressive for AMD when you are informed that the AMD processor also achieved its clearly better performance result with a lower power limit. The small difference in the memory clock rates would also have been worth mentioning in the iGPU tests - and is even relevant to the results in view of the rather small difference in performance in these iGPU benchmarks.
        At this point, there is certainly still a lot of room for improvement in terms of the quality of laptop reviews. The ideal - truly identical specifications for all test candidates - is undoubtedly difficult to achieve in the mobile field. But at least the differences should be documented, and this should also be mentioned when commenting on the results.
        The only source where you can usually get complete specification information is Notebookcheck - which, however, in this specific case also failed to provide the memory clock rate of the Acer notebook.
        Hope this does not sound too negative. Happy Holidays, and thanks for all the Linux tests!
        Last edited by Sweepi; 21 December 2023, 11:02 AM.



        • #54
          coder SMT is less problematic for the CPU scheduler, I think, because in highly parallel batch jobs (like an allmodconfig kernel compile), all the threads run at the same speed until almost the very end. With a hybrid architecture, there's the possibility that the critical-chain task gets scheduled on a slow E-core at any time during the entire job. That's why you see high run-to-run variance.

          I think the only way to fix it is for the application to understand which of its thread(s) are critical-chain and communicate that to the kernel with a nice level.



          • #55
            Originally posted by yump View Post
            With a hybrid architecture, there's the possibility that the critical-chain task gets scheduled on a slow E-core at any time during the entire job. That's why you see high run-to-run variance.
            Well, the fair way to schedule jobs on a hybrid (or even just SMT) architecture should be to ensure jobs get equal execution time, weighted by the performance of the core they're running on (including whether they're sharing a P-core with a SMT sibling). In that case, there shouldn't be too much run-to-run variation from scheduling, alone.
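            To illustrate that weighting in miniature (a hypothetical sketch; the capacity numbers are invented, not measured): charge each task virtual runtime scaled by the relative throughput of the core it ran on, so a task stuck on a slow core accrues less and is owed more wall time later.

```python
# Hypothetical relative throughputs; a real scheduler would measure these.
CAPACITY = {"P": 1.0, "P_smt": 0.6, "E": 0.55}

def charge(vruntime: dict, task: str, wall_ns: int, core: str) -> None:
    # Scale the charged runtime by core capacity: a slower core charges
    # less vruntime, so the task gets compensated with more wall time.
    vruntime[task] = vruntime.get(task, 0.0) + wall_ns * CAPACITY[core]

vr: dict = {}
charge(vr, "a", 1_000_000, "P")  # task a ran on a P-core
charge(vr, "b", 1_000_000, "E")  # task b ran on an E-core for the same wall time
# vr["b"] < vr["a"], so a fair scheduler would pick "b" to run next.
```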

            Originally posted by yump View Post
            I think the only way to fix it is for the application to understand which of its thread(s) are critical-chain and communicate that to the kernel with a nice level.
            I've wondered if Ninja keeps stats on how long different jobs take to run. If it did, then it could compute the critical path and prioritize those jobs. Not necessarily via nice, but at least by starting them as soon as possible.
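            The idea in miniature (a hypothetical sketch, not anything Ninja actually does; job names and durations are invented): with per-job duration stats and the dependency graph, each job's priority is its own duration plus the longest chain of work that depends on it.

```python
def critical_path_priority(durations: dict, deps: dict) -> dict:
    # deps[j] lists the jobs j depends on; invert to find dependents.
    dependents: dict = {j: [] for j in durations}
    for job, needed in deps.items():
        for d in needed:
            dependents[d].append(job)

    memo: dict = {}
    def downstream(job):  # job's duration plus its longest dependent chain
        if job not in memo:
            memo[job] = durations[job] + max(
                (downstream(k) for k in dependents[job]), default=0)
        return memo[job]

    return {j: downstream(j) for j in durations}

# The link step waits on two compiles; the slow compile dominates the
# critical path, so it should be started first.
durations = {"slow.o": 30, "fast.o": 2, "link": 5}
deps = {"link": ["slow.o", "fast.o"]}
print(critical_path_priority(durations, deps))
# -> {'slow.o': 35, 'fast.o': 7, 'link': 5}
```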



            • #56
              Originally posted by coder View Post
              Well, the fair way to schedule jobs on a hybrid (or even just SMT) architecture should be to ensure jobs get equal execution time, weighted by the performance of the core they're running on (including whether they're sharing a P-core with a SMT sibling). In that case, there shouldn't be too much run-to-run variation from scheduling, alone.
              Huh. You can only ever guess at the performance of the core, but I wonder... what if the scheduler charged tasks based on the instructions retired PMC, instead of trying to weight runtime by capacity? Directly equalize progress of the task.
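              A toy model of that policy (the IPC numbers are invented): always run the task with the fewest instructions retired so far, so progress rather than wall time is equalized across fast and slow cores.

```python
IPC = {"P": 6.0, "E": 3.0}  # hypothetical instructions retired per cycle

def run_quantum(progress: dict, core: str, cycles: int = 1_000_000) -> str:
    # Pick the least-progressed task and charge it the instructions it
    # retires in one quantum on this core.
    task = min(progress, key=progress.get)
    progress[task] += cycles * IPC[core]
    return task

progress = {"a": 0.0, "b": 0.0}
for core in ["P", "E", "P", "E"]:
    run_quantum(progress, core)
# Both tasks end up with identical retired-instruction counts, even
# though each spent a different share of its time on P- vs. E-cores.
```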



              • #57
                Originally posted by yump View Post
                Huh. You can only ever guess at the performance of the core,
                The whole point of Intel's "ThreadDirector" is to collect metrics about each thread, to better inform scheduler decisions. I'd be surprised if you couldn't use the information it gathers to estimate how much useful work the thread got done.

                Failing that, the OS scheduler can just peek at some of the core's performance counters and use those to arrive at a similar conclusion.

                Originally posted by yump View Post
                but I wonder... what if the scheduler charged tasks based on the instructions retired PMC, instead of trying to weight runtime by capacity?
                Too simple, IMO, since it would advantage tasks that were more memory-bound. I think it's the right general idea, though.



                • #58
                  Originally posted by coder View Post
                  The whole point of Intel's "ThreadDirector" is to collect metrics about each thread, to better inform scheduler decisions. I'd be surprised if you couldn't use the information it gathers to estimate how much useful work the thread got done.
                  Yes, but with an extremely Intel flavor to it. Lots of features, lots of Intel® Technology. Theoretically optimizes unusual cases (100% occupancy by threads with dissimilar IPC characteristics) without addressing the core problem (some threads have more work to do than others, and the scheduler doesn't know which ones). Makes fine-grained distinctions that in practice collapse to a binary: thread needs to be fast (y/n). (See also ARM CPUs having 2-3 idle states when Intel has 7+.)

                  And then they go and implement it on Windows and it pessimizes a common case, running batch jobs in the background with the default "balanced" power plan, because there was previously no significant effect from Windows' behavior of de-prioritizing everything but the foreground GUI app.

                  Too simple, IMO, since it would advantage tasks that were more memory-bound. I think it's the right general idea, though.
                  On the one hand, yes. But on the other hand, all of the complexity of estimating CPU capacity at different frequencies and on different core types evaporates in a puff of logic. Plus, it offers some isolation from noisy-neighbor effects in the cache and memory bus -- your % of the CPU gets the same work done no matter what the other threads are up to. The rustc project tracks instruction count in their performance CI tests because it's less noisy than time.



                  • #59
                    Hello everybody,

                    I would like to apologize for my posts under nickname "sophisticles" and "hel88".

                    The thing is, I am a very sick person: schizophrenia with manic depression.
                    When I'm on my medication, like now, I feel ashamed of the things that I do when not on medication.

                    For example, when I'm not using my therapy properly I get this crazy tendency to troll on Linux forums. For that devious purpose I use the nicknames "sophisticles" and "hel88". Under those nicknames I write crazy, insane things. When I am on regular therapy, like now, I cannot believe the crap that I wrote under those two nicknames.

                    Overall, I would like all of you to know that I don't really mean what I write under those two nicknames, and also: I love Linux, open source, and the GPL. And yes, Microsoft sucks.



                    • #60
                      Originally posted by yump View Post

                      <...> evaporates in a puff of logic <...>
                      I see you are a man of culture. *tips hat*

