The Importance Of Thermald On Linux For Modern Intel Tiger Lake Laptops

  • #21
    Originally posted by perpetually high View Post

    Curious if you could run that ctx_clock test on your Haswell laptop and see what you get. Just want to see the range of context-switch times on Haswell chips.
    Sure, here you go:
    Code:
    ctx_clock:
    pts/ctx-clock-1.0.0
    Test 1 of 1
    Estimated Trial Run Count: 3
    Estimated Time To Completion: 1 Minute [08:21 CEST]
    Started Run 1 @ 08:21:47
    Started Run 2 @ 08:21:53
    Started Run 3 @ 08:21:58
    Started Run 4 @ 08:22:02 *
    Started Run 5 @ 08:22:07 *
    Started Run 6 @ 08:22:12 *
    Started Run 7 @ 08:22:17 *
    Started Run 8 @ 08:22:22 *
    Started Run 9 @ 08:22:27 *
    Started Run 10 @ 08:22:32 *
    Started Run 11 @ 08:22:37 *
    Started Run 12 @ 08:22:42 *
    Started Run 13 @ 08:22:47 *
    Started Run 14 @ 08:22:52 *
    Started Run 15 @ 08:22:57 *
    
    Context Switch Time:
    248
    150
    150
    150
    150
    150
    150
    150
    150
    150
    150
    150
    150
    150
    150
    
    Average: 157 Clocks
    Deviation: 16.16%
    Samples: 15
    
    Comparison to 3,672 OpenBenchmarking.org samples since 18 February 2019; median result: 259. Box plot of samples:
    [ |*--------------------------------------------------*-------#########*#*##!*#* *]
    This Result (70th Percentile): 157 ^
    Intel Celeron J4115: 3201 ^ Intel Core i5-4690K: 1138 ^ Intel Core i5-1035G1: 68 ^
    AMD Ryzen 5 2600: 256 ^
    2 x Intel Xeon E5-2696 v3: 415 ^
    2 x Intel Xeon E5-2620 v4: 476 ^
    PROCESSOR: Intel Core i7-4710MQ @ 3.50GHz
    Core Count: 4
    Thread Count: 8
    Extensions: SSE 4.2 + AVX2 + AVX + RDRAND + FSGSBASE
    Cache Size: 6 MB
    Microcode: 0x28
    Core Family: Haswell
    Scaling Driver: intel_cpufreq performance

    Found out that context switching depends on several things, and the scheduler is a big part of it. These results were obtained by running the benchmark with the CaCule-RDB scheduler. Running the same benchmark with the default CFS scheduler, the average result is about 100 clocks higher (around 250 clocks).
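For anyone who wants to compare on their own hardware, the result dump above is from the Phoronix Test Suite's ctx-clock test (the `pts/ctx-clock-1.0.0` identifier is visible in the output). A minimal way to reproduce it, assuming the Phoronix Test Suite is installed from your distro's repos or upstream:

```shell
# Fetch and run the context-switch clock test interactively:
phoronix-test-suite benchmark pts/ctx-clock

# Or run it in batch mode to skip the interactive prompts:
phoronix-test-suite batch-benchmark pts/ctx-clock
```

As the thread notes, the numbers move with the scheduler and the scaling driver, so record both (e.g. the "Scaling Driver" line in the result footer) before comparing runs.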



    • #22
      Originally posted by FPScholten View Post
      Sure, here you go

      Found out that context switching depends on several things, and the scheduler is a big part of it. These results were obtained by running the benchmark with the CaCule-RDB scheduler. Running the same benchmark with the default CFS scheduler, the average result is about 100 clocks higher (around 250 clocks).
      Really appreciate the follow up! Very interesting

      I've tried the CaCule scheduler before, but I always end up going back to CFS; it's just a really solid overall scheduler that's dependable and stable. I might need to revisit CaCule (I was using it back when it was called cachy). My system is in a good place right now where I can notice even the most subtle difference, so now would be the time. Thanks again for sharing.



      • #23
        Cachy was a nice concept, but it was very bad at handling highly demanding loads.
        CaCule is a continuation of the principle, and RDB adds to that. If you want low latency and responsiveness, use CaCule with RDB; if you want high throughput, use CFS.
        For example, kernel compilation is faster with CaCule-RDB, but transcoding video/audio is faster with CFS. Most games benefit from CaCule-RDB (higher framerates).
        But depending on your usual workload, your mileage may vary, so testing whether it is beneficial for you is recommended.
        Make sure to read the installation guide and set the correct parameters.
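Before comparing numbers, it is worth confirming you actually booted the patched kernel. A quick sketch, with the caveat that the `CONFIG_` symbol names below are taken from the CacULE patch series and may differ between versions, so check them against the installation guide:

```shell
# Confirm which kernel you booted into:
uname -r

# Check whether the CacULE/RDB options were compiled in; the symbol
# names (e.g. CONFIG_CACULE_SCHED, CONFIG_CACULE_RDB) are assumptions
# from the patch series and may vary by release:
grep -E 'CACULE|RDB' "/boot/config-$(uname -r)"
```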



        • #24
          Originally posted by FPScholten View Post
          Cachy was a nice concept, but it was very bad at handling highly demanding loads.
          CaCule is a continuation of the principle, and RDB adds to that. If you want low latency and responsiveness, use CaCule with RDB; if you want high throughput, use CFS.
          For example, kernel compilation is faster with CaCule-RDB, but transcoding video/audio is faster with CFS. Most games benefit from CaCule-RDB (higher framerates).
          But depending on your usual workload, your mileage may vary, so testing whether it is beneficial for you is recommended.
          Make sure to read the installation guide and set the correct parameters.
          You really sent me down a very nice rabbit hole, thanks again. I didn't realize how far cachy had progressed since I first started using it. It sounds freakin AWESOME now, and I love everything about what he, xanmod, and sirlujcan are doing.

          Highly recommend this read by CacULE's creator Hamad Al Mirri for those interested: https://github.com/hamadmarri/cacule...discussions/37

          Love this dude's passion. CFS is definitely the best scheduler, and the fact that he's added a supercharger on top of it is really exciting. I'm going to build a full tickless 1000Hz kernel with CacULE applied, on the 5.4 LTS kernel, and see how it all works. Eventually I'll use the "isolcpus=3 nohz_full=3 rcu_nocbs=3" parameters when I want full tickless (this works super well for games like Quake 2: you run the game on a single CPU core, separated from all the other kernel/OS noise, and wow, ZERO latency or stutter).
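The core-isolation trick above can be sketched as a boot-parameter change plus CPU pinning. The GRUB file path, the choice of core 3, and the game binary name are all illustrative:

```shell
# /etc/default/grub -- isolate core 3 from the general scheduler,
# stop its periodic tick, and offload its RCU callbacks:
GRUB_CMDLINE_LINUX_DEFAULT="quiet isolcpus=3 nohz_full=3 rcu_nocbs=3"

# Regenerate the GRUB config and reboot for it to take effect:
#   sudo update-grub && sudo reboot

# After reboot, pin the latency-sensitive process onto the isolated
# core (binary name is just an example):
taskset -c 3 ./quake2
```

Note that `nohz_full` only helps on kernels built with `CONFIG_NO_HZ_FULL`, which is the "full tickless" option mentioned above.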



          • #25
            Originally posted by Linuxxx View Post

            AFAIR it was advice from a low-latency tuning guide for RHEL, which said that disabling intel_pstate also disables the intel_idle driver, the real source and culprit of the higher latency times.
            Yup, you are correct. It might have been you that put me on to that PDF, or I came across it myself after you mentioned acpi_cpufreq performance.

            Either way, highly recommend everyone reads this: https://access.redhat.com/sites/defa...rhel7-v2.1.pdf

            I went through it a while ago, but revisited it in detail a few days ago and applied everything that made sense, incrementally, benchmarking along the way, and everything is super SMOOTH, baby. I'm going to write a guide hopefully soon, and it will be epic for building the most incredible Linux kernel known to man.
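For anyone following along, the driver switch being discussed can be checked and applied roughly like this. The claim that disabling intel_pstate also takes intel_idle with it is from the RHEL guide quoted above; the sysfs paths are standard, and `intel_idle.max_cstate=0` is the explicit knob if you want to disable intel_idle on its own:

```shell
# See which frequency-scaling driver is active right now
# (intel_pstate vs acpi-cpufreq):
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver

# To fall back to acpi-cpufreq, add this to the kernel command line
# and reboot:
#   intel_pstate=disable
# To force intel_idle off explicitly instead of relying on the above:
#   intel_idle.max_cstate=0

# Verify which idle driver is in use after rebooting:
cat /sys/devices/system/cpu/cpuidle/current_driver
```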
