I am on thin ice here, so let me know if you think I am off track.

I have read several Phoronix reports on Linux operating system
benchmarks, and it seems to me you are measuring the wrong stuff.
I did not see anything that measured how well the operating system
manages processes. What I saw were benchmarks measuring compilers
(speed of compiled code) and single-thread disk throughput.

The job of an operating system:

1. manage competing processes
  • low latency for interactive processes needing user input
  • low wait time to resume a process after its IO completes
  • high efficiency: keep the CPU(s) busy
  • good scheduling fairness when CPU demand is over 100%

2. manage virtual memory and paging
  • fair and balanced allocation of real memory
  • stop a thrashing process from harming overall performance
    (the hard fault rate is itself allocated and balanced)

3. manage file systems
  • space allocation
  • reduce fragmentation
  • fair and balanced allocation of IO capacity

Measuring this stuff is admittedly hard. You need a suite of synthetic
applications that stress all the above resources, and you need to
measure their aggregate throughput over an extended time period.
A responsiveness benchmark is also needed, one that measures
interactive response time on a fully loaded system.
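
To make the responsiveness part concrete, here is a rough Python
sketch of the kind of test I have in mind. It is only an
illustration under my own assumptions, not an established benchmark:
a probe sleeps for a few milliseconds at a time and records how late
each wake-up is, first on an idle system and then while CPU-bound
worker processes oversubscribe the cores. The names (burn_cpu,
wakeup_delays, summarize, run) and the default parameters are made
up for this example, and timer wake-up lateness is only a stand-in
for real interactive latency. It says nothing about paging or IO.

```python
import multiprocessing as mp
import os
import statistics
import time


def burn_cpu(stop_event):
    """Busy-loop until told to stop; stands in for a CPU-bound batch job."""
    counter = 0
    while not stop_event.is_set():
        for _ in range(10000):
            counter += 1


def wakeup_delays(samples=200, sleep_s=0.005):
    """Sleep repeatedly and record how late each wake-up is, in ms."""
    delays = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(sleep_s)
        late = time.perf_counter() - start - sleep_s
        delays.append(max(late, 0.0) * 1000.0)
    return delays


def summarize(delays):
    """Reduce a list of delays to median, ~99th percentile and maximum."""
    ordered = sorted(delays)
    p99 = ordered[min(len(ordered) - 1, int(0.99 * len(ordered)))]
    return {
        "median_ms": statistics.median(ordered),
        "p99_ms": p99,
        "max_ms": ordered[-1],
    }


def run(load_factor=2, samples=200):
    """Measure wake-up lateness idle, then with CPU demand over 100%."""
    idle = summarize(wakeup_delays(samples))

    # Assumption: load_factor workers per CPU is "fully loaded" enough.
    stop = mp.Event()
    n_workers = load_factor * (os.cpu_count() or 1)
    workers = [mp.Process(target=burn_cpu, args=(stop,))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    try:
        loaded = summarize(wakeup_delays(samples))
    finally:
        stop.set()
        for w in workers:
            w.join()
    return idle, loaded


if __name__ == "__main__":
    idle, loaded = run()
    print("idle:  ", idle)
    print("loaded:", loaded)
```

A real suite would also need memory-hungry and IO-heavy workers
running at the same time, and a much longer run, before the loaded
numbers mean much.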

I would like to see how the Linux kernel measures up against some of
the older Unix kernels, such as BSD or SunOS.