
Does Linux SMP perform at its optimum in Intel Core i7?


  • Does Linux SMP perform at its optimum in Intel Core i7?

    This is a general technical question about Symmetric Multiprocessing in relation to the Intel Core i7. (Actually, it's more about Hyperthreading than i7.)

    My understanding is that Symmetric Multiprocessing means that the operating system treats all processors as identical, and therefore it can assign a new thread to any free processor.

    I also understand that the i7 appears to the OS as 8 processors, numbered 0 through 7.

    But from a performance perspective, the 8 processors are not identical, because each core has 1 execution engine plus the ability to store the state of 2 threads. This is, as far as I understand it, the essence of Intel's Hyperthreading. So each execution engine can quickly switch between 2 threads without bothering the OS, but only one thread at a time actually executes.

    Suppose, for example, all processors are idle and the OS assigns one thread to processor 0, then has a second thread to assign. Since all free processors are considered identical, the OS could assign the second thread to any free processor, say 1. The result would be that both threads are competing for the same execution engine, while the other 3 cores remain idle.
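    Whether logical processors 0 and 1 really share an execution engine depends on how a given system enumerates them. As a rough sketch (not part of the original post), a Linux system that exposes the sysfs file `thread_siblings_list` for each CPU can be asked which logical processors sit on the same physical core. The helper names `parse_cpu_list` and `physical_cores` are made up for this illustration.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-1" or "0,4" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def physical_cores(sysfs_root="/sys/devices/system/cpu"):
    """Group logical CPUs by the hardware threads that share one physical core."""
    groups = set()
    pattern = "cpu[0-9]*/topology/thread_siblings_list"
    for path in Path(sysfs_root).glob(pattern):
        groups.add(tuple(parse_cpu_list(path.read_text())))
    return sorted(groups)


if __name__ == "__main__":
    for siblings in physical_cores():
        print("logical CPUs sharing one core:", siblings)
```

    On a machine with Hyperthreading enabled, each printed group should contain two logical CPUs; the numbering of the pairs varies between systems.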

    Is my understanding of SMP correct?

    If so, does Linux SMP take full advantage of Intel Hyperthreading (which existed also in earlier Intel processors)? That is, does Linux SMP assign threads in such a way that it attempts to choose an idle execution engine?

    I don't know what algorithm SMP actually uses to choose the processor. For example, it could be the first free processor, or any free processor chosen at random.

    How about Windows? How does it handle the situation? (Sorry to mention Windows here, but it would be interesting to know the answer.)

    Thanks. Gerry

  • #2
    In the kernel config, there is an option called "SMT (Hyperthreading) scheduler support".
    SMT scheduler support improves the CPU scheduler's decision making when dealing with Intel Pentium 4 chips with HyperThreading at a cost of slightly increased overhead in some places.
    So the scheduler is definitely aware of hyperthreading, probably for the same reason you thought of.
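    The option quoted above corresponds to the Kconfig symbol `CONFIG_SCHED_SMT`. A minimal sketch for checking whether a running kernel was built with it follows. It assumes the configuration is available either as `/proc/config.gz` (only present when the kernel exports it) or as `/boot/config-<release>` (a common distribution convention, not a guarantee). The function names are hypothetical.

```python
import gzip
import platform
from pathlib import Path


def config_value(config_text, symbol):
    """Return the value of a Kconfig symbol ("y", "m", ...), "n" if unset, None if absent."""
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(symbol + "="):
            return line.split("=", 1)[1]
        if line == f"# {symbol} is not set":
            return "n"
    return None


def read_kernel_config():
    """Best-effort lookup of the running kernel's config; returns None if not found."""
    proc_config = Path("/proc/config.gz")
    if proc_config.exists():
        return gzip.decompress(proc_config.read_bytes()).decode()
    boot_config = Path("/boot") / f"config-{platform.release()}"
    if boot_config.exists():
        return boot_config.read_text()
    return None


if __name__ == "__main__":
    text = read_kernel_config()
    if text is None:
        print("kernel config not found in the assumed locations")
    else:
        print("CONFIG_SCHED_SMT =", config_value(text, "CONFIG_SCHED_SMT"))
```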


    • #3
      I know this is kind of a different situation, but my brother has an old P4 HT that dualboots Linux and Windows Vista (both of which were optimized by me), and I see much better performance and usability on Linux. In Windows, games would often be jerky and stall up, but in Linux they ran smoothly. Also a few benchmarks resulted in higher scores on Linux.

      To sum that up, it appears to me that Linux handles hyperthreading better than Windows.


      • #4
        Core i7 isn't that special. Hyperthreading has been here for quite a while.

        On top of this, lots and lots of benchmarks have been run on Core i7's Xeon relatives. They all show that the Nehalem architecture runs fine under Linux.

        I believe the CPU will choose what core executes what and not the OS. Hence the problem is non existent.


        • #5
          Originally posted by lordmozilla
          I believe the CPU will choose what core executes what and not the OS. Hence the problem is non existent.
          Then how do you bind to a core?
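          For reference, binding is something the OS exposes directly. On Linux, Python wraps the affinity system calls as `os.sched_setaffinity` and `os.sched_getaffinity`. The sketch below pins the calling process to one logical CPU; `pin_to_cpu` is a hypothetical helper name, and the calls are only available on platforms that support them (Linux in practice).

```python
import os


def pin_to_cpu(cpu, pid=0):
    """Restrict a process (pid 0 means the calling process) to one logical CPU."""
    os.sched_setaffinity(pid, {cpu})
    # Read the mask back so the caller can confirm what the kernel accepted.
    return os.sched_getaffinity(pid)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    print("allowed before pinning:", allowed)
    print("allowed after pinning: ", sorted(pin_to_cpu(allowed[0])))
```

          The `taskset` command does the same from a shell, for example `taskset -c 0 some_program`.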


          • #6
            I hear conflicting reports; it stands to reason that the CPU chooses a core/thread and not the operating system, for the very reason you mention.

            However, I've also heard that Intel contributed code recently to the kernel that allows the kernel to treat "cores" 0, 2, 4, and 6 differently to the odd-numbered "cores", in other words making sure that the real cores are busy before allocating threads to the hyperthreading "cores".

            Perhaps the reports aren't conflicting, and hyperthreading works better when the OS is also aware of the CPU's microarchitecture.
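            The placement idea described above, filling idle physical cores before doubling up on a hyperthreaded sibling, can be sketched as a toy model. This is an illustration only, not the kernel's actual algorithm. The pairing of logical CPUs (0,1), (2,3), (4,5), (6,7) is an assumption chosen to match the even/odd description in this post, and `pick_cpu` is a hypothetical name.

```python
def pick_cpu(cores, busy):
    """Choose a logical CPU for a new thread.

    cores: list of tuples, each holding the logical CPUs of one physical core.
    busy:  set of logical CPUs that already run a thread.
    Returns a logical CPU id, or None if every logical CPU is busy.
    """
    # First pass: prefer a core on which no hardware thread is busy.
    for siblings in cores:
        if not any(cpu in busy for cpu in siblings):
            return siblings[0]
    # Second pass: fall back to any idle hardware thread on a partly busy core.
    for siblings in cores:
        for cpu in siblings:
            if cpu not in busy:
                return cpu
    return None


if __name__ == "__main__":
    cores = [(0, 1), (2, 3), (4, 5), (6, 7)]  # assumed sibling pairing
    busy = set()
    for n in range(1, 6):
        cpu = pick_cpu(cores, busy)
        busy.add(cpu)
        print(f"thread {n} -> logical CPU {cpu}")
```

            With this model, the first four threads land on CPUs 0, 2, 4 and 6, and only the fifth shares a core.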


            • #7
              The OS chooses which physical core runs which threads and re-thinks those decisions at regular intervals.

              However, the choice between the two hyperthreaded threads on the same physical core is up to the CPU, because those switches are frequent and shouldn't involve the overhead of an OS call. Each physical core is assigned two threads by the OS and executes them as it sees fit.

              Be aware though that hyperthreading can lead to significant performance losses due to cache thrashing, depending on the application(s) you run. If you're going for high-performance calculations, do some benchmarks before starting a week-long calculation.


              • #8
                I have dual Xeon E5410s in a box I built a year ago.
                This fall I built a Core i7 860 mATX board to go on the road.
                At the end of last year I built two systems, each with a pair of E5520s.

                I do some light video encoding, and development on massively threaded scientific apps.

                Some things I've noticed:
                - I've never seen a case where the dual E5410s outperform the i7 860 system. The i7 860 is consistently 10-20% faster, especially on SMP.
                - The dual E5520s are faster than the i7 860, although I've not really benchmarked. I was recently tuning some threading that had IO problems, and the E5520s seemed to run about 33% faster or so... hand-wavy numbers, though.
                - On single-threaded work the i7 860 beats the other 2 systems (not surprising).

                The hyperthreading scheduler is smart enough to take 4 active threads and schedule them on every other CPU. I've seen about the same behaviour with the E5520s also.