Intel P-State Driver Begins Preparing For Hybrid Processors (Alder Lake)


  • Intel P-State Driver Begins Preparing For Hybrid Processors (Alder Lake)

    Phoronix: Intel P-State Driver Begins Preparing For Hybrid Processors (Alder Lake)

    Intel's Linux preparations for Alder Lake -- and more broadly the concept of hybrid x86_64 CPUs with a mix of large Core and small Atom cores -- continues with the P-State CPU frequency scaling driver seeing new work to prepare for Intel's hybrid era...

    https://www.phoronix.com/scan.php?pa...eps-For-Hybrid
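For readers who want to check which frequency scaling driver their own kernel is using, the standard Linux cpufreq sysfs layout exposes it (a minimal sketch; on a system running this driver the file typically reads "intel_pstate", and the function simply returns None where the path doesn't exist):

```python
# Minimal sketch: read the active cpufreq scaling driver from sysfs.
# Uses the stock Linux cpufreq path; returns None on non-Linux systems,
# kernels without cpufreq, or a CPU index that doesn't exist.
def scaling_driver(cpu: int = 0):
    path = f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_driver"
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None

print(scaling_driver())
```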

  • #2
    I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
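One way to actually try that experiment on Linux is to cap scaling_max_freq for half the cores through the cpufreq sysfs interface (a hedged sketch: the paths follow the standard cpufreq layout, writing them requires root, and the core range and frequency in the example are made-up illustrations, not recommendations):

```python
# Sketch: emulate a hybrid CPU by frequency-capping half the cores.
# Writing scaling_max_freq requires root; values are in kHz.
def max_freq_path(cpu: int) -> str:
    return f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_max_freq"

def cap_cores(cpus, khz: int) -> None:
    for cpu in cpus:
        with open(max_freq_path(cpu), "w") as f:
            f.write(str(khz))

# Hypothetical example: cap cores 4-7 of an 8-core part to 1.2 GHz.
# cap_cores(range(4, 8), 1_200_000)
print(max_freq_path(4))
```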



    • #3
      Originally posted by [email protected] View Post
      I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
      If I recall correctly, in the WearOS / smartwatch situation with ARM big.LITTLE, many tasks just want a little background activity, like constantly checking your heart rate or monitoring a sensor. This kind of thing sits in a gray zone of "how do we do this efficiently?". The issue a high-performance core faces is that it generally wants a minimum speed or activity level before you wake it up. The low-power little cores sometimes may not even be as energy-efficient, BUT they are far better suited to these tiny workloads where you don't want to wake the big cores. If you're doing something crazy intense then certainly, the big cores are suited to the task both performance- and efficiency-wise.

      A quick example: say your big core likes to run at 900 MHz and up. A task that theoretically needs a constant 300 MHz isn't efficient on the big core, since the core doesn't operate at that clock speed. You'd put that load on a little core instead of either running the big core at 900 MHz constantly and wasting power, or mixing idle periods with wake-ups to 900 MHz, which also wastes power.

      I'm sure someone on Phoronix knows far better than me, but this is how I understand one of the main problems big-little architecture is designed to solve.
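The 300 MHz vs. 900 MHz example above can be put into rough numbers with the classic dynamic-power model P = C·V²·f (a toy calculation: the capacitance and voltage figures are invented for illustration, and leakage/idle power is ignored, which actually flatters the big core):

```python
# Toy dynamic-power comparison: P = C * V^2 * f (switching power only).
# All constants below are illustrative assumptions, not measurements.
def energy_j(cap_f: float, volts: float, freq_hz: float, seconds: float) -> float:
    return cap_f * volts**2 * freq_hz * seconds

# Workload: 300 MHz-equivalent of work over one second.
# Big core races at its 900 MHz floor for 1/3 s, then idles (idle cost ignored).
big = energy_j(cap_f=2e-9, volts=1.0, freq_hz=900e6, seconds=1/3)

# Little core runs steadily at 300 MHz, at lower voltage and smaller capacitance.
little = energy_j(cap_f=0.5e-9, volts=0.7, freq_hz=300e6, seconds=1.0)

print(f"big: {big:.3f} J  little: {little:.4f} J")
```

Even with idle power ignored, the smaller switched capacitance and lower voltage of the little core dominate, which is the gray-zone case described above.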
      Last edited by Mitch; 24 May 2021, 05:34 PM. Reason: Clarity



      • #4
        Originally posted by [email protected] View Post
        I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
        By modern CPU, I mean a conventional one, like an 8-core AMD 5800X or an Intel 11700K.



        • #5
          Originally posted by [email protected] View Post
          I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
          You seem to have missed the point of Alder Lake. The goal here is maximum performance per watt and per unit of die area. The architecture of the Gracemont cores themselves is what allows this idea to succeed, not their clock speed. Think of the little cores as providing about 40% (significant) of the performance of the big cores while taking only a quarter of the die space.
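Taking those figures at face value (40% performance at 1/4 the area, both the poster's estimates rather than Intel specifications), the throughput-per-area arithmetic works out like this:

```python
# Throughput per unit die area, normalized to one big core = 1.0 perf in 1.0 area.
# 40% perf and 25% area are the estimates from the post above, not specs.
little_perf = 0.40
little_area = 0.25

# Four little cores fit in one big core's footprint, so a full cluster
# delivers little_perf / little_area times a big core's throughput.
cluster_throughput = little_perf / little_area
print(cluster_throughput)  # 1.6
```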



          • #6
            Originally posted by [email protected] View Post
            I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
            You kind of get this with AMD. They managed to stuff 8 cores into their 15W TDP parts with the 7nm node. They originally thought they would only be able to do 6 cores for Renoir. One thing you get with these high homogeneous core counts is the ability to switch over to another core if part of the die is overheating.

            Gaming is moving towards sustained use of 8 cores due to their inclusion in the Xbox Series X/S and PS5 consoles. But many users won't need much more than that. So in 3 years or so when AMD has at least 12 cores on mobile and 24-32 cores on mainstream desktop, the glut of unused cores can act as the "little cores", but work at their full potential (at a lower all-core turbo clock speed) for a highly multi-threaded workload.

            Originally posted by badger2k View Post

            You seem to have missed the point of Alder Lake. The goal here is maximum performance per watt and per unit of die area. The architecture of the Gracemont cores themselves is what allows this idea to succeed, not their clock speed. Think of the little cores as providing about 40% (significant) of the performance of the big cores while taking only a quarter of the die space.
            I don't know that we can expect 40%. My guess has been 33% the performance at 25% the die size. That is an improvement, but we'll see how well operating systems handle heterogeneous x86.

            (If 33% perf / 25% die size sounds bad, there is still the factor of power efficiency to consider. The performance per Watt should trounce the big cores.)
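Running both estimates from the thread through the same per-area arithmetic (these perf/area numbers are forum guesses, not measured data) shows that even the conservative figure still beats a big core per unit of die area:

```python
# Compare the two thread estimates of little-core throughput per die area.
# Normalized so one big core = 1.0 perf in 1.0 area; numbers are guesses.
estimates = {"40% perf @ 25% area": (0.40, 0.25),
             "33% perf @ 25% area": (0.33, 0.25)}

for label, (perf, area) in estimates.items():
    print(f"{label}: {perf / area:.2f}x throughput per unit area")
```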

            AMD has managed to put 8 big cores in a small die area, and that will continue to shrink as they move on to 5nm (Zen 4), and 3nm (Zen 5/6?). It's possible that just adding more big cores is less of a hassle for everyone, particularly on desktop which does not have the power constraints of mobile.

            There is already a rumor that AMD will debut its own big/little implementation with Zen 5. Furthermore, they have a patent for a different approach that could allow all the scheduling to be handled by the CPU, with no major changes needed in the OS. It almost seems like a better version of Bulldozer, with a group of small cores subordinate to big cores and sharing the same cache.
            Last edited by jaxa; 25 May 2021, 10:03 AM.



            • #7
              Originally posted by badger2k View Post

              You seem to have missed the point of Alder Lake. The goal here is maximum performance per watt and per unit of die area. The architecture of the Gracemont cores themselves is what allows this idea to succeed, not their clock speed. Think of the little cores as providing about 40% (significant) of the performance of the big cores while taking only a quarter of the die space.
              I'm aware of their goals. My point is, since implementing a big/little architecture is not trivial, how much efficiency gain can one get via software alone, without resorting to big/little?

              I understand Intel has its back against the wall and has to get creative, because its lithography is behind its competitors', but AMD has no such constraints.



              • #8

                Originally posted by [email protected] View Post
                I wonder how much power saving could be achieved by simply down-clocking half the cores of a modern CPU to emulate a hybrid one like this, while keeping the option to run them at full clock when power/heat is not a problem and full performance is needed.
                Originally posted by Mitch View Post

                If I recall correctly, in the WearOS / smartwatch situation with ARM big.LITTLE, many tasks just want a little background activity, like constantly checking your heart rate or monitoring a sensor. This kind of thing sits in a gray zone of "how do we do this efficiently?". The issue a high-performance core faces is that it generally wants a minimum speed or activity level before you wake it up. The low-power little cores sometimes may not even be as energy-efficient, BUT they are far better suited to these tiny workloads where you don't want to wake the big cores. If you're doing something crazy intense then certainly, the big cores are suited to the task both performance- and efficiency-wise.

                A quick example: say your big core likes to run at 900 MHz and up. A task that theoretically needs a constant 300 MHz isn't efficient on the big core, since the core doesn't operate at that clock speed. You'd put that load on a little core instead of either running the big core at 900 MHz constantly and wasting power, or mixing idle periods with wake-ups to 900 MHz, which also wastes power.

                I'm sure someone on Phoronix knows far better than me, but this is how I understand one of the main problems big-little architecture is designed to solve.
                One thing I feel I should point out is that the "little" cores aren't so little. They're rumored to have IPC similar to 10th gen and clock speeds around 3 GHz. So expect them to be fully utilized in high-performance workloads as well.

