Linux 5.16's New Cluster Scheduling Is Causing Regression, Further Hurting Alder Lake


  • #41

    Originally posted by uxmkt
    That's what the downvote button is for, to spot, tag and filter misleading opinions. We do have a downvote button, don't we? We don't? Michael? Michael! Damn, youtube is trying to get rid of the downvotes too!
    It's pretty well established that the downvote button is for large corps to pay bots to use to hide opinions they don't want people to see. (YouTube originally countered this by having downvotes promote a video the same as upvotes.)
    Much like paid-for reviews.

    Originally posted by jaxa

    Not necessarily. For example, you can find cases where a 64-core Threadripper 3990X offers no benefit over a 32-core Threadripper 3970X, because software isn't using all of the cores. On the other hand, most benchmarking deals with one application running at a time, and you can expect applications to support more cores in the future.

    Meanwhile on the AMD side, AMD can start using more energy as the AM5 socket is designed to support a 170W TDP instead of 105W (peak power draw is higher). That should help them hit higher clock speeds and support more cores at the top. AMD is also rumored to be doing their own take on big.LITTLE with the newly announced, denser Zen 4c cores. If it serves no market, then why would AMD also be doing it? Maybe because it's actually a good idea, once all the scheduling/regression issues are ironed out. AMD should benefit from being a couple years late to the party.
    Yep, but I just don't get the user benefit of e-cores.
    They are useless to the HPC crowd.
    Desktop already idles cores that aren't needed, shuffling load around to get a good thermal balance.

    Whose need (other than Intel's) do they serve?
    Last edited by mSparks; 15 November 2021, 12:28 PM.



    • #42
      Originally posted by mSparks
      Yep, but I just don't get the user benefit of e-cores.
      They are useless to the HPC crowd.
      Desktop already idles cores that aren't needed, shuffling load around to get a good thermal balance.

      Whose need (other than Intel's) do they serve?
      2 e-cores have more performance than a single p-core on multithreading tasks that scale well, while consuming half the power and die space. The downside is that their single-threaded performance is terrible in comparison.

      That includes HPC workloads (unless they're bottlenecked on a single thread).

      The same type of workloads that tended to run well on ARM server chips will probably do well on the e-cores too. (Workloads that don't need a fast single thread, but can scale to high core counts)

      Right now Intel is bottlenecked by power use: they literally couldn't fit more than 8 p-cores on a chip like the 12900K without power use forcing the clocks way down under multi-threaded load. Adding e-cores lets them run those workloads faster within the same power budget.

      I'm not personally convinced that the Intel implementation of e-cores was done particularly well, but the concept does seem sound overall. Expect future products to do it better.
      Last edited by smitty3268; 15 November 2021, 12:40 PM.
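To make the power-budget argument in the post above concrete, here is a minimal sketch. Every figure in it is a made-up relative unit chosen only to match the claim that two e-cores slightly outperform one p-core at roughly half its power; none of it is a measurement of the 12900K or any other chip, and it ignores clock scaling and hyperthreading.

```python
# Hypothetical relative units, NOT measured data.
# Assumption: one p-core = 1.0 perf at 1.0 power;
# two e-cores together = 1.1 perf at 0.5 power.
P_PERF, P_POWER = 1.0, 1.0
E_PERF, E_POWER = 0.55, 0.25


def throughput(p_cores, e_cores):
    """Aggregate throughput on a workload that scales across all cores."""
    return p_cores * P_PERF + e_cores * E_PERF


def power(p_cores, e_cores):
    """Aggregate power draw with every core fully loaded."""
    return p_cores * P_POWER + e_cores * E_POWER


# Two layouts that draw the same power under these assumptions.
hybrid = (8, 8)   # 8 p-cores + 8 e-cores
all_p = (10, 0)   # 10 p-cores, no e-cores

for name, layout in (("8+8", hybrid), ("10+0", all_p)):
    print(name, "power:", power(*layout), "throughput:", throughput(*layout))
```

Under these assumed ratios the 8+8 layout gets more multi-threaded throughput than 10+0 for the same power. The result is only as good as the assumed ratios, and single-threaded speed is not modelled at all.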



      • #43
        Originally posted by ms178
        For those wondering, there was also a performance regression on AMD's EPYC spotted with this patchset […]
        But of course, that doesn't make any headlines because AMD is a holy entity.



        • #44
          Originally posted by F.Ultra

          Support for the arch, yes, but that does not mean that the scheduler does things right. You can e.g. configure big.LITTLE so that a low frequency from the governor means run on little and a high frequency means run on big, so the hw does automatic switching. I'm not sure there is support yet for running on both types of cores at the same time with correct scheduling.
          What do you know about Michael's recent article?


          "The Linux kernel has long been catering to Arm's big.LITTLE designs and supporting features around energy aware scheduling and other software improvements on that front, including this work in 5.15 around proper scheduling of tasks if certain cores have reduced capabilities, but we haven't seen anything on the Intel front in the scheduler or power management areas."



          • #45
            Originally posted by smitty3268

            2 e-cores have more performance than a single p-core on multithreading tasks that scale well, while consuming half the power and die space. The downside is that their single-threaded performance is terrible in comparison.
            If this was true, there would be no need for more than one or two p cores on desktop chips.

            Pretty sure it's not true, or there wouldn't be a minimum of 6 on every chip they just released.



            • #46
              Originally posted by Vistaus

              But of course, that doesn't make any headlines because AMD is a holy entity.
              AFAIK, there was never a 'performance' regression from the cluster scheduling on the AMD side... IIRC from what came up on the kernel mailing list, it was a regression over spamming of dmesg with topology information due to not being supported.
              Michael Larabel
              https://www.michaellarabel.com/



              • #47
                Originally posted by Michael

                AFAIK, there was never a 'performance' regression from the cluster scheduling on the AMD side... IIRC from what came up on the kernel mailing list, it was a regression over spamming of dmesg with topology information due to not being supported.
                Ah, I see. I assumed that the person I quoted was talking about the performance. In that case, I apologize and you were right not to post an article about it.



                • #48
                  Originally posted by mSparks

                  If this was true, there would be no need for more than one or two p cores on desktop chips.

                  Pretty sure it's not true, or there wouldn't be a minimum of 6 on every chip they just released.
                  The initial wave of Alder Lake desktop CPUs is marketed towards gamers. Putting in 6-8 P-cores ends the argument on single-threaded performance by giving most users enough of it. Game developers are moving towards using up to 8 cores and 16 threads because of the Xbox Series X/S and PS5 consoles, so Intel puts 8 P-cores in the i7/i9. Future game engines could hypothetically stress all 8 big cores because they are optimized for 8 Zen 2 cores, but they are not likely to require more than that, and could be just fine with a fast 6+0 or 6+4. Since console generations last 7 years or so, it will be like this for most of the 2020s. Intel will argue that having the full 8+8 or Raptor Lake's 8+16 helps with various background tasks.

                  Intel is making a 2+8 ultra mobile die which will also be cut down to 1+4 (Lakefield all over again). Those will be at TDPs ranging from 5-20 Watts, and should be a good test of how many P-cores non-gamers need, and how much improvement there is to be gained over previous-generation Atom-only chips (0+4). I think a lot of people would be fine with a 2+8 U15 chip with 96 EUs of integrated graphics.



                  • #49
                    Originally posted by mSparks
                    If this was true, there would be no need for more than one or two p cores on desktop chips.
                    Code:
                    tar | gzip | gpg | ssh
                    benefits from more than 2 but fewer than 6 high-performance cores. Any application that is programmed with threads for specific tasks instead of a thread pool has the same property.
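A rough sketch of why a pipeline like that tops out at a handful of cores: each stage is one dedicated worker, so at most four workers are ever busy no matter how many cores the machine has. The stage functions below are hypothetical stand-ins that just tag a string, not real tar/gzip/gpg/ssh calls. Python threads also share the GIL, so this shows the structure rather than real parallel speedup.

```python
import queue
import threading


def stage(transform, inbox, outbox):
    """Run one pipeline stage on its own dedicated thread.

    Items are handled in arrival order; None marks end-of-stream.
    """
    while True:
        item = inbox.get()
        if item is None:
            outbox.put(None)  # pass the sentinel downstream, then exit
            return
        outbox.put(transform(item))


def run_pipeline(transforms, items):
    """Chain transforms like a shell pipeline, one thread per stage."""
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(None)  # signal end of input

    results = []
    while (out := queues[-1].get()) is not None:
        results.append(out)
    for t in threads:
        t.join()
    return results, len(threads)


# Hypothetical stand-ins for the four stages of `tar | gzip | gpg | ssh`.
STAGES = [
    lambda s: s + ">tar",
    lambda s: s + ">gzip",
    lambda s: s + ">gpg",
    lambda s: s + ">ssh",
]

if __name__ == "__main__":
    results, workers = run_pipeline(STAGES, ["a", "b"])
    print(workers, "workers:", results)
```

A thread pool would instead size its worker count to the machine, which is why it can spread over many small cores while a fixed-stage design like this cannot.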



                    • #50
                      Originally posted by jaxa

                      The initial wave of Alder Lake desktop CPUs is marketed towards gamers. Putting in 6-8 P-cores ends the argument on single-threaded performance by giving most users enough of it.
                      If what you said was true, they could get the same single-thread performance and nearly twice the multi-thread performance at a fraction of the power consumption with one p-core and 21 e-cores instead of 8 p-cores and 8 e-cores.

                      But what you said isn't true, so the 12900K has 8 p-cores, 8 e-cores and still draws a whopping 250W at 100% utilisation.

                      It also seems unlikely that any fractional lead in performance they have over AMD's 12-month-old chips will hold once AMD also moves to DDR5.

                      And I still don't get the point of all this messing around writing new software to support something that draws so much more power just to about equal AMD's performance from last year.


