Intel i9-12900K Alder Lake Linux Performance In Different P/E Core Configurations


  • #81
    Originally posted by mdedetrich View Post

    I am getting the impression that the biggest issue is Intel trying to provide a solution for something that, at least from my OS studies back at uni, is not really solvable: automagic scheduling on a big.LITTLE design that generally works better than the alternative. big.LITTLE designs work best when developers specifically code into their applications how to use the cores, e.g. a virus scanner would pretty much always want an E core, as would background tasks like checking for email or indexing for fuzzy file search.

    This is because knowing what should run on an E-core and what should run on a P-core is primarily a subjective thing; when deliberately making something slower in order to save power is acceptable depends on the context of the application.
    Processor architectures are created by humans, as are operating system kernels and schedulers. Even an NP-complete problem can be attacked well enough with crude heuristics.

    Optimus is supposed to know which applications should run on the powerful Nvidia GPU while lesser applications get punted off to the iGPU. It sounds magical but really isn't: what Nvidia implemented was a dictionary of applications with static declarations of what should run where. It worked well enough that it did not get in the way of users who knew nothing about Optimus, while the more savvy ones edited said dictionary to suit their individual preferences.
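For illustration, that dictionary approach is just a lookup table with a default; a toy sketch (the app names and the `gpu_for` helper are made up here, not Nvidia's actual profile format):

```python
# Toy sketch of an Optimus-style static application profile:
# known apps are statically assigned a GPU, everything else
# falls back to the power-saving iGPU.
PROFILES = {
    "game.exe": "dgpu",       # demanding title -> discrete GPU
    "blender.exe": "dgpu",
    "notepad.exe": "igpu",    # lightweight app -> integrated GPU
}

def gpu_for(app: str) -> str:
    """Return which GPU an application should be run on."""
    return PROFILES.get(app, "igpu")  # unknown apps default to the iGPU

# A savvy user can "edit the dictionary" for their own preferences:
PROFILES["browser.exe"] = "dgpu"
```

The point is that there is no runtime cleverness at all: correctness comes entirely from the curated table, which is why users who never heard of Optimus were unaffected.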



    • #82
      Originally posted by Sonadow View Post

      Optimus is supposed to know which applications should run on the powerful Nvidia GPU while lesser applications get punted off to the iGPU. It sounds magical but really isn't: what Nvidia implemented was a dictionary of applications with static declarations of what should run where. It worked well enough that it did not get in the way of users who knew nothing about Optimus, while the more savvy ones edited said dictionary to suit their individual preferences.
      Yes, but GPUs have a very different workload; they are much more specialized than general-purpose CPUs. Also, Optimus is incredibly crude: it just watches how much load you have on the GPU and, if the load is very high, moves the work to the discrete GPU.

      What we are dealing with in big.LITTLE is an entirely different problem, and it has already been shown numerous times that the scheduler does close to jack shit when it comes to optimizing anything. Steve from Gamers Nexus did extensive performance testing of Windows 10 (which has no hybrid-aware scheduler) against Windows 11, and the difference is almost completely negligible (we're talking <1%, if anything).

      This is what I meant when I said earlier that a "completely magical scheduler" doesn't really exist, at least not one that makes enough of a difference.



      • #83
        Originally posted by mdedetrich View Post

        Yes, but GPUs have a very different workload; they are much more specialized than general-purpose CPUs. Also, Optimus is incredibly crude: it just watches how much load you have on the GPU and, if the load is very high, moves the work to the discrete GPU.

        What we are dealing with in big.LITTLE is an entirely different problem, and it has already been shown numerous times that the scheduler does close to jack shit when it comes to optimizing anything. Steve from Gamers Nexus did extensive performance testing of Windows 10 (which has no hybrid-aware scheduler) against Windows 11, and the difference is almost completely negligible (we're talking <1%, if anything).

        This is what I meant when I said earlier that a "completely magical scheduler" doesn't really exist, at least not one that makes enough of a difference.
        And yet Alder Lake performs much better on Windows than it does on Linux.

        There is definitely something going on in the Windows scheduler, that much is for sure. Microsoft is no stranger to big.LITTLE; it dealt with it before in Windows RT on the Tegra 3, and recently worked with it again on the SQ1 and SQ2 for the Surface Pro X. It would hardly be a surprise if the lack of a performance difference for Alder Lake between Windows 10 and Windows 11 simply comes down to Microsoft having already worked on the scheduler to the point where the Windows 10 version punts jobs between the P cores and the E cores properly.



        • #84
          Originally posted by mdedetrich View Post

          I am getting the impression that the biggest issue is Intel trying to provide a solution for something that, at least from my OS studies back at uni, is not really solvable: automagic scheduling on a big.LITTLE design that generally works better than the alternative. big.LITTLE designs work best when developers specifically code into their applications how to use the cores, e.g. a virus scanner would pretty much always want an E core, as would background tasks like checking for email or indexing for fuzzy file search.

          This is because knowing what should run on an E-core and what should run on a P-core is primarily a subjective thing; when deliberately making something slower in order to save power is acceptable depends on the context of the application.
          Intel is not trying to provide an "automagic" solution.

          Developers can still choose to schedule workloads directly onto each processor type (or, ideally, give the OS affinity hints about the workload type); see this:
          "However, it may be more optimal to run background worker threads on the Efficient-cores. The API references in the next section list many of the functions available, ranging from those providing OS-level guidance through weak affinity hints, such as SetThreadIdealProcessor() and SetThreadPriority(), through stronger control like SetThreadInformation() and SetThreadSelectedCPUSets(), to the strongest control of affinity using SetThreadAffinityMask()."

          The idea is that both the software and the hardware (Thread Director) provide hints to the OS (Windows, in this case), and the Windows scheduler matches the workload to the right core. Intel isn't forcing any sort of automagic scheduling; again, Thread Director only gives the OS hints about the current state of the cores.

          See the diagram and description in this section: "Intel® Thread Director and Operating System Vendor (OSV) Optimizations for the Performance Hybrid Architecture"

          And to reiterate: a developer can also still choose hard affinities if they want, "through stronger control like SetThreadInformation() and SetThreadSelectedCPUSets(), to the strongest control of affinity using SetThreadAffinityMask()".
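As a point of comparison, the Linux analogue of that strongest control is sched_setaffinity(2), exposed in Python as os.sched_setaffinity; a minimal, Linux-only sketch that pins the current process to a single CPU and then undoes it:

```python
import os

# Save the current CPU mask so we can restore it afterwards.
original = os.sched_getaffinity(0)     # pid 0 = the calling process

# Hard-pin the calling process to one CPU (the Linux analogue of
# Windows' SetThreadAffinityMask, at process granularity).
target = min(original)                 # pick a CPU we are allowed to use
os.sched_setaffinity(0, {target})

# The kernel will now only schedule us on that CPU.
assert os.sched_getaffinity(0) == {target}

# Restore the original mask so nothing downstream is affected.
os.sched_setaffinity(0, original)
```

The weak-hint end of the Windows spectrum (SetThreadIdealProcessor, SetThreadPriority) has no exact Linux counterpart; niceness and cgroup weights are the closest equivalents.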

          All of the quotes are from this Intel developer guide: https://www.intel.com/content/www/us...per-guide.html



          • #85
            Originally posted by Sonadow View Post

            And yet Alder Lake performs much better on Windows than it does on Linux.

            There is definitely something going on in the Windows scheduler, that much is for sure. Microsoft is no stranger to big.LITTLE; it dealt with it before in Windows RT on the Tegra 3, and recently worked with it again on the SQ1 and SQ2 for the Surface Pro X. It would hardly be a surprise if the lack of a performance difference for Alder Lake between Windows 10 and Windows 11 simply comes down to Microsoft having already worked on the scheduler to the point where the Windows 10 version punts jobs between the P cores and the E cores properly.
            Linux doesn't currently have the code to tell E-cores and P-cores apart, so the scheduler treats them identically, with terrible results. That is easily fixed, and will be fixed soon.

            Windows 11 apparently leans heavily on Intel's hardware hints (Thread Director), but overrides them in certain places, for example by always making sure the graphical application that has focus gets put on a P-core. I'd be surprised if their scheduler is actually even on par with Linux's; it's just that theirs is done and working, while the Linux code is completely turned off at the moment, so it's doing something brain-dead.



            • #86
              Originally posted by smitty3268 View Post

              Windows 11 apparently leans heavily on Intel's hardware hints (Thread Director), but overrides them in certain places, for example by always making sure the graphical application that has focus gets put on a P-core.
              Which is the most basic principle that nobody seems to get.

              If an application is in focus, the user intends to use it right there and then. There is no reason it should get shafted by being assigned to a lower-priority queue, or to an E core in Alder Lake's case. This has been the default behavior in Windows since Windows Vista, unlike Linux, where every user application gets a nice value of 0 regardless of how many resources and how much focus time it gets.

              Originally posted by smitty3268
              I'd be surprised if their scheduler is actually even on par with linux, it's just that it's done and working while the linux code is completely turned off at the moment so it's doing something brain dead.
              Right, a production scheduler that Intel and Microsoft built for Intel's own hardware on Windows, and which is in widespread deployment right now, is inferior to the Linux scheduler; so much so that the "inferior" option actually works properly today, while the Intel developers have to keep the corresponding code in the Linux scheduler disabled for various reasons.



              • #87
                Birdie is definitely not an Intel/NVIDIA fanboy, he just possesses a normal need to defend a certain company and a certain product across a 9-page comment thread. Just that - a normal need to defend a company/product, NOT a fanboy :lol: Anyway.

                Alder Lake is a decent CPU. The main issue is PL2, which is insane on the i9: totally impractical and well beyond the optimal and sane operating parameters of such a CPU. I guess they needed the aggressive power mode to demonstrate more of an advantage over Zen 3 in the benchmarks. For me personally, none of the K-series SKUs except maybe the 12600K makes sense. The non-K and F SKUs will have all of the benefits and will cost less.

                As for the hybrid architecture on the desktop itself: it has potential in the future, where E-core counts will grow significantly. For Alder Lake, though, two more P cores instead of the 8 E cores would have made more sense. It would be a more universal architecture needing less software magic, and performance would be about the same.



                • #88
                  Originally posted by smitty3268 View Post

                  A much better "sweet spot" would be to just buy a 12700K. Mostly the same performance with much better power use and cost.
                  Not true, I have already debunked this.

                  Originally posted by Sonadow View Post
                  Fact remains that for all the benchmarks Michael has done showing Linux having better performance than Windows, they simply don't carry forward to real-world computing. To date nobody can provide a reasonable explanation as to why Windows boots, launches programs and generally responds to application input faster than Linux on the same hardware, especially on low-power hardware like Atoms.
                  Do you have any data to back this up? Because my experience is the total opposite of yours.



                  • #89
                    Originally posted by Sonadow View Post
                    Which is the most basic principle that nobody seems to get.
                    What about this principle do you believe that people don't get?

                    Originally posted by Sonadow View Post
                    If an application is in focus, it means that the user intends to use it right there and then. There is no reason it should get shafted by assigning it to a lower-priority queue or an E core in Alder Lake's case.
                    What if you are rendering a scene or encoding video in the background while working on your report in Word or reading something in the browser? What if your workload consists of multiple processes that exchange data and all require a lot of CPU power, like playing and streaming a video game at the same time?

                    Originally posted by Sonadow View Post
                    This has been the default behavior for Windows since Windows Vista, unlike Linux where every user application gets assigned a nice 0, regardless of the amount of resources and focus time it gets.
                    Niceness has nothing to do with this. The scheduling problem is about deciding, on the fly, how to allocate CPU resources. The decision logic must be accurate, fast, and mindful of the cost of its own calculations. This is not an easy problem, and you often run into situations where you would have to know the future in order to make a sane decision.
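To make that trade-off concrete, here is a toy cost model (every number and name in it is invented for illustration): migrating a thread to a faster core only pays off if the predicted time saved outweighs the one-off migration cost, and the prediction is exactly the "knowing the future" part.

```python
def should_migrate(predicted_remaining_ms: float,
                   speedup: float,
                   migration_cost_ms: float) -> bool:
    """Move a thread to a faster core only if the time saved by running
    faster exceeds the one-off cost of migrating it (cold caches,
    context switch). `predicted_remaining_ms` is a guess about the
    future - the genuinely hard part of scheduling."""
    time_saved = predicted_remaining_ms * (1 - 1 / speedup)
    return time_saved > migration_cost_ms

# A long-running job is worth moving to a P-core...
assert should_migrate(100.0, speedup=1.5, migration_cost_ms=5.0)
# ...while one that is nearly finished is not.
assert not should_migrate(5.0, speedup=1.5, migration_cost_ms=5.0)
```

A real scheduler faces the same structure with far worse inputs: the remaining runtime is unknown, the speedup varies per workload, and evaluating the model itself burns CPU time.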

                    Originally posted by Sonadow View Post
                    Right, a production-use scheduler that Intel and Microsoft worked on for Intel's own hardware on Windows and is currently in widespread deployment right now is inferior to the Linux scheduler,
                    Are you implying that the Linux CPU scheduler is not widely used in production?

                    Originally posted by Sonadow View Post
                    so much so that the inferior option actually works properly right now while the Intel developers have to keep the code disabled in the Linux scheduler for various reasons.
                    What code are you talking about here?



                    • #90
                      Originally posted by Anux View Post

                      Do you have any data to back this up? Because my experience is the total opposite of yours.
                      I have been running Debian with a custom-built kernel and Mesa on my Apollo Lake Atom laptop for the last three years.
                      Two weeks ago I threw Debian out and put Windows 11 on it. The difference in performance is immediately noticeable: web browsers and other heavy applications like productivity suites no longer randomly stall for a minute when scrolling through >20 tabs, or through multiple pages of a .docx file loaded with images, photos and tables.

