Linux 6.7 Will Let You Enable/Disable 32-bit Programs Support At Boot-Time


  • #21
    Originally posted by oiaohm View Post
    One of those things why X32 exists on Linux.


    The 32-bit native code is slower than the 64-bit native code on x86 CPUs. Yes, X32 was created for the cases where larger pointers caused performance issues.
    This is pure nonsense. X32 runs in 64-bit mode; it is just a software-defined ABI in which code doesn't use more than 4 GB of address space (including kernel memory allocations). That won't help at all with context switching from 32-bit compatibility mode to 64-bit mode. That switch is something the CPU is required to do.
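To make the X32 point concrete: the ABI only changes how wide pointers are, not which CPU mode the code runs in. The sketch below estimates the size of a C-style struct under natural alignment with 8-byte and 4-byte pointers. The node layout (two pointers plus a 32-bit payload) and the helper names are hypothetical, chosen only for illustration; real layouts depend on the compiler and the ABI.

```python
def struct_size(field_sizes):
    """Size in bytes of a C-style struct whose fields are naturally aligned."""
    offset = 0
    for size in field_sizes:
        # Pad so that each field starts at a multiple of its own size.
        offset = (offset + size - 1) // size * size
        offset += size
    # Round the total up to the alignment of the largest field.
    largest = max(field_sizes)
    return (offset + largest - 1) // largest * largest


def node_size(pointer_bytes):
    # Hypothetical list node: next pointer, prev pointer, 32-bit payload.
    return struct_size([pointer_bytes, pointer_bytes, 4])


print("8-byte pointers:", node_size(8), "bytes per node")
print("4-byte pointers:", node_size(4), "bytes per node")
```

Under these assumptions the node takes 24 bytes with 8-byte pointers and 12 bytes with 4-byte pointers, which is the memory and cache footprint difference X32 targets. It says nothing about the cost of switching between compatibility mode and 64-bit mode.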

    Also 32-bit native code is not slower than 64-bit code at all, lol. Well, not by default. Of course, if an application is written badly for 32-bit or is register starved, that's another thing. For drivers it likely does not matter. And those are probably going to be the biggest issue: imagine every Vulkan API call, which simply gets translated fast to the GPU, still having to pay that context switch overhead twice (when calling the native API and when returning).

    Comment


    • #22
      Originally posted by Weasel View Post
      Also 32-bit native code is not slower than 64-bit code at all, lol. Well, not by default. Of course, if an application is written badly for 32-bit or is register starved, that's another thing. For drivers it likely does not matter. And those are probably going to be the biggest issue: imagine every Vulkan API call, which simply gets translated fast to the GPU, still having to pay that context switch overhead twice (when calling the native API and when returning).
      Here you are attempting to pull my leg and play giggle bells.

      It's been benchmarked: pure x86 32-bit code is slower on average than its 64-bit equivalent.
      https://www.willus.com/ccomp_benchmark2.shtml?p18

      Here is a Windows person from 2011. x86 64-bit code is faster than x86 32-bit code; it does not matter whether you like it or not, Weasel, those are the facts. There is a price: 64-bit code has a slightly higher RAM and storage footprint due to using 64-bit pointers, which is why X32 was made.


      I quoted this for a reason. The difference is not just registers, but it is register related.

      Notice how the calling convention for 32-bit x86 code uses memory, whereas the calling convention on the 64-bit side of the CPU uses registers. The lack of a PIC prologue is also related to the calling convention.

      The reality is that a 32-bit x86 program always acts as if it is register starved; this is why 32-bit x86 programs have a memory-based calling convention. Using memory is slower than using registers, like it or not, so the 32-bit x86 calling convention is slower than its 64-bit one even after the optimizations applied by x86 CPU microcode.
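The calling convention difference described above can be sketched as a small model. It covers only integer and pointer arguments under the System V ABIs used by native Linux code: i386 cdecl passes everything on the stack, while x86-64 passes the first six in registers. The function and constant names are made up for this illustration. Windows code running under Wine follows the Microsoft conventions instead, where x64 passes the first four arguments in registers.

```python
# Integer/pointer argument registers of the System V x86-64 ABI, in order.
SYSV_X86_64_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def arg_locations(n_args, abi):
    """Where the first n_args integer/pointer arguments live at function entry."""
    if abi == "i386-cdecl":
        # The return address sits at [esp]; 4-byte arguments follow it on the stack.
        return [f"[esp+{4 * (i + 1)}]" for i in range(n_args)]
    if abi == "x86_64-sysv":
        locations = SYSV_X86_64_INT_REGS[:n_args]
        # Arguments beyond the sixth spill to the stack above the return address.
        stack_args = max(0, n_args - len(SYSV_X86_64_INT_REGS))
        locations += [f"[rsp+{8 * (i + 1)}]" for i in range(stack_args)]
        return locations
    raise ValueError(f"unknown ABI: {abi}")


for abi in ("i386-cdecl", "x86_64-sysv"):
    print(abi, arg_locations(3, abi))
```

The model shows where arguments are placed, not how much time that costs. Whether the stack traffic matters in practice is exactly what the thread disputes, since inlining and store-to-load forwarding both reduce it.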

      Like it or not, there is a break-even point where performing a context switch from 32-bit code to 64-bit code has zero net performance cost, because you are already losing time to the calling convention.

      Even with Vulkan there are still a lot of calls to be performed before you get to the GPU once you leave the application's control.

      Anyone who attempts to argue that 32-bit code is not slower than 64-bit code needs to remember that what I just wrote only applies to platforms like x86, where 32-bit mode has limited registers and so ended up with a memory-based calling convention.

      It is true that 32-bit code does not have to be slower than 64-bit code. The problem here is that 32-bit x86 code is always slower than 64-bit x86 code because of an architectural difference that results in a different calling convention and fewer registers to use. 32-bit RISC-V and 64-bit RISC-V don't have a different register count or a different calling convention, so you don't see the difference.

      The context of which architecture is being talked about is critical when asking whether 32-bit is faster than 64-bit. On some architectures 32-bit is faster than 64-bit, on others like x86 64-bit is faster than 32-bit, and on others there is no difference. With Wine, the majority of applications that have to be run are for the x86 platform, so that is the context we have to talk about the most.

      Comment


      • #23
        Originally posted by oiaohm View Post
        Here you are attempting to pull my leg and play giggle bells.

        It's been benchmarked: pure x86 32-bit code is slower on average than its 64-bit equivalent.
        https://www.willus.com/ccomp_benchmark2.shtml?p18
        Shit compiled code, likely with -fPIC for 32-bit. Nobody cares about this. Next.

        Originally posted by oiaohm View Post
        Here is a Windows person from 2011. x86 64-bit code is faster than x86 32-bit code; it does not matter whether you like it or not, Weasel, those are the facts. There is a price: 64-bit code has a slightly higher RAM and storage footprint due to using 64-bit pointers, which is why X32 was made.
        Show it on Windows apps dumbass, because those have a proper 32-bit compiler. You can benchmark them under Wine of course, since it executes the same code, and that's what matters. Also, use apps that don't specifically need huge throughput (those benefit from extra regs, especially in vectorization).

        Originally posted by oiaohm View Post
        I quoted this for a reason. The difference is not just registers, but it is register related.

        Notice how the calling convention for 32-bit x86 code uses memory, whereas the calling convention on the 64-bit side of the CPU uses registers. The lack of a PIC prologue is also related to the calling convention.

        The reality is that a 32-bit x86 program always acts as if it is register starved; this is why 32-bit x86 programs have a memory-based calling convention. Using memory is slower than using registers, like it or not, so the 32-bit x86 calling convention is slower than its 64-bit one even after the optimizations applied by x86 CPU microcode.

        Like it or not, there is a break-even point where performing a context switch from 32-bit code to 64-bit code has zero net performance cost, because you are already losing time to the calling convention.

        Even with Vulkan there are still a lot of calls to be performed before you get to the GPU once you leave the application's control.
        You have no idea what you're talking about. Calling convention overhead is very minor, especially since in hot code paths the call gets inlined by the compiler in the first place, so there's no function call at all. Context switching is simply another order of magnitude or two more expensive, it's not even close. You'd need like 500+ function calls to even out one context switch, and there are 2 of them (one when calling, one when returning). Also, I mean entire function calls here, not just "the difference between 32-bit and 64-bit". 64-bit function calls, as you know, have quite an overhead as well.
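Both sides of this exchange are arguing about where the break-even point lies. A minimal sketch of that arithmetic follows, assuming a fixed cost per mode switch and a fixed saving per function call. The function name and the nanosecond figures are placeholders for illustration, not measurements from Wine or any benchmark.

```python
import math


def break_even_calls(switch_ns, saving_per_call_ns, switches=2):
    """Calls needed before the per-call saving repays the mode switches."""
    return math.ceil(switches * switch_ns / saving_per_call_ns)


# Hypothetical inputs: 1000 ns per switch, 2 ns saved per call,
# two switches per round trip (one on the call, one on the return).
print(break_even_calls(1000, 2))
print(break_even_calls(1000, 2, switches=1))
```

With these placeholder inputs a round trip needs 1000 calls to pay for itself, and a single switch needs 500. The conclusion moves entirely with the two inputs, which is why the posters disagree.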

        Nobody cares what you think anyway. Wine devs already had benchmarks for this, and it was bad enough that they had to optimize unix calls. However, they can't be optimized in wow64 mode, since they require full context switching, so not only will they be slower, that's on top of the context switch.

        This was on the mailing list a year ago or so. I can't be arsed to look for it, so believe what you want and live in your bubble.
        Last edited by Weasel; 31 October 2023, 10:46 AM.

        Comment


        • #24
          Originally posted by Weasel View Post
          Show it on Windows apps dumbass, because those have a proper 32-bit compiler.


          I did, dumbass Weasel. Maybe it would have paid you to read. Also, I would say go and look up that mailing list: it turns out the context switch to 32-bit can be optimized the same way the context switch to 16-bit is for win16. What you are referring to is from more than 12 months ago.
          Last edited by oiaohm; 31 October 2023, 11:48 PM.

          Comment


          • #25
            Originally posted by Weasel View Post
            This was on the mailing list a year ago or so. I can't be arsed to look for it, so believe what you want and live in your bubble.
            You need to go back, find that, and read it again.


            Good question.
            A special syscall dispatcher is used for PE -> Unix transitions, to avoid the
            overhead of a full NT system call. This minimizes the performance impact of the
            new architecture, in particular for the OpenGL and Vulkan libraries.​
            What is the context switch cost difference when you use the syscall dispatcher for PE->Unix transitions, jumping from emulated Windows to native Linux, for 32-bit to 32-bit, 32-bit to 64-bit and 64-bit to 64-bit? The answer is no difference. At this point you are doing a context switch no matter what.

            When the 32-bit Wine loader isn't found, 32-bit applications are started in
            the new experimental "Windows-like" WoW64 mode (where 32-bit code runs inside
            a 64-bit host process). This mode can be enabled by building with the
            '--enable-archs' configure option. This is still under development and not yet
            recommended for general use. Since in case of configuration errors it is
            possible for it to be triggered inadvertently, applications started in this
            mode print the warning "starting in experimental wow64 mode".
            Wine already has an experimental 64-bit loader that loads 32-bit Windows applications without using Linux 32-bit syscalls.

            The reality is, Weasel, you need to go back and read the debate. With Win16, all 16-bit calls are replaced by 32-bit ones. Wow64 under Wine is a selective replacement, replacing only the unix native libraries and selected Windows libraries. So you still have 32-bit PE libraries. The reality here is that as soon as you are calling out to and using a native Linux library, you have to context switch to keep the Linux/unix/macOS stuff hidden from the Windows application. Yes, copy protection and other things like getting really upset when they see Linux/unix/macOS stuff.

            All Linux native syscall usage by Wine is done in the Linux native libraries Wine uses. The reality is that you have already passed through a context switch in a Windows 32-bit application to use this stuff.

            Yes, this now comes down to how fast the code can run, because the cost of a context switch does not always matter: in particular places you may have to pay it anyhow.

            Comment


            • #26
              Originally posted by oiaohm View Post
              You need to go back, find that, and read it again.


              Good question.


              What is the context switch cost difference when you use the syscall dispatcher for PE->Unix transitions, jumping from emulated Windows to native Linux, for 32-bit to 32-bit, 32-bit to 64-bit and 64-bit to 64-bit? The answer is no difference. At this point you are doing a context switch no matter what.
              No, that's purely from 32->32 or 64->64. I know how the code works, dumbass. You don't. The code does barely any register saves and then does a direct call.

              You can't do that on wow64. And besides, if full syscall was slow already, then wow64 will be even slower since there's a context switch on top (the Wine "syscall" is not really a syscall, it's slow because it has to save up shit, not because of doing a syscall, it still does a direct call btw, without wow64).

              Comment


              • #27
                Originally posted by Weasel View Post
                No, that's purely from 32->32 or 64->64. I know how the code works, dumbass. You don't. The code does barely any register saves and then does a direct call.

                You can't do that on wow64. And besides, if full syscall was slow already, then wow64 will be even slower since there's a context switch on top (the Wine "syscall" is not really a syscall, it's slow because it has to save up shit, not because of doing a syscall, it still does a direct call btw, without wow64).
                No Weasel, you are out of date. There were a few changes in Wine 8.15.


                You are presuming that a context switch is horribly slow. For a long time even Wine developers presumed that. So you end up with a workaround for going from Windows to Linux mode and back that ends up slower than biting the bullet and doing the context switch, and it had to be removed. That's the fun of hyper-threading in CPUs for you.

                Comment


                • #28
                  Originally posted by oiaohm View Post
                  No Weasel, you are out of date. There were a few changes in Wine 8.15.


                  You are presuming that a context switch is horribly slow. For a long time even Wine developers presumed that. So you end up with a workaround for going from Windows to Linux mode and back that ends up slower than biting the bullet and doing the context switch, and it had to be removed. That's the fun of hyper-threading in CPUs for you.
                  "Out of date" and guy links 2018 article.

                  Anyway, even your article shows how insanely slow they are. 1-2µs is insanely slow. A direct call (without wow64) is less than 10 nanoseconds, probably around 2-3. That's like 1000 times faster.

                  The return is even faster since it's predicted, probably almost no cost whatsoever. However, a context switch will have overhead on the "return" as well, another 1-2µs. So now it's 2000 times faster.

                  If context switches weren't so awfully slow, Google wouldn't have invested so much in FUTEX_SWAP, which afaik is still not upstreamed, ffs. Don't get me wrong, even FUTEX_SWAP is slow, it's about 100ns, which is still 50 times slower than direct calls, but not 1000!
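The ratios quoted in this post can be checked with a few lines of arithmetic. The sketch below takes the poster's own estimates at face value (about 2 ns for a direct call, the upper end of the 1-2 µs range for a context switch, about 100 ns for FUTEX_SWAP). These are forum estimates, not benchmark results, and the names are only illustrative.

```python
def slowdown(slow_ns, fast_ns):
    """How many times slower one operation is than another."""
    return slow_ns / fast_ns


DIRECT_CALL_NS = 2          # poster's estimate: "probably around 2-3"
CONTEXT_SWITCH_NS = 2000    # upper end of the quoted 1-2 µs
FUTEX_SWAP_NS = 100         # poster's figure for FUTEX_SWAP

one_way = slowdown(CONTEXT_SWITCH_NS, DIRECT_CALL_NS)
round_trip = slowdown(2 * CONTEXT_SWITCH_NS, DIRECT_CALL_NS)
futex_swap = slowdown(FUTEX_SWAP_NS, DIRECT_CALL_NS)

print(f"one switch:     {one_way:.0f}x a direct call")
print(f"switch + return: {round_trip:.0f}x a direct call")
print(f"FUTEX_SWAP:     {futex_swap:.0f}x a direct call")
```

Using the upper-end switch cost reproduces the 1000x, 2000x and 50x figures above. At the lower end of the range (1 µs) the first two ratios halve.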
                  Last edited by Weasel; 02 November 2023, 10:58 AM.

                  Comment
