FEX-Emu 2404 Optimization Can Take Memcpy From 2-3 GB/s To 88 GB/s

    Phoronix: FEX-Emu 2404 Optimization Can Take Memcpy From 2-3 GB/s To 88 GB/s

    FEX 2404 is now available for this open-source emulator project to allow running x86/x86_64 binaries on AArch64 (ARM 64-bit) Linux systems. FEX has been one of the leading avenues for opening up gaming on AArch64 Linux hardware, even making use of Wine / Proton (Steam Play) for enjoying Windows x86 games within AArch64 Linux confines...

  • #2
    Loosely related, but can't wait to see Snapdragon X Elite being a better Linux laptop than Windows on ARM lol

    • #3
      It's really great to see this stuff progressing *before* the inevitable migration to ARM. Good to know my game library is safe!

      • #4
        I was under the assumption that a lot of multithreaded code will silently break if you don't emulate the x86 memory model. Compilers will just take your carefully added this-happens-after information, see that the target is x86, and then shrug and emit nothing extra.
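To make the concern concrete, here is a minimal sketch of the kind of "this-happens-after" annotation meant above, written as a C++ release/acquire hand-off. The names `payload`, `ready`, `producer`, `consumer` and `run_message_passing` are made up for illustration. On x86 the release store and the acquire load compile down to ordinary `mov` instructions because the hardware already provides that ordering, whereas on AArch64 the compiler has to emit ordered instructions (for example `stlr`/`ldar`) or barriers. An x86 binary therefore no longer carries the annotation, which is why an emulator targeting ARM has to restore the ordering on its own.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names, for illustration only.
int payload = 0;                  // plain, non-atomic data being handed over
std::atomic<bool> ready{false};   // flag that publishes the data

// Producer: write the data first, then publish it with a release store.
void producer() {
    payload = 42;
    ready.store(true, std::memory_order_release);
}

// Consumer: spin until the flag is observed with an acquire load,
// then read the data that was written before the release store.
int consumer() {
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;
}

// Runs one hand-off between two threads and returns what the consumer saw.
int run_message_passing() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    // The release/acquire pair guarantees the consumer sees the payload.
    assert(seen == 42);
    return seen;
}
```

Calling `run_message_passing()` from a small `main` returns 42 on both architectures when compiled natively; the difference is only in which instructions the compiler had to emit to guarantee it.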

        • #5
          Is it still the case that this is unusable on Asahi/M1?

          • #6
            Originally posted by discordian
            I was under the assumption that a lot of multithreaded code will silently break if you don't emulate the x86 memory model. Compilers will just take your carefully added this-happens-after information, see that the target is x86, and then shrug and emit nothing extra.
            The emulator in Windows on ARM adds memory barriers in the right places to emulate the strong x86 memory model when running x86 software. Of course that slows the app down, because each memory barrier means the core stops and waits until its earlier memory accesses are visible to the other cores. Apple doesn't have to do that, because that memory model is supported in their ARM CPU*. When you have the source code, you can compile it in Visual Studio in a special mode that also adds these memory barriers. But since the code will be native ARM, it will still be much faster than x86 emulation, and you, as the developer, don't have to re-test and tune your code for the different memory model on ARM. That mode also lets you load x86 DLLs into your EXE, e.g. for a software package that is not fully recompiled. An example is MS Office, where some parts come from third-party companies, some of which no longer exist, and MS doesn't have the source code. They even fixed a security issue in one such tool by hex-editing the binary, like hackers.

            *) The strong memory model option is not specific to Apple. It is also supported by Fugaku (an ARM supercomputer; people write their code on x86 PCs, where they can't test against the weaker memory model, so this way they just recompile it without reworking the multithreading logic) and by the NVidia ARM "Denver" cores (their first attempt at in-house ARM cores, which ended up being used only as big cores in tablets before development stopped). My point is that nothing prevents ARM core vendors from including that feature, e.g. if they are serious about entering the PC world (this time).

            EDIT: The NVidia ARM "Denver" cores supported the strong memory model because NVidia wanted them to also be able to act as an x86 CPU, for compatibility with x86 PCs. More: https://en.wikipedia.org/wiki/Project_Denver
            Last edited by Ladis; 15 April 2024, 11:35 AM.
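As a rough illustration of the barrier insertion described in this post, the sketch below models guest memory in C++ and routes every guest load and store through one of two paths: relaxed accesses (what a plain ARM load/store gives you) or acquire loads plus release stores, which roughly approximates the ordering x86 code takes for granted. This is a toy model under stated assumptions, not the code of FEX or of the Windows on ARM emulator; `GuestMemory`, `emulate_tso` and `run_guest_handoff` are hypothetical names.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical toy model of emulated guest memory (not real emulator code).
class GuestMemory {
public:
    GuestMemory(std::size_t words, bool emulate_tso)
        : cells_(words), emulate_tso_(emulate_tso) {
        for (auto& cell : cells_) cell.store(0, std::memory_order_relaxed);
    }

    // Translation of a guest load: acquire when the strong model is emulated.
    std::uint64_t load(std::size_t index) const {
        return cells_[index].load(emulate_tso_ ? std::memory_order_acquire
                                               : std::memory_order_relaxed);
    }

    // Translation of a guest store: release when the strong model is emulated.
    void store(std::size_t index, std::uint64_t value) {
        cells_[index].store(value, emulate_tso_ ? std::memory_order_release
                                                : std::memory_order_relaxed);
    }

private:
    std::vector<std::atomic<std::uint64_t>> cells_;
    bool emulate_tso_;
};

// A guest program that hands a value from one thread to another using only
// plain loads and stores, as x86 code is allowed to do.
std::uint64_t run_guest_handoff(bool emulate_tso) {
    GuestMemory mem(2, emulate_tso);  // cell 0: data, cell 1: flag
    std::uint64_t seen = 0;

    std::thread consumer([&mem, &seen] {
        while (mem.load(1) == 0) {
            std::this_thread::yield();
        }
        seen = mem.load(0);
    });
    std::thread producer([&mem] {
        mem.store(0, 42);  // data first
        mem.store(1, 1);   // then the flag
    });

    producer.join();
    consumer.join();
    return seen;
}
```

With `emulate_tso` set to `true`, the consumer is guaranteed to read 42. With `false`, the language no longer guarantees it, and on a weakly ordered CPU the consumer may observe the flag before the data; that is the silent breakage discussed earlier in the thread. The ordered path is also the slower one, which matches the cost described above.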

            • #7
              I wonder if this memcpy change is based on jnettlet's improved memcpy, which both fixes corruption and improves performance, yet which the upstream maintainers apparently refuse to accept?
              jnettlet/cortex_a72_memcpy: Simple patched glibc memcpy implementation to fix issues with Cortex-A72 and device memory. (github.com)

              • #8
                Originally posted by Beryesa
                Loosely related, but can't wait to see Snapdragon X Elite being a better Linux laptop than Windows on ARM lol
                Don't hold your breath while you wait. It's going to be an extremely locked down platform with no released documentation to speak of, and thus it'll be extremely hard to reverse engineer. It's being made pretty much in collaboration with Microsoft for usage with Windows. Just like how we still don't have full M1 drivers on Apple hardware.

                • #9
                  Originally posted by Noitatsidem
                  It's really great to see this stuff progressing *before* the inevitable migration to ARM. Good to know my game library is safe!
                  I'll be here waiting for everyone to inevitably come back to x86. You know, where everything works.

                  • #10
                    Originally posted by Daktyl198
                    Don't hold your breath while you wait. It's going to be an extremely locked down platform with no released documentation to speak of, and thus it'll be extremely hard to reverse engineer. It's being made pretty much in collaboration with Microsoft for usage with Windows. Just like how we still don't have full M1 drivers on Apple hardware.
                    You can keep breathing as usual: Linaro is working with Qualcomm to upstream this, so it should have a great Linux experience! They are going as far as claiming day-one support.
