FUTEX2 Linux Patches Updated To Support Variable-Sized Futexes


  • #11
    Originally posted by jacob View Post

    The problem is that there are few use cases for NTFS on Linux. Mainly it's just for dual boot machines and those are rare now. Which means merging in NTFS improvements is bound to remain a low priority.
    I am in a Discord group that focuses on running Linux on ASUS ROG laptops. I think it has about 500 users, and many of them dual boot, as do many of my personal friends. So do I: it is just easier to dual boot than to set up virtualisation, especially when you need high performance or GPU access. Unfortunately, not everything works well in Wine/Proton yet.
    As for WSL, for many users the whole point of using Linux is to avoid Windows as much as possible, and WSL is definitely the opposite...

    Comment


    • #12
      Originally posted by indepe View Post
      I checked the numbers given for the speed improvement. These were 4% on benchmarks for operations that, as far as I can tell from the limited information I could see, take up to about 850 ns (futex-wait perf test on my system). The claim seemed to be that some games execute 40,000 to 50,000 futex operations per second, and I don't know better than to assume it is these ~850 ns operations. That would be an improvement of 1.7 ms per second, which for a 200 fps game wouldn't even be 1 fps.
      This was never about raw performance. The primary motivation is to have an equivalent of WaitForMultipleObjects in the Linux kernel. This is difficult to emulate efficiently without kernel support.

      Comment


      • #13
        Originally posted by MadCatX View Post
        This was never about raw performance. The primary motivation is to have an equivalent of WaitForMultipleObjects in the Linux kernel. This is difficult to emulate efficiently without kernel support.
        Are you sure that's not a myth?

        Comment


        • #14
          Originally posted by oiaohm View Post
          For the various client-side anti-cheat systems that Windows programs have, it's more than enough to be the difference between allowing the player and banning the player.
          Ah, now it is the anti cheat system. Our last discussion was too long to start another one.

          Originally posted by oiaohm View Post
          Unfortunately, it is not true that this can be regarded as deprecated. WaitForMultipleObjects from the Windows API is not deprecated and is commonly used in games. The WaitForMultipleObjects function only comes into existence with Windows XP/2003. There is an older signal system, deprecated by the WaitForMultipleObjects function, that was also used to wait for multiple objects.

          How to solve the wait on multiple objects is its own problem.
          I said "partially". You can't only wait; you also have to call SetEvent or PulseEvent. PulseEvent is deprecated (yet probably still used in existing software) because, as some say, it is "fundamentally flawed".

          Comment


          • #15
            Originally posted by indepe View Post
            Although kernel engineers seemed to acknowledge that improvements can be made in general and in theory, in comments to previous patch versions I didn't have the impression that they are anywhere close to accepting this series of patches, on the contrary.

            So I think 5.14 is completely unrealistic, the question would be more if ever.

            I checked the numbers given for the speed improvement. These were 4% on benchmarks for operations that, as far as I can tell from the limited information I could see, take up to about 850 ns (futex-wait perf test on my system). The claim seemed to be that some games execute 40,000 to 50,000 futex operations per second, and I don't know better than to assume it is these ~850 ns operations. That would be an improvement of 1.7 ms per second, which for a 200 fps game wouldn't even be 1 fps.

            Of course this is somewhat speculating about the meaning of these numbers, but that is my point: no clear argument has been made.

            Keep in mind that this would be the improvement compared to the existing futex implementation. The existing futex implementation could still be used to make WINE *much* faster than WINE's current implementation using fsevents, esync or whatever.

            Also, if performance is the goal and if it would actually matter for these operations, I believe there are much better ways to get optimal performance if creating a new "system" would really be on the table. These would not have to create more burden on the kernel, but could remove burden from the kernel.

            Last but not least, this specific wait-multiple futex concept seems to be designed to support a (Windows) API that is broken, partially deprecated, otherwise anything but optimal or great, and lacking functionality that it should have.
            The 4% increase in performance is just on a very simple benchmark that compares futex() vs futex2() just to see that the new syscall would not decrease performance if it replaced futex(). It's not a benchmark for the use case of WaitForMultipleObjects().

            Comment


            • #16
              Originally posted by indepe View Post

              Are you sure that's not a myth?
              How do you do this efficiently on Linux?
              Code:
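              #include <windows.h>
              #include <stdio.h>
              #include <stdlib.h>

              #define N_EVENTS 4UL

              HANDLE events[N_EVENTS];

              /* Create auto-reset events that start out non-signaled */
              for (size_t idx = 0; idx < N_EVENTS; idx++)
                  events[idx] = CreateEvent(NULL, FALSE, FALSE, NULL);

              /* ... */

              /* Main loop */
              while (TRUE) {
                  /* Return when any one event is signaled, or after 1000 ms */
                  DWORD wait = WaitForMultipleObjects(N_EVENTS, events, FALSE, 1000);
                  if (WAIT_OBJECT_0 <= wait && wait < WAIT_OBJECT_0 + N_EVENTS) {
                      /* Index of the signaled event within events[] */
                      handle_event(wait - WAIT_OBJECT_0);
                  } else if (wait == WAIT_TIMEOUT) {
                      /* Timeout */
                  } else {
                      fprintf(stderr, "Synchronization error\n");
                      abort();
                  }
              }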

              Comment


              • #17
                I'm still waiting for FUTEX_SWAP, ffs! What's the hold up?

                Comment


                • #18
                  Originally posted by MadCatX View Post

                  How do you do this efficiently on Linux?

                  [...]
                  In both cases, you'd want a user-space library that implements the so-called fast path in user space and the slow path with the help of kernel calls.
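
To illustrate the fast path / slow path split, here is a minimal sketch of an auto-reset event built on the existing futex() syscall. The type `fast_event` and the three functions are hypothetical names made up for this example, not part of any library or of the futex2 patches. It assumes Linux, and that `_Atomic uint32_t` has the same layout as a plain 32-bit word.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical auto-reset event: 0 = not signaled, 1 = signaled. */
typedef struct {
    _Atomic uint32_t state;
} fast_event;

/* Fast path: consume a pending signal without entering the kernel.
 * Returns nonzero if the signal was consumed. */
static int fast_event_try_wait(fast_event *ev)
{
    uint32_t expected = 1;
    return atomic_compare_exchange_strong(&ev->state, &expected, 0);
}

/* Signal the event and wake at most one sleeper.
 * Simplification: this always makes the wake syscall. A real library
 * would track whether anyone is sleeping and skip it otherwise. */
static void fast_event_set(fast_event *ev)
{
    atomic_store(&ev->state, 1);
    syscall(SYS_futex, &ev->state, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
}

/* Wait for the event. Returns 0 once the signal is consumed, -1 on timeout.
 * timeout is relative and may be NULL to wait forever. Simplification:
 * the timeout restarts after every spurious wakeup. */
static int fast_event_wait(fast_event *ev, const struct timespec *timeout)
{
    for (;;) {
        if (fast_event_try_wait(ev))
            return 0;
        /* Slow path: sleep in the kernel while state is still 0. */
        long rc = syscall(SYS_futex, &ev->state, FUTEX_WAIT_PRIVATE, 0,
                          timeout, NULL, 0);
        if (rc == -1 && errno == ETIMEDOUT)
            return -1;
        /* EAGAIN, EINTR or a wakeup: retry the fast path. */
    }
}
```

When the event is already signaled, `fast_event_wait` returns from the compare-and-swap alone and makes no syscall. This covers a single event only; waiting on several of them at once is the part that is hard to build on top of it.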

                  So the coding efficiency (the complexity of application code) depends on the library's API, which can be the same in both cases.

                  Note that the execution performance of a loop using the above API is likely much lower than with an API that allows you to create a permanent subscription where the event and the waiter are connected across iterations, such that wait lists don't have to be re-built each iteration. However perhaps you don't care about performance.
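
One way to picture a "permanent subscription" with facilities that already exist in Linux is eventfd plus epoll: each event is registered once, and every later wait reuses that registration. This is only a sketch under that assumption, not necessarily the design meant above. The `event_*` helper names are hypothetical, and a single waiting thread is assumed.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical helper: create one event object backed by an eventfd. */
static int event_create(void)
{
    return eventfd(0, EFD_CLOEXEC | EFD_NONBLOCK);
}

/* Hypothetical helper: signal the event by bumping its counter. */
static int event_set(int fd)
{
    return eventfd_write(fd, 1);
}

/* Register the event once. The subscription stays in the epoll
 * instance across waits, so no wait list is rebuilt per iteration. */
static int event_subscribe(int epfd, int fd, uint32_t idx)
{
    struct epoll_event ev = { .events = EPOLLIN, .data.u32 = idx };
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait for any subscribed event. fds[] maps index to eventfd.
 * Returns the index of a signaled event, -1 on timeout, -2 on error
 * (including interruption by a signal). Reading the eventfd resets
 * its counter, which gives auto-reset behaviour. */
static int event_wait_any(int epfd, const int *fds, int timeout_ms)
{
    struct epoll_event ev;
    int n = epoll_wait(epfd, &ev, 1, timeout_ms);
    if (n == 0)
        return -1;
    if (n < 0)
        return -2;

    eventfd_t value;
    if (eventfd_read(fds[ev.data.u32], &value) != 0)
        return -2;
    return (int)ev.data.u32;
}
```

A main loop would call `epoll_create1(EPOLL_CLOEXEC)` once, subscribe each event once, and then call `event_wait_any` repeatedly. Every wait and every signal here still goes through the kernel, so this shows the subscription idea only and has no user-space fast path.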
                  Last edited by indepe; 04 June 2021, 08:30 AM.

                  Comment


                  • #19
                    Originally posted by Weasel View Post
                    I'm still waiting for FUTEX_SWAP, ffs! What's the hold up?
                    I also like the idea behind FUTEX_SWAP, however it doesn't seem to be part of these patches.

                    Comment


                    • #20
                      Originally posted by F.Ultra View Post

                      The 4% increase in performance is just on a very simple benchmark that compares futex() vs futex2() just to see that the new syscall would not decrease performance if it replaced futex(). It's not a benchmark for the use case of WaitForMultipleObjects().
                      It's the only number that I saw, however I am not aware of everything that's going on there. I could easily imagine it may be typical for that use case as well, though.
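
For reference, the back-of-the-envelope estimate from earlier in the thread can be reproduced with the small sketch below. The function names are made up for this example. The inputs (850 ns per operation, a 4% speedup, 50,000 operations per second, 200 fps) are the figures assumed in that post, not measurements of any game.

```c
/* Time saved per second of wall-clock time, in milliseconds,
 * if every operation of op_ns nanoseconds gets faster by `speedup`. */
static double saved_ms_per_second(double op_ns, double speedup,
                                  double ops_per_sec)
{
    return op_ns * speedup * ops_per_sec / 1e6;
}

/* Extra frames per second if all of the saved time went into frames. */
static double fps_gain(double fps, double saved_ms)
{
    double remaining_fraction = 1.0 - saved_ms / 1000.0;
    return fps / remaining_fraction - fps;
}
```

With those inputs, `saved_ms_per_second(850.0, 0.04, 50000.0)` gives about 1.7 ms, and `fps_gain(200.0, 1.7)` gives roughly a third of a frame per second.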

                      Comment
