FUTEX2 Linux Patches Updated To Support Variable-Sized Futexes


  • #31
    Originally posted by MadCatX
    Wait a minute. Are you describing a scenario where you have multiple *threads* waiting on *one* event? This is not what WaitForMultipleObjects is intended for. WaitForMultipleObjects allows one *thread* to wait for multiple *events* simultaneously and be woken up when any of the events it waits for is signaled.

    (Additionally, you can use a wait condition to implement the "many wait for one" scenario, but that's beside the point).
    I was talking about one thread waiting for 3 events as an example. Perhaps you can't believe this is so simple and you need to read it more carefully.

    Let's say you have 3 threads and 5 events, and each thread waits for any of the 5 events. Then you have 15 wait entries with 15 pointers to 3 semaphores (one semaphore per thread).
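
To make the counting concrete, here is a minimal sketch of that layout. It is written in Python for brevity, and the names (`Event`, `set_event`, `wait_for_any`, `demo`) are made up for illustration; they are not taken from Wine, the Windows API, or futex2. Each thread owns one semaphore, and waiting on N events adds N wait entries that all point at that semaphore. The single global lock is a simplifying assumption.

```python
import threading
import time

_lock = threading.Lock()  # assumption: one global lock guards all events


class Event:
    """Manual-reset event holding the wait entries of the threads blocked on it."""

    def __init__(self):
        self.signaled = False
        self.wait_entries = []  # each entry points at one waiting thread's semaphore


def set_event(event):
    """Signal the event and post the semaphore of every thread waiting on it."""
    with _lock:
        event.signaled = True
        for sem in event.wait_entries:
            sem.release()


def wait_for_any(events, sem):
    """Block until any of `events` is signaled and return its index.

    `sem` is the calling thread's own semaphore (one per thread).
    """
    with _lock:
        for ev in events:
            ev.wait_entries.append(sem)  # one wait entry per (thread, event) pair
    try:
        while True:
            with _lock:
                for i, ev in enumerate(events):
                    if ev.signaled:
                        return i
            sem.acquire()  # sleep until some event posts our semaphore
    finally:
        with _lock:
            for ev in events:
                ev.wait_entries.remove(sem)


def demo():
    """3 threads each wait for any of 5 events; returns (entry count, results)."""
    events = [Event() for _ in range(5)]
    results = []

    def worker():
        sem = threading.Semaphore(0)  # one semaphore per thread
        results.append(wait_for_any(events, sem))

    threads = [threading.Thread(target=worker) for _ in range(3)]
    for t in threads:
        t.start()
    # Poll (demo only) until every thread has registered its wait entries.
    while True:
        with _lock:
            total = sum(len(ev.wait_entries) for ev in events)
        if total == 15:
            break
        time.sleep(0.001)
    set_event(events[2])  # wakes all three threads
    for t in threads:
        t.join()
    return total, results
```

With 3 threads and 5 events, `demo()` sees 15 wait entries referring to 3 semaphores, and all three threads wake up when one event is signaled. Note that this version adds and removes its wait entries on every call.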



    • #32
      Kernel RCU is by far the most beautiful algorithm. I still wish there were an RCU that could be shared between kernel and user space.



      • #33
        Originally posted by cyring
        Kernel RCU is by far the most beautiful algorithm. I still wish there were an RCU that could be shared between kernel and user space.
        Right. I think the difficulty is that a user thread that never enters a quiescent state would be a problem for the kernel.



        • #34
          Originally posted by indepe

          I was talking about one thread waiting for 3 events as an example. Perhaps you can't believe this is so simple and you need to read it more carefully.

          Let's say you have 3 threads and 5 events, and each thread waits for any of the 5 events. Then you have 15 wait entries with 15 pointers to 3 semaphores (one semaphore per thread).
          It sounds like you are describing something like this (https://web.archive.org/web/20080704011208/http://developers.sun.com/solaris/articles/waitfor_api.pdf). While this works, and I stand corrected on my claim that you'd have to do some polling, their implementation has internal state in each call of WaitForMultipleObjects, which is not great for extremely latency-sensitive code.



          • #35
            Originally posted by MadCatX
            It sounds like you are describing something like this (https://web.archive.org/web/20080704011208/http://developers.sun.com/solaris/articles/waitfor_api.pdf). While this works, and I stand corrected on my claim that you'd have to do some polling, their implementation has internal state in each call of WaitForMultipleObjects, which is not great for extremely latency-sensitive code.
            I have not seen such code before. They are using a mutex/cond_var pair for each handle, and I am not sure what the implications of that are. Although there may be similarities, I do not have the impression that the two solutions are similar enough to draw conclusions from one about the other.

            As I said, permanent subscriptions that do not modify the wait lists on each loop iteration have an advantage. However, this advantage cannot be offered through the Windows Event API, and I think it is not offered by the futex2 API either. Otherwise I don't know what exactly you could mean by an "internal state in each call of WaitForMultipleObjects" that might or might not affect latency compared to any other solution.
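
To illustrate what a permanent subscription could look like, here is a rough sketch. It is Python for brevity, and `Event` and `Subscription` are hypothetical names; this is an illustration of the idea, not how Wine, the Windows API, or futex2 work. The wait lists are modified once when subscribing and once when closing. The wait loop itself only checks flags and sleeps on the thread's semaphore.

```python
import threading


class Event:
    """Manual-reset event with a list of subscribed semaphores."""

    def __init__(self):
        self.lock = threading.Lock()
        self.signaled = False
        self.subscribers = []

    def set(self):
        with self.lock:
            self.signaled = True
            for sem in self.subscribers:
                sem.release()  # wake every subscribed thread

    def reset(self):
        with self.lock:
            self.signaled = False


class Subscription:
    """Registers one thread's semaphore with a fixed set of events, once."""

    def __init__(self, events):
        self.events = list(events)
        self.sem = threading.Semaphore(0)  # the owning thread's semaphore
        for ev in self.events:
            with ev.lock:
                ev.subscribers.append(self.sem)

    def wait_any(self):
        """Return the index of a signaled event without touching any wait list."""
        while True:
            for i, ev in enumerate(self.events):
                with ev.lock:
                    if ev.signaled:
                        return i
            # Nothing signaled (or only a stale wakeup): sleep until posted.
            self.sem.acquire()

    def close(self):
        """Unsubscribe from all events; the only other wait-list modification."""
        for ev in self.events:
            with ev.lock:
                ev.subscribers.remove(self.sem)
```

A consumer would create the `Subscription` once, call `wait_any()` in its loop, and call `close()` when done. Leftover semaphore posts from earlier signals only cost one extra pass through the loop.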



            • #36
              Originally posted by indepe

              It's the only number that I saw, however I am not aware of everything that's going on there. I could easily imagine it may be typical for that use case as well, though.
              Yeah, it's the only number they have released so far for this version of the patchset. It would be far more interesting if they released the gains they see in the actual WINE implementation versus the older fsync/esync solutions that use futexes/eventfd. It should be far more than a single FPS, considering the amount of work they are putting in here.

              For esync especially, I know they have the problem of exhausting the file descriptor space, since apparently many Windows applications/games are written in a very, very bad way (which unfortunately WINE cannot fix), so that is most likely a dead end anyway.



              • #37
                Originally posted by F.Ultra
                Yeah, it's the only number they have released so far for this version of the patchset. It would be far more interesting if they released the gains they see in the actual WINE implementation versus the older fsync/esync solutions that use futexes/eventfd. It should be far more than a single FPS, considering the amount of work they are putting in here.

                For esync especially, I know they have the problem of exhausting the file descriptor space, since apparently many Windows applications/games are written in a very, very bad way (which unfortunately WINE cannot fix), so that is most likely a dead end anyway.
                Yes, esync is (also) bound to be slower than the use of futexes (if well done, be it existing futexes or patched ones).

                As far as I understood, fsync is using patched futexes of an older variety, not the existing unpatched kernel futexes. Even if they have a version using existing futexes, I wouldn't be certain that their use of them is as optimized as it can be.



                • #38
                  Originally posted by indepe

                  Yes, esync is (also) bound to be slower than the use of futexes (if well done, be it existing futexes or patched ones).

                  As far as I understood, fsync is using patched futexes of an older variety, not the existing unpatched kernel futexes. Even if they have a version using existing futexes, I wouldn't be certain that their use of them is as optimized as it can be.
                  Yes, that is correct: fsync requires a custom kernel, which is why the feature is disabled by default in Proton. As for using the existing futex(), we only have their claim that it was not working great so far; AFAIK no code for that has been released (but then I have not been looking that hard for it either).



                  • #39
                    Originally posted by indepe
                    "internal state in each call of WaitForMultipleObjects" that might or might not affect latency compared to any other solution.
                    Their implementation builds and destroys the list of handles being waited on during each call to WaitForMultipleObjects. That is not very efficient, since each malloc could possibly be a syscall. Another point to note is that they were apparently trying to replicate WinAPI, whereas futex2 is probably aiming to integrate well with the already existing Linux APIs. However, you did manage to convince me that there are ways to replicate WaitForMultipleObjects that are not horrible. Good to know...



                    • #40
                      Originally posted by MadCatX
                      However, you did manage to convince me that there are ways to replicate WaitForMultipleObjects that are not horrible. Good to know...
                      Welcome!

