Windows NT Sync Driver Proposed For The Linux Kernel - Better Wine Performance

  • #81
    Originally posted by indepe
    PulseEvent is "racy" for maybe two reasons:

    a) The API suggests being usable in cases where it isn't actually usable, and that results in "racy" applications which don't see that coming. I don't know if there are valid use cases. I could describe this more specifically if you want.

    b) I don't know if the esync and fsync implementations are "racy", but either way it doesn't mean that user space implementations are inherently racy. They are not. Meaning, not in addition to the API itself being already racy as per (a).
    I'm not talking about the flaws on Windows with it, although there are some. I'm talking about Wine's "userspace" implementations of it (well, wineserver is still userspace, but it requires IPC, that's why I put it in quotes).

    The point is that PulseEvent does its entire operation atomically, and you simply cannot do this with esync or fsync. Even if the event is reset immediately, there's still a race, because the other threads wake up in the meantime.



    • #82
      Originally posted by indepe

      First of all: I wouldn't be stubborn about it. When talking to stubborn people, there could be a separate process for that, if that's a way around the stubbornness, and if that's what one wants to do.
      However I didn't say (in this context) that the wine-server should use shared memory. I just pointed out that it isn't user space being slow, it is other things.

      Regarding write access to shared memory, that's a good point. However, there should be ways to deal with this. I say "should" because I haven't seen this implemented across processes. The first approach would be that, for any Windows app, only the events it actually sets would get a writable proxy; everything else would be read-only in shared memory. So this is a point where it gets technically interesting, but the presentation (I need to see it again though) doesn't seem to offer any specifics about what has been tried and why it didn't work out. I would want to know, also because others like myself might, down the road, try to implement something like that for other purposes. If there are missing kernel features needed to implement cross-process synchronization well, then maybe kernel engineers should get a chance to implement those features for general use, outside of Wine. As far as I can tell, the patch request doesn't have that kind of information either, but I may have missed it.

      That's the point where real thread synchronization experts should get to see where exactly it didn't work out, so that they can weigh in (maybe Paul McKenney or Dmitry Vyukov, I don't know).
      Right. I'm sure there's a way to do it, and I'm personally fond of such things if it leads to improvements, rather than trying to band-aid it like what ntsync/esync/fsync do. But a lot of people disagree including Wine maintainers I suppose, since they don't want to make it "too complex" or whatever. It's the reason it's still single-threaded after all this time... sigh.

      It would also massively improve other parts that are slow due to wineserver, but it's not like they give a shit about micro-optimizations. I do.



      • #83
        Originally posted by oiaohm
        You should not have mentioned my name, because you made me see this post and made me see that you got things wrong again.

        Wineserver can in fact deadlock for many reasons, and you are right that it's never by itself. In fact, your common deadlocks are multi-process deadlocks, the very problem that wineserver runs into. Wineserver being single-threaded also means you don't have a watchdog thread to detect stalls/deadlocks.

        Wineserver is lacking this fancy stuff for when things have gone wrong. So what should be a minor issue caused by some slightly badly designed software turns into a total hang at times. Valve has noted this many times.

        Yes, you are right that the only way it can hang is a bug, but you are not seeing that these bugs produce hangs that never self-recover, because there is no deadlock/stall detection in wineserver's emulation of Windows locking.

        Windows does in fact have deadlock detection between multiple threads/processes. The deadlock detection of Windows would in theory detect when wineserver has let parts using locks it controls get into a deadlocked state.

        There is a missing Wine feature here.

        Yes, keeping wineserver simple means that the cases of "oops, we are deadlocked" that need a watchdog to step in are reduced, at the cost of lower performance.
        That deadlock has nothing to do with wineserver, and it's only a deadlock for a given process on Windows. Not even remotely an issue, just bad programming.

        Single-threaded means it uses a system similar to APCs/async handoffs. That is, when "waiting" for something (blocking I/O), it just goes on to do the other queued tasks and is signalled when the I/O has completed. It doesn't use an actual wait, so it cannot deadlock. It only enters an alertable sleep state when there are no tasks to perform.

        That can't deadlock because lack of tasks means it's simply waiting for a new task, not a cyclic wait.
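
The hand-off described above can be sketched with Python's asyncio. This is only a stand-in chosen for illustration, not how wineserver is actually written, and handle_request, serve and run_demo are made-up names. One thread serves two requests, and the request whose I/O is slow does not hold up the other one.

```python
import asyncio


async def handle_request(name, io_delay, log):
    # "await" hands control back to the loop instead of blocking the thread,
    # so other queued requests can run while this one is waiting.
    await asyncio.sleep(io_delay)  # stands in for asynchronous I/O
    log.append(name)               # runs once the "I/O" has completed


async def serve(log):
    # Both requests are queued on the same single-threaded loop.
    await asyncio.gather(
        handle_request("slow", 0.05, log),
        handle_request("fast", 0.0, log),
    )


def run_demo():
    log = []
    asyncio.run(serve(log))
    return log  # order in which the requests finished
```

The sketch only shows that the single thread never sits in a blocking wait on one request; it says nothing about deadlocks between the client processes themselves.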



        • #84
          Originally posted by mrg666
          What I see is that both Wine and Linux developers would ideally prefer keeping all Windows garbage in user space. But Windows apps assume they are talking to the NT kernel, while wineserver is relaying that to the Linux kernel. It is hard to be an impostor and try to look like the real thing, just like Weasel is doing here. I am sure Linux kernel developers will make the right decision. And Wine developers will find the solution.
          What a pile of bullshit. You clearly didn't even read the patch by Figura. Keep clowning yourself.

          All locks are kernel space in every OS unless it's a spinning lock. "Userspace" locks in Wine do NOT mean they don't enter the kernel; it means they don't go through wineserver. They're not actually userspace. For example, esync uses eventfds, fsync uses futex2, those are kernel syscalls you dummy. Of course, IPC to the wineserver is also a kernel syscall.

          Why do they do it? Performance. It's fucking noted in the patch preview even with measured NUMBERS. Not surprised you didn't read it, you probably can't even understand it.

          It's not about being an impostor or not. You simply cannot fucking do some things with the esync/fsync APIs, they're not designed like the Windows APIs. And because they're kernel space (as in Linux kernel, not Windows), you can't work around their atomicity. You can do that with wineserver, where you explicitly wake/signal threads yourself, but that's SLOW.

          wineserver is 100% correct, but slow.
          esync is very incorrect, but fast.
          fsync is mostly correct, and the fastest.
          ntsync would be 100% correct, and very very fast (though not fastest).

          "Correct" means with respect to the implementation of the API that Windows uses. For instance PulseEvent needs to signal all waiting threads and then reset the event back to non-signalled in one go. Before any thread can do anything about it.

          You can't fucking do that without controlling the thread scheduling yourself aka the LINUX KERNEL or wineserver (which is slow as hell).
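
To make the gap concrete, here is a rough single-threaded Python model of a pulse built from two separate steps (set, then reset). It is only an illustration under loud assumptions: NaiveEvent, pulse_in_two_steps and late_arrival_gets_through are hypothetical names, nothing here is esync or fsync code, and the callback merely stands in for whatever other threads manage to do between the two steps.

```python
class NaiveEvent:
    """Toy manual-reset event with no atomic pulse operation."""

    def __init__(self):
        self.signaled = False

    def set(self):
        self.signaled = True

    def reset(self):
        self.signaled = False

    def try_wait(self):
        # Non-blocking check, standing in for a thread that arrives
        # and tests the event state.
        return self.signaled


def pulse_in_two_steps(event, between_steps):
    event.set()
    between_steps()  # models other threads running inside the gap
    event.reset()


def late_arrival_gets_through():
    event = NaiveEvent()
    observed = []
    # A thread that was NOT waiting when the pulse started checks the
    # event in the middle of the pulse and finds it signaled.
    pulse_in_two_steps(event, lambda: observed.append(event.try_wait()))
    return observed[0], event.try_wait()
```

In this model the late arrival sees the event as signaled during the pulse, while a check after the pulse sees it unsignaled again. An atomic pulse, as described above, leaves no such window.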



          • #85
            Originally posted by indepe

            I don't know what the futex API looked like in 1.x, however I would expect that would prefer it over NTsync.


            Futexes then appeared for the first time in version 2.5.7

            So... they didn't exist. Also, Windows has WaitOnAddress... which is a futex. The issue is that lots and lots of software relies on the older sync mechanisms.



            • #86
              Originally posted by Weasel
              What a pile of bullshit. You clearly didn't even read the patch by Figura. Keep clowning yourself.

              All locks are kernel space in every OS unless it's a spinning lock. "Userspace" locks in Wine do NOT mean they don't enter the kernel; it means they don't go through wineserver. They're not actually userspace. For example, esync uses eventfds, fsync uses futex2, those are kernel syscalls you dummy. Of course, IPC to the wineserver is also a kernel syscall.

              Why do they do it? Performance. It's fucking noted in the patch preview even with measured NUMBERS. Not surprised you didn't read it, you probably can't even understand it.

              It's not about being an impostor or not. You simply cannot fucking do some things with the esync/fsync APIs, they're not designed like the Windows APIs. And because they're kernel space (as in Linux kernel, not Windows), you can't work around their atomicity. You can do that with wineserver, where you explicitly wake/signal threads yourself, but that's SLOW.

              wineserver is 100% correct, but slow.
              esync is very incorrect, but fast.
              fsync is mostly correct, and the fastest.
              ntsync would be 100% correct, and very very fast (though not fastest).

              "Correct" means with respect to the implementation of the API that Windows uses. For instance PulseEvent needs to signal all waiting threads and then reset the event back to non-signalled in one go. Before any thread can do anything about it.

              You can't fucking do that without controlling the thread scheduling yourself aka the LINUX KERNEL or wineserver (which is slow as hell).
              They will do whatever they decide to do. I read somewhere, it could be called winesync. Why don't you join them and help with implementation, you sound so passionate about it? Proving your knowledge to me is not really worth getting too excited about. I don't like your language and rage, and I am tired of you.



              • #87
                Originally posted by Weasel
                I'm not talking about the flaws on Windows with it, although there are some. I'm talking about Wine's "userspace" implementations of it (well, wineserver is still userspace, but it requires IPC, that's why I put it in quotes).

                The point is that PulseEvent does its entire operation atomically, and you simply cannot do this with esync or fsync. Even if the event is reset immediately, there's still a race, because the other threads wake up in the meantime.
                OK, so you are talking about what I labeled (b). It's been a while since I looked at the details of esync, but from what I remember, esync works together with a shared memory representation of the event. My impression was that if the Wine implementation was racy, it could be fixed in principle (I'm not sure whether other parts of the implementation would get in the way) by having a lock around the event which is then taken for almost the whole duration of PulseEvent.

                If you think this can't be done, why not?

                EDIT: It seems you are talking about "immediately resets" in the sense that PulseEvent first sets the event state to SET and then quickly to UNSET. However, back then I argued it should just reset it to UNSET in case it is SET, but not set it, not even for an "infinitesimally" short time. It should just release any waiters (one or all). It seems the new patch request (now) follows this principle. I'm not sure right now whether this is a change from the older implementation.
                Last edited by indepe; 25 January 2024, 03:19 PM.
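
The idea in the EDIT (a lock around the event, and a pulse that releases the current waiters without ever putting the event into the SET state) could look roughly like the in-process Python sketch below. It is a sketch under stated assumptions, not the code from the patch request or from esync: PulseOnlyEvent, its generation counter and demo are hypothetical, and a threading.Condition stands in for the cross-process lock being discussed.

```python
import threading
import time


class PulseOnlyEvent:
    """Manual-reset event whose pulse() never makes the state SET."""

    def __init__(self):
        self._cond = threading.Condition()  # the lock "around the event"
        self._signaled = False              # persistent state, untouched by wakeups
        self._generation = 0                # bumped once per pulse
        self._waiters = 0                   # threads currently blocked in wait()

    def wait(self, timeout=None):
        with self._cond:
            if self._signaled:
                return True
            seen = self._generation
            self._waiters += 1
            try:
                # Released only by a pulse that happens after we started waiting.
                return self._cond.wait_for(
                    lambda: self._signaled or self._generation != seen, timeout)
            finally:
                self._waiters -= 1

    def pulse(self):
        with self._cond:
            self._signaled = False   # reset if it was SET, but never set it
            self._generation += 1
            self._cond.notify_all()  # release every thread waiting right now

    def waiter_count(self):
        with self._cond:
            return self._waiters


def demo(n=3):
    ev = PulseOnlyEvent()
    results = []
    threads = [threading.Thread(target=lambda: results.append(ev.wait(timeout=5)))
               for _ in range(n)]
    for t in threads:
        t.start()
    while ev.waiter_count() < n:     # wait until all n threads are blocked
        time.sleep(0.001)
    ev.pulse()
    for t in threads:
        t.join()
    late = ev.wait(timeout=0.05)     # arrives after the pulse: must time out
    return results, late
```

Because the pulse only bumps a counter under the lock, a thread that starts waiting after the pulse has nothing to observe and simply blocks. The open question from the thread, what happens across processes when the lock lives in writable shared memory, is not addressed by this sketch.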



                • #88
                  Originally posted by Weasel
                  Right. I'm sure there's a way to do it, and I'm personally fond of such things if it leads to improvements, rather than trying to band-aid it like what ntsync/esync/fsync do. But a lot of people disagree including Wine maintainers I suppose, since they don't want to make it "too complex" or whatever. It's the reason it's still single-threaded after all this time... sigh.

                  It would also massively improve other parts that are slow due to wineserver, but it's not like they give a shit about micro-optimizations. I do.
                  Yes, I might even try to work on a general Linux solution of such functionality, once it gets higher on my personal priority list. It seems we agree that such a solution should be possible, and that it could be used by Wine, or a variation of it (with or without new kernel features).



                  • #89
                    Originally posted by mrg666
                    They will do whatever they decide to do. I read somewhere, it could be called winesync. Why don't you join them and help with implementation, you sound so passionate about it? Proving your knowledge to me is not really worth getting too excited about. I don't like your language and rage, and I am tired of you.
                    Pay me and I will? Enough to quit my day dev job. You think I work for free or I have unlimited time?

                    Helping as a hobby or with ideas is completely different from spending hours upon hours to help implement it properly. Just to be clear, I am 100% sure Figura is paid by CodeWeavers to work on this thing, so yeah.



                    • #90
                      Originally posted by indepe

                      OK, so you are talking about what I labeled (b). It's been a while since I looked at the details of esync, but from what I remember, esync works together with a shared memory representation of the event. My impression was that if the Wine implementation was racy, it could be fixed in principle (I'm not sure whether other parts of the implementation would get in the way) by having a lock around the event which is then taken for almost the whole duration of PulseEvent.

                      If you think this can't be done, why not?

                      EDIT: It seems you are talking about "immediately resets" in the sense that PulseEvent first sets the event state to SET and then quickly to UNSET. However, back then I argued it should just reset it to UNSET in case it is SET, but not set it, not even for an "infinitesimally" short time. It should just release any waiters (one or all). It seems the new patch request (now) follows this principle. I'm not sure right now whether this is a change from the older implementation.
                      Ok, holding a lock around the eventfd syscall sounds like it would be correct in behavior.

                      The only downside is, like I said, that requires writeable shared memory (to set the lock), and that can easily take down everything with just one misbehaving process.

                      Imagine the process crashes in the little time it holds the small lock (the one around the event, not the event itself). Now the lock is permanently held by something that doesn't exist anymore (crashed). What now?
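
One known general answer to a dead lock holder is owner tracking: POSIX robust mutexes (PTHREAD_MUTEX_ROBUST) hand the lock to the next locker with EOWNERDEAD when the previous owner has died, and on Linux the kernel does that detection. The toy Python model below only illustrates the idea. OwnerTrackedLock and pid_alive are hypothetical names, the lock word is a plain attribute rather than shared memory, nothing in it is atomic across processes, and PID reuse is ignored.

```python
import os


def pid_alive(pid):
    """POSIX-only probe: signal 0 checks for existence without sending a signal."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


class OwnerTrackedLock:
    """Toy lock word that remembers which PID holds it (0 means unlocked)."""

    def __init__(self, is_alive=pid_alive):
        self.owner = 0
        self._is_alive = is_alive

    def try_acquire(self, pid):
        if self.owner == 0:
            self.owner = pid
            return "acquired"
        if self.owner != pid and not self._is_alive(self.owner):
            # The previous holder died while holding the lock: take it over.
            # The caller must assume the protected state may be inconsistent.
            self.owner = pid
            return "recovered"
        return "busy"

    def release(self, pid):
        if self.owner != pid:
            raise RuntimeError("lock is not held by this pid")
        self.owner = 0
```

Taking the lock over is the easy part. Whoever recovers it still has to repair whatever the crashed process left half-updated, and the model does nothing about a misbehaving process scribbling over writable shared memory.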

