FUTEX2 Linux Patches Updated To Support Variable-Sized Futexes
Originally posted by F.Ultra View Post
Well, that is required due to the horrible nature of the WaitForMultipleObjects API and needs to be done regardless of how you solve it (for usage in WINE, that is). As we have seen from the esync problems, lots of Windows apps/games are written in a way that, if you don't handle it this way, you end up with exhausted resources.
However, that wasn't done in that implementation, and I'm not sure you can easily recycle condition variables or whatever they use.
I think with esync there may be a global upper limit on how many you can use at the same time. For wait list entries, there is no upper limit, at least if you implement them outside the kernel in user space. I think at least one version of the patched futexes used (or still uses) an upper limit of 64 or 65 per thread, since it wanted to use a fixed allocation inside the kernel and Windows has that limit. One of my use cases is about 100 events waited for by a single thread, so I wouldn't even be able to use the patched futexes the way they are meant to be used.
Last edited by indepe; 04 June 2021, 04:55 PM.
Originally posted by indepe View Post
If the solution allows it, you can reduce the cost by recycling the wait list entries (thus avoiding the potential syscall and even the library call). Especially since the number of entries used is usually rather small. For permanent/long-term subscriptions, you don't even need to do that. (In my use cases, all are long term).
For anyone interested, Collabora did a presentation of futex2 and the old solution here
Originally posted by indepe View Post
I also like the idea behind FUTEX_SWAP; however, it doesn't seem to be part of these patches.
But of course the maintainers had to play tough when it was first sent and now the Google dev doesn't want to bother anymore, since they probably use it internally at Google. zzz
Originally posted by F.Ultra View Post
Yes, but not for the WFMO API, since the list can and will change between each call, and from what I've heard many games tend to do really crazy stuff here. Assuming long-term subscriptions leads to the esync problem of exhausted resources.
In the implementation that I am proposing, you can recycle the wait list entries even if the number changes between each WFMO call: you just keep the maximum number you have used. There is no upper limit in that sense, since each entry is just a small amount of memory; it could be many millions.
As I said above, I have use cases with about 100 for a single thread, and also use cases where the number is usually smaller but actually unlimited, and there would be a problem if I were to have a fixed limit like Windows (and maybe futex2) with 64/65 per thread.
Originally posted by indepe View Post
As I said above, I have use cases with about 100 for a single thread, and also use cases where the number is usually smaller but actually unlimited, and there would be a problem if I were to have a fixed limit like Windows (and maybe futex2) with 64/65 per thread.
Also, you have games that create thousands to millions of WaitForMultipleObjects handles that they may garbage collect in the distant future; this is what has broken a lot of attempted WINE solutions. The record for a working 64-bit game under Windows is over 1 billion created, consuming over 4 GB of RAM in locking data. Yes, that game had well overblown RAM requirements due to crap locking.
If you watch the YouTube video, you will find the minimum rate your locking needs to cope with.
Windows game engines can be performing locking operations at a scary rate of over 42,000 locks per second, which is why a leak in this department goes insane quickly. Framerates like 240 frames per second are nothing next to that: even at 240 fps, 42,000 operations per second is roughly 175 locking operations per frame. The reality is that a 4 percent saving will allow more game logic operations to complete per frame, including extra optional checking for cheating.
Yes, the video does note there are no frame rate gains. But games have a lot of optional things they can run, and some of those things not running will either get you banned or, even in single player, cause the game to crash.
64-65 per thread will not cut it when you look at how current day Windows game engines work.
PS: do note the upstreaming of futex2 is happening in the real-time tree first, because the locking here is a very latency-sensitive form of locking.
Last edited by oiaohm; 04 June 2021, 09:53 PM.
Originally posted by oiaohm View Post
Windows game engines can be performing locking operations at a scary rate of over 42,000 locks per second.
As usual when it comes to multithreading, you only have a vague idea of what you are talking about, and it is an endless endeavor to try to get on the same page with you. I currently don't have the time to spend on that Sisyphean work.
Originally posted by oiaohm View Post
64-65 per thread will not cut it when you look at how current day Windows game engines work.
Last edited by indepe; 04 June 2021, 10:28 PM.
Originally posted by indepe View Post
The number of 42,000 refers to futex operations, not locks, and I have already discussed that number.
The case documented with WINE when they moved away from file-based sync was over 30,000 locks for a single thread and over 1,500 wait-multi calls.
indepe, this is an Intel tutorial; go read "Table 5.7: Multi-threaded Main Thread Render Function". You have the game engine main loop creating more and more WaitForMultipleObjects calls, so you have at least as many WaitForMultipleObjects calls as the frame rate in modern multi-threaded game engines. Of course this gets worse as it is used in more places.
This repo contains the DirectX Graphics samples that demonstrate how to build graphics intensive applications on Windows. - microsoft/DirectX-Graphics-Samples
Yes, that Intel example lines up with the Microsoft example.
WaitForMultipleObjects is massively used. Creating a new set of locks every frame and terminating all of them at the end of the frame is what game engines are doing. Of course some of those are then leaked and garbage collected by the game engine code instead of being fixed properly.
Game engines are horrible.
Originally posted by oiaohm View Post
Please note we only have a solid count on the futex syscalls, and that is not all the futex operations: futex operations that are resolved by atomic operations do not cause a syscall; it is contended locks, or lock create/delete, that trigger the syscall.
And futexes are not only used by locks, but also, for example, by semaphores, WaitForSingleObject, and WaitForMultipleObjects in that WINE test using futexes. And, I guess, by condition variables.
Originally posted by oiaohm View Post
The case documented with WINE when they moved away from file-based sync was over 30,000 locks for a single thread and over 1,500 wait-multi calls.
Originally posted by oiaohm View Post
https://software.intel.com/content/w...12-part-5.html
indepe, this is an Intel tutorial; go read "Table 5.7: Multi-threaded Main Thread Render Function". You have the game engine main loop creating more and more WaitForMultipleObjects calls, so you have at least as many WaitForMultipleObjects calls as the frame rate in modern multi-threaded game engines. Of course this gets worse as it is used in more places.
https://github.com/microsoft/DirectX...ading.cpp#L741
Yes, that Intel example lines up with the Microsoft example.
Sorry, I currently don't have the time to follow the links or to give you more detailed answers.
Last edited by indepe; 04 June 2021, 11:47 PM.
Originally posted by oiaohm View Post
The case documented with WINE when they moved away from file-based sync was over 30,000 locks for a single thread and over 1,500 wait-multi calls.
So any problem would depend on contention and other additional factors.