I disagree with some of your assumptions from experience (such as the primary not going away, or that burdening users with picking their highest-refresh-rate screen as primary is OK, etc.), so I think we've probably hit another dead-end. Let's resume the halt state. :-)
KDE Lands Wayland Fractional Scaling Support
Originally posted by xfcemint: Ah, but I said I want everything perfectly synchronized, as a design goal.
Perfectly synchronized is slow.
Originally posted by xfcemint: That's a bad non-synchronized design.
Originally posted by xfcemint: From my perspective, perfect synchronization is just a design paradigm on one side, and some minor tweaks on the implementation side.
Hardware is asynchronous these days, so it does not operate with perfect-synchronization logic.
Sho_: Heck, even in the clipboard case you want to avoid data copies.
Originally posted by xfcemint: That might have been true before 1995, due to low DRAM capacities. Today is 2022.
xfcemint, think about the operations you need to perform to make a copy:
1) Allocate new pages and copy the data into them.
2) Create the page-table entries.
All of these operations consume CPU time.
Zero-copy:
1) Create page-table entries and mark the data copy-on-write.
Zero-copy of course does not work over a network, but it is a lot cheaper in CPU usage.
Data copies are expensive in CPU processing time.
The reality is I could give you a computer with terabytes of RAM and a program that is only ever going to use 16 MB of it, and it could still have performance problems because it does not limit how much data it copies around. xfcemint, increasing RAM does not address the fundamental reason why you want to avoid data copies: you want to avoid them in most cases to reduce the CPU and GPU processing time spent allocating memory and shuffling bytes, and a lot of that allocation work is not doing anything useful for the user.
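To make the difference concrete, here is a minimal sketch in Python (Linux, Python 3.8+). It is only an illustration of the idea: real Wayland clients pass file descriptors over the protocol socket instead of serialising the bytes, and every name in this snippet is made up for the example.

```python
# Copy vs. zero-copy hand-off of a large buffer between a "sender" and a
# "receiver" on the same machine. The zero-copy path fills a memfd once and
# then only maps the same pages again; no second byte-for-byte copy happens.
import mmap
import os

PAYLOAD = b"clipboard contents " * 1_000_000  # ~19 MB of example data

def copy_transfer(data: bytes) -> bytes:
    # "Copy" path: allocate fresh pages and duplicate every byte.
    return bytes(data)

def zerocopy_publish(data: bytes) -> int:
    # Put the data into an anonymous memfd; a real client would send this fd
    # to the other process over a unix socket (SCM_RIGHTS) instead of the bytes.
    fd = os.memfd_create("shared-buffer")
    os.ftruncate(fd, len(data))
    with mmap.mmap(fd, len(data)) as view:
        view[:] = data  # the one initial copy that populates the shared pages
    return fd

def zerocopy_receive(fd: int, length: int) -> mmap.mmap:
    # The receiver maps the same physical pages; only page-table entries are
    # created here, no bytes are copied.
    return mmap.mmap(fd, length)

copied = copy_transfer(PAYLOAD)                 # burns CPU proportional to size
fd = zerocopy_publish(PAYLOAD)
shared = zerocopy_receive(fd, len(PAYLOAD))
assert copied[:19] == b"clipboard contents "
assert shared[:19] == b"clipboard contents "
shared.close()
os.close(fd)
```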
-
Originally posted by xfcemint: I would like to quickly sum up my discussion about Wayland with user Sho_, for those who do not want to read the entire discussion.
Sho was defending the current design of Wayland, including the concepts of no-global-state, out-of-band communications and non-synchronous events.
I immediately concurred on the no-global-state paradigm.
I disagreed with Sho on the concepts of out-of-band communication and non-synchronous events. It can easily be verified that those were the most important points of mine, made much earlier than the discussion with Sho started.
The entire beginning of the discussion is just one big misunderstanding about the exact meaning of out-of-band and synchronicity. It can be easily verified that my notions of those terms are the widely accepted ones.
Sho was at one moment trying to defend Wayland by an obvious attempt to re-define out-of-band communication. This attempt at re-definition was unsuccessful. Note that Sho is just repeating the rationalizations (see Wikipedia) of the Wayland designers: they also want to "win" this argument by re-defining "out-of-band" with some self-contradictory statement (again, see Wikipedia). Of course, that doesn't work.
Then, Sho attempted to claim that synchronicity is not important and that he can't find any use cases for it. I supplied an example of a clear use case.
Sho attempted to claim that my design has global state, which turned out to be a very shallow misinterpretation on his side.
Sho attempted to claim that my design cannot provide well-defined notions of some synchronous events, like frame IDs. As a response, I produced a simple and usable design that proves the contrary.
Finally, Sho attempted not to read my posts and to thoroughly misinterpret them. I objected.
In the end, my opinion is that I have clearly won the argument, extremely conclusively. In conclusion, the design of Wayland is sub-par, and needs to be changed.
-
Originally posted by xfcemint: I disagreed with Sho on the concepts of out-of-band communication and non-synchronous events. It can easily be verified that those were the most important points of mine, made much earlier than the discussion with Sho started.
Out-of-band/adjunct protocols are a different thing from not allowing "non-synchronous events".
Originally posted by xfcemint: The entire beginning of the discussion is just one big misunderstanding about the exact meaning of out-of-band and synchronicity. It can be easily verified that my notions of those terms are the widely accepted ones.
DMABUF, used by Wayland for graphics, exposes how the GPU behaves. OpenGL under X11 is out of band and always has been.
The hard reality is that modern GPUs are non-synchronous. If you screen-capture via the GPU, you will have some amount of lag; you will not get the captured output back from the GPU at the same time it is rendering it out to the screen. That is just the way it is. Yes, it is even possible to get a screen capture with negative lag compared to the screen output.
Modern systems have multiple clock domains that all get out of sync all the time.
Originally posted by xfcemint: Then, Sho attempted to claim that synchronicity is not important and that he can't find any use cases for it. I supplied an example of a clear use case.
Originally posted by xfcemint: The user can CHANGE the primary display at any moment. The frame IDs do not have a fixed rate. The frame IDs don't have ANY rate. Each frame can have a different duration, decided by the compositor.
Originally posted by xfcemint: The compositor can AUTOMATICALLY switch the primary to another display, if necessary. The concept of "primary display" does not exist in the protocol; it is just an implementation detail of the compositor.
Multi-core CPUs with dynamic clock speeds are always in a state of being out of sync between their multiple time domains.
Proper synchronicity turns out not to be important on modern-day systems; in fact, on most modern systems it does not exist. The computer in front of you, xfcemint, I would guess has a modern multi-core x86 CPU with dynamic clock speeds and a modern GPU. No matter what OS you load, that hardware is only ever going to be synchronized near enough that a human does not notice.
A non-synchronous event means you do not have to worry, while you are processing it, about whether the clock domains are in sync or not. The more processing you can do non-synchronously, the better multi-core CPU/GPU usage you can get.
Yes, part of being asynchronous and getting the best performance is having some non-synchronous events that are not heavily time-locked, which you can schedule into periods of low processing load.
Synchronicity is not how the majority of modern computer hardware works; near-enough synchronicity is. Yes, a means to schedule unimportant tasks that do not depend on being synchronized is important for getting good hardware usage, as is a means to throw those tasks off to independent processes on the CPU so they can be scheduled correctly.
Basically, having no "non-synchronous events" does not fly on modern hardware if you want good hardware usage.
-
Originally posted by xfcemint: Excellent, you I like so much. Love you much, kisses.
Phoronix: Wine's Wayland Driver Is Becoming Mature, May Aim For Upstreaming Early Next Year
This post of yours reads purely like you are describing an Arcan-class protocol, not a protocol that rejects "non-synchronous events".
xfcemint, no smart-ass response this time. Sho and I have both most likely not been considering your failure to see that "non-synchronous events" are not optional. "Non-synchronous events" are something a desktop needs. Not all desktop events need to be completed quickly.
Think about an application asking the compositor/desktop whether hotkey X is available to be claimed, versus time-sensitive output. An application asking whether hotkey X is available is not an event that needs to be synchronized.
Desktops are a mix of "non-synchronous events" and "synchronous events".
Also, hardware these days is being designed to be more effective with non-synchronous operation than with synchronous operation.
The base issue here, xfcemint, is Amdahl's Law. If you do not allow "non-synchronous events", how are you going to achieve high-efficiency usage of multi-core hardware? Synchronous designs, like it or not, always end up with stall points: bits that cannot be processed independently in parallel, and by Amdahl's Law those always end up undermining efficiency on multi-core hardware. (The formula is below.)
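For reference, Amdahl's Law in its standard form, where p is the fraction of the work that can run in parallel and N is the number of cores:

\[
S(N) = \frac{1}{(1 - p) + \frac{p}{N}}, \qquad \lim_{N \to \infty} S(N) = \frac{1}{1 - p}
\]

Every synchronous round-trip that forces the rest of the system to wait goes into the (1 - p) term, and that term alone caps the speedup no matter how many cores you have.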
-
Originally posted by xfcemint:
Love love love you you you. You smile me make.
xfcemint, no smart-ass response this time. Sho and I have both most likely not been considering your failure to see that "non-synchronous events" are not optional. "Non-synchronous events" are something a desktop needs. Not all desktop events need to be completed quickly.
-
Originally posted by xfcemint: You are right that there can be some non-synchronous events, but they should be used with caution and very sparingly.
The primary use (that I can predict) for non-synchronous events is the various abort signals, which should go through a special high-priority "abort channel".
Generally speaking, non-synchronous events should be avoided. Their primary use should be for exceptional circumstances.
In that sense, what you have been saying is, generally speaking, incorrect, because all the major functionality can be implemented, without much trouble, using just synchronous events.
Amdahl's Law is very important. Yes, this is parallel programming: modern GPUs and CPUs are parallel systems.
Synchronous events on parallel systems have to be counted, under Amdahl's Law, as single-threaded work.
Synchronous = ending up with single-threaded effects.
Why go asynchronous/non-synchronous wherever possible? Simple: that is the path towards ideal performance scaling under Amdahl's Law.
Yes, you can implement everything as synchronous, but then don't be surprised when its performance does not scale.
It is really instructive to take Amdahl's Law and put it into a graphing program (a small sketch of that is below). The parallel percentage is the important factor, and synchronous events lower your parallel percentage.
Of course, something with 0 percent parallel will only ever have the performance of a single thread on a single core.
Implementing synchronous events is simpler than implementing asynchronous/non-synchronous events, but you pay for synchronous events with a performance cost on parallel systems.
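Here is a minimal sketch of that graphing exercise in Python; the parallel fractions and core counts are illustrative values I picked, not measurements of any real compositor.

```python
# Amdahl's Law: S(N) = 1 / ((1 - p) + p / N), where p is the parallel fraction
# of the work and N is the core count. Print the speedup for a few values of p
# to see how quickly the serial (synchronous) part becomes the cap.

def amdahl_speedup(p: float, n: int) -> float:
    return 1.0 / ((1.0 - p) + p / n)

core_counts = [1, 2, 4, 8, 16, 64, 1024]   # CPU-ish up to GPU-ish core counts
parallel_fractions = [0.50, 0.90, 0.99]    # illustrative parallel percentages

for p in parallel_fractions:
    row = ", ".join(f"{n} cores: {amdahl_speedup(p, n):6.1f}x" for n in core_counts)
    print(f"p = {p:.2f} -> {row}")

# Even with p = 0.99, 1024 cores only give roughly a 91x speedup, nowhere near
# 1024x: the remaining 1% of serial, synchronous work is what sets the ceiling.
```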
Let's look at some of the stuff in xdg-desktop-portal.
Take setting the background wallpaper, "org.freedesktop.portal.Wallpaper":
1) Does this need to be synchronous? No, it does not.
2) Could you implement it synchronously? Yes, you could.
This is not an abort signal. This is not a high-priority event. It is an event where, if it did not happen for a second, the user is not going to care; heck, they might not even care if it takes almost a minute to happen.
Is this worth lowering your parallel percentage for? I would say absolutely not. Remember, when you lower the parallel percentage on a multi-core system you are capping your maximum performance. This is an item that is absolutely better done as an asynchronous/non-synchronous event, and it is not the only one. (A sketch of such a fire-and-return-later call is below.)
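A minimal sketch of that call made asynchronously, using Python with dbus-python and GLib. The bus name, object path, the SetWallpaperURI method and the "show-preview" option are from my reading of the xdg-desktop-portal documentation, so treat them as assumptions and check the spec before relying on them; the point is only that the request is fired off and the reply is handled whenever it arrives, with nothing stalling in between.

```python
# Fire-and-return-later wallpaper change via xdg-desktop-portal.
# Nothing here blocks waiting for the portal; the reply arrives on the main loop.
import dbus
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib

DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()
portal = bus.get_object("org.freedesktop.portal.Desktop",
                        "/org/freedesktop/portal/desktop")
wallpaper = dbus.Interface(portal, "org.freedesktop.portal.Wallpaper")

def on_reply(request_handle):
    print("portal accepted the request:", request_handle)

def on_error(error):
    print("portal call failed:", error)

options = dbus.Dictionary({"show-preview": False}, signature="sv")
# Returns immediately; reply_handler/error_handler run later from the main loop.
wallpaper.SetWallpaperURI("", "file:///tmp/example-wallpaper.png", options,
                          reply_handler=on_reply, error_handler=on_error)

loop = GLib.MainLoop()
GLib.timeout_add_seconds(5, lambda: loop.quit())  # keep the example bounded
loop.run()
```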
There are high-priority events that need to be synced by some method; then the
low-priority events break into two types:
1) Events where, if the event never gets processed, the user does not care, but if it does get processed they are happy with the result.
2) Events where, as long as they are processed in a reasonable time, the user does not care.
Neither of these cares about being synchronous.
"freedesktop.portal.GlobalShortcuts"
This one is different everything as CLOCK_MONOTONIC time stamps on it for when keypresses happened. Why not synchronous-ed. You pressed a globalshort cut the application was in background so it was not running. Are you going to stall the system to wake process up to give the process the message the answer is hell no.
This one brings a different question do I need to be synchronous or do I just need to know what time event happened. CLOCK_MONOTONIC is not perfect there is drift that always being corrected. CLOCK_MONOTONIC comes a fixed overhead with Amdahls Law with the Linux kernel and yes CLOCK_MONOTONIC is massively optimized to make the cost as light as possible(its not zero). This here is really asynchronous using a uniform standard clock. Yes this does make things a little harder on application side having to check clocks to make sure everything happened in the right order. Also this is not what you call very network friendly.
CLOCK_MONOTONIC is a clock you cannot set on the Linux kernel it set when the system powers on and starts ticking from that point. This is not a solution useful across network. Why each system has their own CLOCK_MONOTONIC with unique clock starting times. CLOCK_MONOTONIC exists on Linux to get GPU/CPU and so on in the local system all on the same time page.
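A minimal sketch of that "stamp it with the local monotonic clock and sort it out later" idea in Python; the event structure is invented for illustration and is not any portal's actual wire format.

```python
# Tag events with CLOCK_MONOTONIC when they arrive, process them later in
# timestamp order. Nothing blocks waiting for the consumer to be awake.
import time
from dataclasses import dataclass, field
from typing import List

@dataclass(order=True)
class Event:
    timestamp_ns: int                       # CLOCK_MONOTONIC, nanoseconds
    description: str = field(compare=False)

def stamp(description: str) -> Event:
    return Event(time.clock_gettime_ns(time.CLOCK_MONOTONIC), description)

queue: List[Event] = []
queue.append(stamp("global shortcut pressed"))
queue.append(stamp("wallpaper change requested"))
queue.append(stamp("global shortcut released"))

# The consumer can wake up whenever it likes and still reconstruct the order.
for event in sorted(queue):
    print(event.timestamp_ns, event.description)

# Note: these timestamps only mean something on this machine; another machine's
# CLOCK_MONOTONIC starts from a different point, so they cannot be compared
# across a network.
```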
-
The discussion has dragged on so long that I do not even know which post I should quote to respond to xfcemint anymore.
There is stuff I dislike about Wayland, but a separation of sync and async channels is not part of it. For computing efficiency, the trend since the start of the 21st century has been to put as many things into async mode as possible. With multi-core CPUs becoming the norm in every computing device, however low-power, legacy systems that depend on synchrony are being lamented. Here on Phoronix one can see periodic news about how Python is dragging a little more of itself out of this technical debt, or how the Linux kernel and userspace programs are expanding their use of async I/O.
Asynchronous processing and I/O operations are not just for "high-priority" cases. They are for anything that does not absolutely require synchrony, because synchrony requires a global lock, or at least a single-threaded bottleneck. The use of async is not as widespread as it should be only because the dev tools have not caught up from the single-core legacy, which makes async programming harder or less familiar to write than traditional programming. That's it.
-
Originally posted by xfcemint: OK, I'll try to answer. I have already glanced over the text, and I think that your argument is not very good.
About your remark above: GPU timestamps (and other hardware timestamps) are irrelevant, since the compositor is the main authority on time. Time is virtualized, there are only frame IDs. Each frame can take multiple minutes, if that's required for debugging.
That's the answer to irrelevant remark no. 1 from you.
You need to know what order input events happened in. Frame IDs are not going to work; yes, even that 360 Hz monitor gives you a frame rate too low to correctly record input. You need a faster clock than that.
Hardware timestamps are a little more important than you would like. You need to know when image X was displayed on screen relative to input Y, so that when a user clicks on target Z that was on screen at the time, they do not trigger action A because the window was moved after the frame that was displayed to them.
There is a need to manage the different latency issues with input. (A small sketch of matching an input timestamp to the frame that was actually on screen is below.)
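A minimal sketch of that matching in Python, assuming the compositor keeps the presentation timestamps of recent frames on the same monotonic clock as input; all names and numbers here are invented for illustration.

```python
# Given frame presentation timestamps and an input event timestamp (both on the
# same monotonic clock), find which frame was actually on screen when the user
# clicked, so hit-testing uses that frame's window layout rather than a newer one.
from bisect import bisect_right

# (frame_id, presentation_timestamp_ns) in presentation order -- invented values.
frames = [
    (100, 1_000_000_000),
    (101, 1_016_700_000),
    (102, 1_033_300_000),
    (103, 1_050_000_000),
]

def frame_on_screen_at(timestamp_ns: int) -> int:
    """Return the id of the last frame presented at or before timestamp_ns."""
    times = [t for _, t in frames]
    index = bisect_right(times, timestamp_ns) - 1
    if index < 0:
        raise ValueError("input happened before the first recorded frame")
    return frames[index][0]

click_timestamp_ns = 1_040_000_000  # hardware timestamp of the mouse click
print(frame_on_screen_at(click_timestamp_ns))  # -> 102, not the newest frame 103
```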
Originally posted by xfcemint: The compositor is synchronized, not single-threaded. The easiest implementation of the compositor is a single-threaded one.
There would be some opportunities for multi-threading in the compositor, but that is largely irrelevant because the compositor doesn't require much CPU time. The clients are the ones that require CPU time and multi-threading.
So, the compositor can be single-threaded, and that would be just fine.
That's the answer to irrelevant remark no. 2 from you.
No, this is you just straight-up ignoring Amdahl's Law, and you have made a key mistake. As the number of cores increases, so does the effect of single-threaded sections. Notice that you say the compositor does not use much CPU time; the worst part is that a compositor will at times be driving a GPU with thousands of cores, so being single-threaded on the CPU means not feeding information to the GPU fast enough and stalling out GPU performance.
The reality is that the number of cores in CPUs and in GPUs is growing at such a rate that what used to be minor overheads are becoming quite large.
The compositor does not work in isolation. This is also why the hardware clocks are important.
Originally posted by xfcemint: Yep, the compositor is not going to wait for the background app. This doesn't affect other clients, because each client has a separate connection and a separate event queue.
That's the answer. Your question is not completely irrelevant, but it is a minor issue, easily solved.
That global hotkey one was a trap, and you walked straight into it. So you are not going to wait and make sure that the client is not dead? Remember, if a dead client is holding a global shortcut, nothing else can use it. Yes, D-Bus sent the application the message that it has a global shortcut; that has to be monitored so a problem can be detected.
Do you want the primary compositor having to perform watchdog tasks? If the compositor is wasting its processing time on watchdog tasks, that is going to affect overall performance.
You just presumed this had a simple solution.
Originally posted by xfcemint: The problem here is that you are mixing up event time and realtime timestamps. Those two can easily be separated. Event time is completely virtualized.
Realtime timestamps can be:
- virtualized: i.e. elapsed time since client connected, but time does not flow while paused for debugging.
- true time: These are problematic and should be avoided. They represent global state. But this has nothing to do with my protocol design; the problem of true time is unavoidable. One of the methods is to use some jitter on true time, to avoid global state. For example: the granularity of true time is approx. 1 frame = 16 ms, with some jitter.
Completely irrelevant; that's your irrelevant remark no. 3.
The graphics frame rate is one of the slower items the compositor has to deal with. People commonly think about the slower items, like the keyboard; the fast items the compositor has to deal with are modern mice and drawing tablets. Of course, mice and drawing tablets can all be operating on different clocks. I remember when mice at 100 Hz and less were normal. Lots of things have gotten faster over the years on the input side.
The Linux kernel has chosen CLOCK_MONOTONIC so that it is fast enough for all hardware input, output and so on. Arcan has chosen a different route here, and Arcan does not bind everything to the frame rate either. The frame rate needs to be relative to something else, like it or not; yes, something faster.
Virtualized time is a problem. The CLOCK_MONOTONIC approach works because when that clock stops, all means of feeding in input stop as well. Pausing the clock while debugging and queueing up inputs can result in some very horrible outcomes. So you have to get from the hardware timestamps to whatever time base you are using.
xfcemint, sorry, you got zero right.