The Wayland Situation: Facts About X vs. Wayland
Written by Eric Griffith in Display Drivers on 7 June 2013.

With the continued speculation and FUD about the future of Wayland at a time when Canonical is investing heavily in their own Mir Display Server alternative, Eric Griffith, with input from Daniel Stone, has written an article for Phoronix that lays out all the facts. The "Wayland Situation" is explained by first going over the failings of X, then the fixings of Wayland, common misconceptions about X and Wayland, and finally a few other advantages of Wayland. For anyone interested in X/Wayland or the Linux desktop at a technical level, it's an article certainly worth reading!

Introduction

An overview of the problems, fixes and features in relation to X and Wayland. Written by Eric Griffith, with input by Daniel Stone-- Ericg and Daniels in the Phoronix Forums, respectively. Edited and fact-checked by Daniel Stone. To be posted by Michael Larabel on Phoronix.

Released as per Creative Commons version 3, with Attribution.

This document was pieced together by a volunteer contributor using presentations from Keith Packard, David Airlie, Daniel Stone, Kristian Høgsberg; as well as the X11, X12, and Wayland Wiki & Freedesktop.org pages, and by direct question-answer sessions with developers.

Since its first announcement many years ago there has been much information, misinformation, misconceptions and sheer FUD spread about Wayland-- the next-generation replacement for the X Window System. This overview hopes to clear up the "Wayland-Situation."

The Failings of X

Personally I believe that the benefits, and the point, of Wayland are best understood from the point of view of X's faults and failings. So let's get started...

I) We've spent the last 10 years or so “fixing” the X server by wrapping it in more and more extensions and plugins. Problem with that though is...X only has minimal versioning support in its extension system.

            A) Versioning is handled per client, not per bind. So if your app supports one version of a given extension but your toolkit supports another, you can't predict which version of that extension you will get.

            B) Theoretical example: Rekonq supports Xinput 2.2. Kdelibs support Xinput 2.0, Flash plugin only supports Core X11...all of those things are gonna fight over what version of Input “Rekonq” supports and in the end you're gonna get one version to support everything...may not be the version that EVERYTHING supports though.

            C) If you're lucky, you will be given the lowest version supported and everything will hopefully work fine. If you're unlucky you will be given the highest version supported and you will be sending useless, potentially error-ridden data between the client and the X server.

II) X has 4 input subsystems: Core X11, Xinput 1.0, Xinput 2.0 and Xinput 2.2. Xinput 1.0 has been scrapped, but the remaining three are more co-dependent than independent. As Daniel Stone put it “There's about three people who REALLY understand how the Input subsystems are all held together...and I really wish I wasn't one of them.”

III) Many years ago, someone had an idea “Mechanism, not policy.” What did that mean? It means that X has its own X-Specific drawing API, it is its own toolkit like GTK+ or Qt. It defined the low-level things, such as lines, wide-lines, arcs, circles, rudimentary fonts and other 'building block' pieces that are completely useless on their own. Note from Daniel: “Funny Story: Wide lines have to be pixel-perfect with the spec, which defines them to look ugly.”

IV) The X Server is huge and stupid. Before we (the community) began to scrap pieces of it and work around it, it was almost an entire OS.

            A) Don't believe me? X had its own print server. It got binned after someone added Xprint support to glxgears.

            B) It was a binary interpreter for ELF, COFF and a.out.

V) Compositing & Window Coherence. The developers taught X about compositing through the Composite Extension. For basic (e.g. desktop) GL compositing it's fine. If you want to use hardware overlays though (videos) it becomes a complete disaster.

            A) Media Coherence. What's Media Coherence? In its simplest terms... Your browser window? That's a window. Your Flash player window on YouTube? The Flash player itself, displaying the video, is a sub-window. What keeps them in sync? Absolutely nothing. The events are handled separately and right now you just pray that they don't get processed too far apart. Which is why when you scroll on YouTube, or other video sites with a video playing, sometimes everything tears and chunks.

VI) Fonts. The developers tried to teach the X server about fonts through the STSF extension. The idea was to store the font server-side and then give the clients enough information that they could figure out the proper layout of the font on their own. The information needed to do that, though, ended up being larger than the font itself. So it was decided to just shove the font down the wire and let clients deal with it themselves.

VII) Statelessness.... Or in other words: X Doesn't remember anything.

            A) “Please generate me a config file........Please actually USE this config file.” Why?? Eventually fixed by making the X-server only use a config file for overrides and making it know and have SANE defaults / auto-detection.

            B) Who's ever had problems with multiple monitors under Linux? OR ever had to re-setup all of your monitors after a reboot? All X's fault unless you store it in /etc/X11/xorg.conf.d/50-monitors.conf, then it DOES remember it...but you probably had to write that by hand.

            C) This will hopefully be fixed by the creation of libkscreen, a wrapper for xrandr that DOES remember which monitors go where, it remembers them by their EDID so that they are unique.

            D) For a long time, maybe even still, when you plug in an extra monitor under Linux your main monitor could have compositing, but your extra one could not. This MAY be fixed by RandR1.4 but this author could not find a solid yes or no to that point.
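
The EDID-keyed memory described in C) can be sketched in a few lines. This is only an illustration of the idea, not libkscreen's actual implementation: the LayoutStore class, its methods and the byte strings standing in for EDID blobs are all hypothetical.

```python
class LayoutStore:
    """Hypothetical sketch: remember monitor positions per set of connected EDIDs."""

    def __init__(self):
        self._layouts = {}

    def save(self, positions):
        # positions maps a raw EDID blob (bytes) to an (x, y) offset.
        # The key is the *set* of EDIDs, so plug order does not matter.
        self._layouts[frozenset(positions)] = dict(positions)

    def restore(self, connected_edids):
        # Returns the saved layout for exactly this monitor set, or None.
        return self._layouts.get(frozenset(connected_edids))


store = LayoutStore()
store.save({b"edid-laptop": (0, 0), b"edid-external": (1920, 0)})

# Same two monitors, detected in a different order after a reboot.
print(store.restore([b"edid-external", b"edid-laptop"]))
# A monitor set that was never configured has no stored layout.
print(store.restore([b"edid-laptop"]))
```

Keying on the whole set of EDIDs, rather than on connector names, is what lets a layout survive a reboot or a different port.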

VIII) The window tree is a complete mess. Under X every input and text box was its own window which was parented by the window above it. Which is why no one understands the function that validates the window-tree. REAL (i.e. not Core X11) toolkits threw this out the window a long time ago. No pun intended.

IX) It's a nitpick, but it's also a valid concern... Under X11, the global pixel counter is 15 bits. Which means that, between all of your displays, you can only have 32,768 pixels along each axis. At 100dpi that gives you 8.3 meters of display. Awesome... for comparison though, Windows XP assumes 96dpi. My phone has 320+dpi. Add in higher resolutions AND multiple displays...and things get dicey REALLY quickly.
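
The arithmetic behind that figure is easy to check. The sketch below is just a back-of-the-envelope calculation; the function name is made up for illustration, and the densities are the ones quoted above.

```python
INCH_IN_METERS = 0.0254


def max_span_meters(coordinate_bits, dpi):
    """Physical width addressable by a coordinate counter at a given pixel density."""
    pixels = 2 ** coordinate_bits
    return pixels / dpi * INCH_IN_METERS


for dpi in (96, 100, 320):
    print(f"{dpi:>3} dpi: {max_span_meters(15, dpi):.2f} m")
# prints:
#  96 dpi: 8.67 m
# 100 dpi: 8.32 m
# 320 dpi: 2.60 m
```

At phone-class densities the same 15 bits cover well under three meters, shared between every display you have.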

X) Everything is a window to X. There are no different window types, it's just “A window.”

            A) Your screensaver? It's a window that told X:
                        1) Put me above all other windows, at all times.
                        2) Make me fullscreen.
                        3) Give me all input.

            B) A pop-up window? It's a window that told X:
                        1) Put me RIGHT HERE.
                        2) Give me all input.

            C) Problem? For one: they clash. Your screensaver won't activate while a pop-up window is up because they conflict.

            D) Your screensaver, and screenlocker, probably didn't hook into all the necessary libraries to understand media keys... the problem there is when you're working at home listening to some music, you get up to leave, close the lid and head out. Laptop's asleep, screensaver is the 'active' window. As soon as you open the lid up, your music kicks back in, blaring out of your speakers, and it's just easier for you to close the lid again and deal with it later rather than scramble to put in your password, open the media player and pause it, or hit mute.

            E) The developers tried to fix it. They specced out an extension, had the theory ready. But when it came time to implement it, they realized it would break the X Model too badly. This has been broken for 26 years, and it's going to STAY broken. Enjoy.

XI) “But Eric, if X11 is so terrible why not just make X12 rather than a whole new protocol?” They did, technically anyway: http://www.x.org/wiki/Development/X12

One big problem with keeping it under the “X” umbrella: Anyone who cares about X would have a say in a future version of it. By calling it “Wayland” they avoid that issue. No one cares. It's an unrelated project, they (the developers) can do what THEY want with their future display server, and the people who care about X can go and make X12.

The Fixings Of Wayland

(numbered to match-up with the failings of X.)

I) The entire protocol is versioned. Every listener gets exactly the version they support, nothing more. No more randomness.
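
The difference between per-client and per-bind versioning can be shown with a toy model. Everything here is hypothetical (class names, method names, and the "first request wins" rule for the per-client case are assumptions made for illustration); it is not the X11 or Wayland API.

```python
class PerClientServer:
    """Toy model: one negotiated version per client connection."""

    def __init__(self, server_version):
        self.server_version = server_version
        self.negotiated = {}  # client name -> version shared by all its components

    def negotiate(self, client, requested):
        # Assumption for this sketch: the first request on a connection
        # decides the version for every component sharing that connection.
        if client not in self.negotiated:
            self.negotiated[client] = min(requested, self.server_version)
        return self.negotiated[client]


class PerBindServer:
    """Toy model: every bind carries its own version."""

    def __init__(self, server_version):
        self.server_version = server_version

    def bind(self, requested):
        # Each listener gets at most what it asked for, independently.
        return min(requested, self.server_version)


app, toolkit = (2, 2), (2, 0)  # e.g. the app speaks 2.2, its toolkit only 2.0

x_like = PerClientServer((2, 2))
print(x_like.negotiate("rekonq", app), x_like.negotiate("rekonq", toolkit))
# prints: (2, 2) (2, 2)   -> the toolkit is handed a version it never asked for

wayland_like = PerBindServer((2, 2))
print(wayland_like.bind(app), wayland_like.bind(toolkit))
# prints: (2, 2) (2, 0)   -> each component gets exactly what it supports
```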

II) The input system in Wayland looks a lot like Xinput 2.2, minus all the legacy cruft and minus the Master/Slave relationship between inputs. Everything gets one virtual keyboard, one virtual mouse, and one non-virtual tablet interface. The nightmare called multitouch will finally be sorted out. Note from Daniel: As one of the authors of multitouch, I feel pretty qualified to say that it's shit.

III) Wayland HAS no drawing API to mess around with. Wayland wants buffers filled with pixels from clients and, aside from the security hooks to make sure clients aren't messing with each other's buffers, it doesn't care how those pixels got there. Clients control what pixels those buffers hold, so what gets displayed on screen is EXACTLY what the client wants.
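
As a rough sketch of what "buffers filled with pixels" means: the client writes raw pixel values into a block of memory, and the other side only ever reads them back. The example assumes a 32-bit ARGB layout and uses a plain bytearray as a stand-in for whatever shared buffer a real client would use; the function names are made up for illustration.

```python
import struct

WIDTH, HEIGHT = 4, 2
BYTES_PER_PIXEL = 4
STRIDE = WIDTH * BYTES_PER_PIXEL  # bytes per row


def fill_argb8888(buffer, a, r, g, b):
    """Client side: write one color into every pixel of the buffer."""
    pixel = struct.pack("<I", (a << 24) | (r << 16) | (g << 8) | b)
    buffer[:] = pixel * (len(buffer) // BYTES_PER_PIXEL)


def compositor_read_pixel(buffer, x, y):
    """Compositor side: read a pixel without knowing how it was drawn."""
    offset = y * STRIDE + x * BYTES_PER_PIXEL
    return struct.unpack_from("<I", buffer, offset)[0]


buf = bytearray(STRIDE * HEIGHT)  # stand-in for a shared buffer
fill_argb8888(buf, 0xFF, 0x20, 0x40, 0x80)
print(hex(compositor_read_pixel(buf, 3, 1)))  # prints: 0xff204080
```

Whether those bytes came from a toolkit, a GPU or a hand-rolled loop like this one is invisible to the reader of the buffer, which is the whole point.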

IV) Wayland is minimal. There's no over-arching Pseudo-OS controlling your graphics display. There's no 26-year-old API “Getting in the way.” Clients carry the brunt of the work, which is okay because clients don't have to maintain extreme-backwards-compatibility. Qt5 dropped Qt3 support. X still has to maintain things that were written 26 years ago. And things from 26 years ago are getting in the way of fixing things TODAY.

            A) Note from Daniel: Wayland is also non-blocking, so your entire desktop doesn't stop rendering just because one client hangs or is going through an expensive operation. Only THAT client stops rendering.

V) Composition is mandatory under Wayland. That isn't to say that everything has to have 3D effects or wobbly windows. By composition we mean that everything is tear-free, flicker-free and flash-free. Wayland's motto is “Every frame is perfect.” Every pixel is exactly WHAT it should be, WHERE it should be, and there WHEN it's supposed to be there-- as dictated by the clients.

VI) Clients handle fonts, like they do now anyway.

VII) Multi-monitor support is a client problem. Same with multiple graphics cards (Optimus). Wayland only wants buffers filled with pixels and to be told where to display them. It doesn't care how they got there.

VIII) Unlike X, which had everything as its own window at first, Wayland supports 2 types of windows: top-level windows, which are essentially wrappers around multiple buffers, and sub-surface windows, which are mainly targeted at video playback.

            A) They are kept coherent though, unlike X, so no tearing or thrashing of the window just because you scrolled down the YouTube comments section while the video was playing.

IX) Wayland doesn't deal in global coordinates, at least not publicly. It deals in surface-relative coordinates. Wayland's coordinate counter is 31 bits, which means each SURFACE (read: window) can be 2,147,483,648 pixels along each axis.
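
For a sense of scale, here is the 31-bit figure worked out next to X11's 15-bit limit. The 100 dpi density is just the example value used earlier in this article.

```python
INCH_IN_METERS = 0.0254

x11_pixels = 2 ** 15      # 32,768
wayland_pixels = 2 ** 31  # 2,147,483,648

ratio = wayland_pixels // x11_pixels
span_km = wayland_pixels / 100 * INCH_IN_METERS / 1000  # at 100 dpi

print(f"{wayland_pixels:,} pixels per surface axis")
print(f"{ratio:,}x the X11 limit")
print(f"about {span_km:.0f} km at 100 dpi")
# prints:
# 2,147,483,648 pixels per surface axis
# 65,536x the X11 limit
# about 545 km at 100 dpi
```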


X) As a security precaution...your screensaver and locker are a part of the compositor. This has an added benefit because your compositor (e.g. kwin) DOES understand your media keys, so even while your screen is locked you can still mute your media.

Some Misconceptions about X and Wayland

I) “X is The UNIX Way.” The Unix Way says to do one thing and do it well-- X handled printing, it handled buffer management, it was its own toolkit, it handled fonts, it was a binary interpreter, along with loads of other things. What ONE THING was X doing and what ONE THING was X doing well?

II) “X is Network Transparent.” Wrong. It's not. Core X and DRI-1 were network transparent. No one uses either one. Shared-Memory, DRI-2 and DRI-3000 are NOT network transparent, they do NOT work over the network. Modern day X comes down to synchronous, poorly done VNC. If it were poorly done, async VNC then maybe we could make it work. But it's not. Xlib is synchronous (and the movement to XCB is a slow one) which makes networking a NIGHTMARE.
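
To see why synchronous round trips hurt so much over a network, here is a rough cost model. The numbers (200 requests that each need a reply, a 20 ms round trip, 0.1 ms to send a request) are assumptions picked for illustration, not measurements, and the function names are hypothetical.

```python
def synchronous_seconds(requests, round_trip_s):
    """Every request waits for its reply before the next one is sent."""
    return requests * round_trip_s


def pipelined_seconds(requests, round_trip_s, send_s):
    """Requests are sent back to back; only the last reply is waited for."""
    return requests * send_s + round_trip_s


REQUESTS = 200
ROUND_TRIP = 0.020  # 20 ms, assumed
SEND = 0.0001       # 0.1 ms per request, assumed

print(f"synchronous: {synchronous_seconds(REQUESTS, ROUND_TRIP):.2f} s")
print(f"pipelined:   {pipelined_seconds(REQUESTS, ROUND_TRIP, SEND):.2f} s")
# prints:
# synchronous: 4.00 s
# pipelined:   0.04 s
```

The synchronous cost scales with latency times the number of round trips, which is why a link that feels fine for bulk transfer can still make a chatty client crawl.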

III) “The Wayland developers are only re-implementing X11 because they don't understand it.” Wrong. Most of the Wayland developers ARE former X11 developers. They know how terrible it is. They know where its failings are. They want to do better than X11.

IV) “Wayland requires 3D.” Wrong. It requires compositing, but that's not necessarily 3D. Nothing in Wayland requires 3D; there is even a Pixman backend for software rendering.

V) “Wayland can't do remoting.” Wrong. Wayland should be BETTER than X at remoting, partially due to its asynchronous-by-design nature. Wayland remoting will probably look like a higher-performance version of VNC; a prototype already exists. And this is without us even giving serious thought to how to make it better. We could probably do better if we tried.

VI) “Wayland breaks everyone's desktop.” Also wrong. Once XWayland is finalized and merged we should have more-or-less perfect backwards compatibility because every X app just gets its own mini X-server to deal with. There is one known snag, and that's with window transformations: because that client's X server is locked to the size of that client's window, the app thinks it's in the top left corner of the screen (yay global coordinates).

A Few Generic Advantages Of Wayland

I) “Every frame is perfect.” Wayland's main goal is that no matter the system load, no matter what is going on, it will be flicker free, tear free, and flash free. Every frame is presented in the correct and proper order (dropping frames is fine, but you won't get frame 199, followed by 205, followed by 200 because they all got sent at roughly the same time and the server picked them at random). Wayland knows what order they came in, what order they need to be displayed in, and knows WHEN they were displayed because EVERYTHING has a timestamp associated with it.
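
The "dropping is fine, reordering is not" rule can be illustrated with a toy scheduler. This is only a sketch of the guarantee, not how any real compositor is written; the function name and the millisecond timings are invented for the example.

```python
from collections import deque


def frames_shown(commits, refresh_times):
    """Toy model of in-order presentation.

    commits: (commit_time_ms, frame_number) pairs, in the order they were committed.
    At each refresh, show the newest frame committed so far and drop older ones.
    """
    queue = deque(commits)
    shown = []
    for now in refresh_times:
        latest = None
        while queue and queue[0][0] <= now:
            latest = queue.popleft()[1]  # older pending frames are skipped
        if latest is not None:
            shown.append((now, latest))
    return shown


commits = [(0, 199), (5, 200), (9, 201), (30, 205)]
print(frames_shown(commits, refresh_times=[16, 32, 48]))
# prints: [(16, 201), (32, 205)]
```

Frames 199 and 200 are dropped, but the frame numbers that do reach the screen only ever increase.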

II) Minimal!!! We've learned the hard way what happens when you have something do a lot of things that also has to maintain backwards compatibility-- 26-year-old mistakes are still biting us in the ass TODAY with X. Let the clients handle things, they can change-- they can break things all they want because it's THEM who have to deal with the fallout of that breakage. We're helping to future-proof Wayland by reducing the surface-area for mistakes.

III) Hardware-specific backends. I'm sure some people saw that the Raspberry Pi got a Wayland-specific backend, and how it allowed the hardware to be taken advantage of more fully. It won't be necessary for all things, most things WON'T need a hardware-specific backend...but it sure is nice to have it available. It means we have freedom, we have the choice to make specific tweaks if we need to. Or if we realize down the road that the main backend has some architectural flaw in it, we can swap it out for one that doesn't.

~Fin~
