Since Canonical's Chase Douglas published his xorg-devel email laying out their proposed specification for the X.Org Gesture Extension protocol v1.0, there have been dissenting opinions from X.Org developers about how gesture recognition should be handled. Part of this proposed gesture extension provides an interface for X clients to register for and receive gesture primitive events, allowing a client to act as a gesture engine. Canonical believes gesture recognition should be handled server-side within X, while X.Org developers -- such as Kristian Høgsberg and MPX-creator Peter Hutterer -- want this input gesture recognition handled client-side, outside the X.Org Server.
In a new email, Chase lays out Canonical's justification for handling gesture recognition within the server, per the X.Org Gesture Extension proposal, for the following reasons:

1. When input events (touches) span multiple windows but the recognized gesture event is supposed to affect only the first window, the X.Org Server should not be firing off those input events to any other windows.

2. If the handling is done by the client, recognition could also occur twice -- once by the window manager and then again by any window clients that want to recognize their own input events. Canonical admits, though, that even if this work were performed twice, it would cause little in the way of latency problems.

3. Lastly, multi-touch events are not exposed to clients in the X.Org Server shipped with Ubuntu 10.10, where Canonical wants this first cut of UTouch / the X.Org Gesture Extension to be deployed.
The common position against placing the gesture recognition engine in the X Server -- favoring the client side instead, such as within the window manager -- can be summarized by Carsten Haitzler: "I absolutely agree with peter on this. Frankly the problem is that input from a mouse, or a tablet or multiple touch-points is ambiguous. You may be painting in GIMP - not trying to "swipe to scroll". I can go on with examples (dragging a slider inside a scrollable area as opposed to wanting to scroll with a drag). Only the client has enough detailed knowledge of the window contents, application mode etc. To possibly make a reliable guess as to which one to do. It's X's job to provide as much device input to the client as it needs in a sane digestible way to make such a decision, but... that's [in my honest opinion] where the server's job ends."
Chase does acknowledge: "It's true that the logic behind point one may be perfectly fine, but having recognition through X inserts a lot of extra code into the server. If we are content with touches being split up into window regions before recognition occurs, then we may be able to get around the need for the X Gesture extension completely. The window manager use case could be supplied through input grabbing and event replaying."
We will see how this ongoing architectural debate ends, but for Ubuntu 10.10 at least, gesture recognition will be handled server-side.