Yes, I just saw that. It's the same issue that we had with Xephyr, where it would hang with a black window because the loop gets stuck in Ppoll without seeing the first MapNotify and Expose events. We need to call w.h.handleEvents() at the beginning of the event loop as well, or before it.
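A minimal sketch of that ordering issue, not the actual Gio code: if the first events were already read into the client-side queue, a loop that polls the file descriptor first has nothing to wake it up. The eventSource interface and the fakeConn type below are hypothetical stand-ins for the X connection; Pending plays the role of XPending and Wait the role of Ppoll.

```go
package main

import "fmt"

// eventSource is a hypothetical stand-in for the X connection.
type eventSource interface {
	Pending() int // events already buffered client-side (like XPending)
	Next() string // pops one buffered event (like XNextEvent)
	Wait() bool   // blocks until the fd is readable; false means closed (like Ppoll)
}

// handleEvents drains everything that is already buffered.
func handleEvents(src eventSource, handle func(string)) {
	for src.Pending() > 0 {
		handle(src.Next())
	}
}

// loop drains buffered events BEFORE blocking, so events queued before the
// loop started (e.g. the first MapNotify and Expose) are not missed.
func loop(src eventSource, handle func(string)) {
	for {
		handleEvents(src, handle)
		if !src.Wait() {
			return
		}
	}
}

// fakeConn simulates a connection whose first events were buffered before
// the loop started and on which nothing new ever arrives.
type fakeConn struct {
	buffered []string
}

func (c *fakeConn) Pending() int { return len(c.buffered) }

func (c *fakeConn) Next() string {
	ev := c.buffered[0]
	c.buffered = c.buffered[1:]
	return ev
}

// Wait returns false right away so the example terminates; a real poll
// would block here indefinitely.
func (c *fakeConn) Wait() bool { return false }

func main() {
	c := &fakeConn{buffered: []string{"MapNotify", "Expose"}}
	loop(c, func(ev string) { fmt.Println("handled", ev) })
}
```

Swapping the two calls inside loop (Wait first, then handleEvents) would leave both buffered events unhandled with this fake connection, which is the black-window hang in miniature.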
Slight update to the logic: the frame rate is much less erratic (still without actual WM sync). There are also many more code comments for those interested: https://github.com/db47h/gio/tree/x11_experimental
But it looks like I have killed the tot counter when profiling...
As part of implementing _NET_WM_FRAME_DRAWN support, I've rewritten the X11 event handling loop to match Wayland's:
_NET_WM_FRAME_DRAWN support is not in yet; it feels laggier than the current implementation on a composited window manager (this is expected). As it is, it will however be the default on WMs not supporting
Could you guys give it a try and see how it affects all of you? VSync is still on by default, I'd suggest also disabling it and monitoring CPU usage while interacting with the app (in
BTW, on a 60 Hz display, draw times of about 16ms (1/60s) are what we are looking for. Anything less and we're drawing more frames than the display can actually render (and wasting energy), anything higher and we're skipping frames (jitter).
Thanks for your input ~kaey. Fullscreen applications (like shooters) have the easy option to bypass the compositor by setting a single window property, thus reducing lag even with vsync, which can be further reduced by using XCB where available (everywhere these days?) instead of Xlib. Windowed applications are another story, and unfortunately disabling vsync in Gio is not an option (think CPU usage & battery drain on laptops).
Thanks for checking Alessandro. What's interesting is that Cinnamon supports
_NET_WM_FRAME_TIMINGS! The first one is what we need to sync drawing with the WM. So I read the docs again and it turns out that it's not mandatory for the WM to advertise _NET_WM_FRAME_DRAWN. There's still hope for Cinnamon :)
OpenBox... I did test on OpenBox 3.6.1 on Ubuntu 18.04 with the NVIDIA drivers, and I get a solid ~16.5 ms per frame, with the occasional glitch (short enough that it's impossible to read the actual frame time). There's definitely input lag, but not worse than with Gnome Shell.
I wonder what kind of frame rate glxgears reports (I think it's in the mesa-demos package on Arch). You might want to try both windowed and fullscreen.
~eliasnaur, you said you can reproduce this, is it with an Intel GPU as well?
The major difference between Gio and Shiny is that Gio uses Xlib's C API while Shiny uses a pure Go implementation of the X protocol modeled on the newer XCB library (https://github.com/BurntSushi/xgb and https://github.com/BurntSushi/xgbutil). XCB has much better multithreading and lower latency overall (even for pure C applications) and, to top it off, the Go implementation doesn't have a single bit of C code. That's very likely why you observe less lag in Shiny.
After digging a bit more on the vsync side of things, it appears that ~eliasnaur's suggestion that the compositor and the app are somehow fighting over vsync is correct. Mainstream compositing window managers like Gnome-Shell and KWin support frame synchronization via the _NET_WM_SYNC_REQUEST and the very undocumented _NET_WM_FRAME_DRAWN EWMH hints. I intend to try and add support for these as this should help reduce lag on these WMs (i.e. most Linux users? here's hoping ;)
On Cinnamon, I don't know if that will help. There's a bug report (https://github.com/linuxmint/cinnamon/issues/8665) saying that _NET_WM_SYNC_REQUEST is not implemented, but I doubt it since Cinnamon's WM (Muffin) is a fork of Mutter, which supports it (or it's a very old fork?). A quick way to test it is to run the following command:
xprop -root | grep ^_NET_SUPPORTED
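If you would rather check for specific hints than eyeball the list, the output can be parsed with a few lines of Go. This is only a sketch: it assumes the usual xprop layout of a property name followed by "=" and a comma-separated atom list, the supportedAtoms helper is hypothetical, and the sample line is made up for illustration rather than taken from a real WM.

```go
package main

import (
	"fmt"
	"strings"
)

// supportedAtoms extracts the atom names from the _NET_SUPPORTED line of
// `xprop -root` output. It returns nil if the line is absent or malformed.
func supportedAtoms(xpropOutput string) map[string]bool {
	for _, line := range strings.Split(xpropOutput, "\n") {
		if !strings.HasPrefix(line, "_NET_SUPPORTED") {
			continue
		}
		parts := strings.SplitN(line, "=", 2)
		if len(parts) != 2 {
			return nil
		}
		atoms := make(map[string]bool)
		for _, a := range strings.Split(parts[1], ",") {
			if a = strings.TrimSpace(a); a != "" {
				atoms[a] = true
			}
		}
		return atoms
	}
	return nil
}

func main() {
	// Made-up sample output, for illustration only.
	sample := "_NET_SUPPORTED(ATOM) = _NET_WM_NAME, _NET_WM_SYNC_REQUEST, _NET_WM_FRAME_DRAWN\n"
	atoms := supportedAtoms(sample)
	names := []string{"_NET_WM_SYNC_REQUEST", "_NET_WM_FRAME_DRAWN", "_NET_WM_FRAME_TIMINGS"}
	for _, name := range names {
		fmt.Printf("%s advertised: %v\n", name, atoms[name])
	}
}
```

Keep in mind that a missing entry only means the hint is not advertised, which is not necessarily the same as not supported.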
As for OpenBox, that won't help at all because it's not compositing (it does support _NET_WM_SYNC_REQUEST, but on its own that only helps reduce flickering while resizing). And I really can't figure out what's happening there, unless you're running Compton.
The sleep time in the x11-vsync-hacks branch should be less than 16ms. Anything more and you end up skipping frames.
Another thing that could cause issues is that handleEvents() does not return while there are still X events to be processed (that's why it doesn't send draw events by itself). In the early implementations of the driver, I had countermeasures in place, but after some proper profiling, it turned out that even in a worst-case scenario it always returned after at most 10 events or so. Would your environments behave differently?