I tried updating my toy webcam program trideo to the latest gio because I've been running it successfully with the compute renderer for a while now, and I wanted to see the effect of recent changes.
It now hangs my GPU. This was on a Fedora 33 GNOME Desktop running Wayland, but trideo itself was running as an XWayland application.
This is the commit that broke it.
I was able to capture the GPU hang info in case it's useful. You can find that here.
I was able to replicate this on different Intel hardware. This GPU hang comes from an Arch Linux system running GNOME Wayland (trideo is still running under XWayland here).
The linked commit doesn't seem relevant. Did you mean to link to a Gio commit?
No, I was trying to show the version change from working to not in my go.mod. I haven't bisected this (I have to reboot each time this happens).
Whoops! https://git.sr.ht/~whereswaldon/trideo/commit/a885028bd64274f725e14ee1adc1d0fb6ba12097 is what I meant, sorry. I accidentally copied the parent commit.
Can you please try the potential fix described in https://todo.sr.ht/~eliasnaur/gio/214#event-90218? Thanks.
trideo on the latest gio commit did something interesting with the compute renderer. The GPU did hang, but the Gio program kept right on running and rendering frames. It didn't get a GL error and crash or anything.
My GPU hang (from
[Tue Jul 20 07:48:55 2021] i915 0000:00:02.0: [drm] Resetting rcs0 for preemption time out
[Tue Jul 20 07:48:55 2021] i915 0000:00:02.0: [drm] trideo context reset due to GPU hang
[Tue Jul 20 07:48:55 2021] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85ddfffa, in trideo
[Tue Jul 20 07:49:26 2021] i915 0000:00:02.0: [drm] Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.
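For anyone comparing hangs across machines, here is a minimal Go sketch that pulls the error code and process name out of a `GPU HANG` line like the one in the log above. The regular expression is inferred only from that single sample, not from any i915 documentation, and `parseHang` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"regexp"
)

// hangRe matches the i915 "GPU HANG" kernel log line quoted in this thread.
// The pattern is an assumption based on that one sample line.
var hangRe = regexp.MustCompile(`GPU HANG: ecode ([^,\s]+), in (\S+)`)

// parseHang returns the error code and process name from a dmesg line.
// ok is false when the line is not a GPU hang report.
func parseHang(line string) (ecode, process string, ok bool) {
	m := hangRe.FindStringSubmatch(line)
	if m == nil {
		return "", "", false
	}
	return m[1], m[2], true
}

func main() {
	line := "[Tue Jul 20 07:48:55 2021] i915 0000:00:02.0: [drm] GPU HANG: ecode 9:1:85ddfffa, in trideo"
	if ecode, proc, ok := parseHang(line); ok {
		fmt.Printf("hang in %s, ecode %s\n", proc, ecode)
	}
}
```

Feeding it each line of `dmesg` output would make it easy to check whether two machines report the same ecode.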
I'm going to try the kitchen on this machine next, in case it's application-dependent, but I don't want to lose this partially-written comment in a reboot so I'm posting it now.
Okay, the kitchen is running fine on that latest commit. Something trideo is doing is causing this hang. Trideo is very simple. It draws a bunch of triangles every frame on top of a solid black background. It basically invokes this function hundreds of times each frame. Can you try it on a Linux box with a webcam and see if you can replicate the hang? You may need to adjust the hardcoded webcam device path to the proper one for your local system.
Also, I tried the example program for the chat library ~jackmordaunt and I are building, and it seems to exhaust the material atlas?
git clone https://git.sr.ht/~gioverse/chat
cd chat
go get gioui.org@latest
GIORENDERER=forcecompute go run ./example/kitchen
I get a few frames, then the program crashes with:
error: premature window close: compute: no space left in material atlas
exit status 1
Now this could totally be an application error on our part. Perhaps the images we're loading are being duplicated in the texture cache or something without our knowledge. This does run on the old renderer though, so it seems on the surface like a difference in behavior.
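To illustrate the duplication theory, here is a minimal Go sketch of a fixed-size atlas that packs rectangles in rows. It is not Gio's material atlas: the `atlas` type, the 1024x1024 size, the 256x256 image and the `avatar.png` key are all hypothetical. It only shows how re-adding the same image every frame can exhaust a fixed atlas that a keyed cache would never fill.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoSpace = errors.New("no space left in atlas")

// atlas is a hypothetical fixed-size texture atlas that packs
// rectangles left to right in rows. It is NOT Gio's implementation.
type atlas struct {
	w, h  int             // total atlas size in pixels
	x, y  int             // position of the next free slot
	rowH  int             // height of the tallest item in the current row
	cache map[string]bool // keys already placed; nil disables caching
}

// add reserves space for an image, or returns errNoSpace.
func (a *atlas) add(key string, w, h int) error {
	if a.cache != nil && a.cache[key] {
		return nil // already resident, no new space needed
	}
	if w > a.w {
		return errNoSpace
	}
	if a.x+w > a.w { // current row is full, start a new one
		a.x, a.y, a.rowH = 0, a.y+a.rowH, 0
	}
	if a.y+h > a.h {
		return errNoSpace
	}
	a.x += w
	if h > a.rowH {
		a.rowH = h
	}
	if a.cache != nil {
		a.cache[key] = true
	}
	return nil
}

// framesUntilFull adds the same image once per frame and returns the
// frame on which the atlas ran out of space, or -1 if it never did.
func framesUntilFull(a *atlas, frames int) int {
	for f := 1; f <= frames; f++ {
		if err := a.add("avatar.png", 256, 256); err != nil {
			return f
		}
	}
	return -1
}

func main() {
	duplicating := &atlas{w: 1024, h: 1024}
	cached := &atlas{w: 1024, h: 1024, cache: map[string]bool{}}
	fmt.Println(framesUntilFull(duplicating, 100))
	fmt.Println(framesUntilFull(cached, 100))
}
```

With these made-up sizes, sixteen copies fit, so the duplicating atlas fails on frame 17 while the cached one never fills. That would match the "a few frames, then crash" pattern, though it says nothing about where the duplication would actually happen.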
On a bright note, I am able to run sprig on the compute renderer, so it's definitely working with a non-trivial application. The main branch isn't on the right gio commit, but the
Thanks. Would it be possible to reduce trideo so that it doesn't require a webcam? I don't have an external webcam, unfortunately.
I have reduced trideo to just operate on a jpeg image provided as argv. You can find that version on this branch.
It does reproduce the GPU hang for me in this form.
Thank you for the simple reproducer. Unfortunately it doesn't hang for me. I'm running Sway on Fedora 34. Is there a chance you can reproduce the hang on F34?
I can try to do that tomorrow. I got it in GNOME Wayland on Arch. Can you try GNOME on F34 while I work to get an F34 system? I no longer have one since leaving my last job.
Well, uh, this is awkward. I can no longer reproduce either. After updating to the latest code, it works really well. It doesn't hang at all. I'm going to keep experimenting with it for a bit, but we might be able to close this. Trideo's performance is really excellent now! I can run it with absurd numbers of triangles (2000 per frame) and it keeps up. Gio used to be the bottleneck, but now it's the CPU-bound image processing instead. Well done!
Turns out I'm the idiot: by adding "GIORENDERER=forcecompute" (and spelling it correctly) I can easily reproduce the hang. Sorry for putting you on a wild goose chase.
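Since a misspelled or missing variable fails silently, a reproducer could guard against it with a check like this Go sketch. `checkRenderer` is a hypothetical helper, and `forcecompute` is the only `GIORENDERER` value taken from this thread. The sketch makes no claim about what Gio does with other values.

```go
package main

import (
	"fmt"
	"os"
)

// wantRenderer is the only GIORENDERER value mentioned in this thread.
const wantRenderer = "forcecompute"

// checkRenderer reports whether value forces the compute renderer.
// When it does not, hint explains what looks wrong.
func checkRenderer(value string, set bool) (ok bool, hint string) {
	switch {
	case !set:
		return false, "GIORENDERER is not set; the compute renderer is not being forced"
	case value != wantRenderer:
		return false, fmt.Sprintf("GIORENDERER=%q is not %q; check the spelling", value, wantRenderer)
	}
	return true, ""
}

func main() {
	value, set := os.LookupEnv("GIORENDERER")
	if ok, hint := checkRenderer(value, set); !ok {
		fmt.Fprintln(os.Stderr, "warning:", hint)
	}
}
```

Calling this at the top of the reduced trideo `main` would print a warning before any frames are drawn if the run is not actually exercising the compute renderer.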
FWIW, there has been no change to the compute shaders from 4f40b58e0d14..8cec7e04eb71 so if the issue is gone at your end it probably means caching is somehow hiding the issue.
Hm, I spoke too soon. The https://gioui.org/commit/b87cbc04f37453a064201a8590b0a23a169cf3f5 change fixes the hang for me, which indicates that the command stream to the compute programs before the change was sometimes incorrect. This is quite likely, so I'm going to close this issue for now and concentrate on #214 (NVIDIA), #221 (font glitches) and issues on my own Pixel 1 phone.
Chris, I've pushed a set of commits that lifts the restrictions your chat kitchen example ran into. The example now runs without issues on my machine.
Yeah, I'm no longer able to hang the chat example or trideo with the compute renderer. Thanks!