~kennylevinsen/greetd#38: 
Since upgrading to version 0.9.0 I've been getting logged out after Dota2 crashes

Hi there,

I'm using greetd (0.9.0) along with gtkgreet (0.7). After upgrading greetd from version 0.8.0 to 0.9.0, I have experienced logouts when Dota2 crashes directly after a game, with the whole user session terminating. This has happened 3 times now.

Sometimes after a new update of Dota2 it will tend to crash until there is a bugfix in the game or the graphics drivers. This is okay, but having my whole user session get nuked along with it is surprising.

After playing a game I'll suddenly hear my PC fans speed up, and then I am back in the gtkgreet greeter screen and my user session is gone. My PC does not shut down. And if I log in again it is a brand new session with none of my applications running.

There is no coredump, but the dmesg output (attached) might be useful.

Additional info:

OS: Arch Linux
Kernel: 6.1.7-arch1-1
CPU: Intel i5-4460 (4) @ 3.400GHz
GPU: AMD RX580

Apologies if I haven't formatted any of this correctly. I don't have a sr.ht account, and I'm not sure how this email gets formatted in the TODO item in your repo.

Regards, Stephan Snyman

Status: REPORTED
Submitter: Stephan Snyman
Assigned to: No-one
Submitted: 1 year, 11 months ago
Updated: 1 year, 10 months ago
Labels: No labels applied.

~kennylevinsen 1 year, 11 months ago

Once your session is active, greetd only waits for it to end and does not interact with anything. If you see gtkgreet again, while it could mean that greetd got restarted, it is more likely to mean that your display server crashed and greetd just did its job by letting you log back in.

journalctl should have some info about what happened and what, if anything, crashed.

Stephan Snyman 1 year, 11 months ago

> it is more likely to mean that your display server crashed and greetd just did its job by letting you log back in

That does make more sense.

> journalctl should have some info about what happened and what, if anything, crashed.

Digging through journalctl, below are the logs from the GPU error up until the close of the user session. I don't see anything about sway (which is what I'm using) crashing. Just the GPU itself resetting. I don't know a lot about low-level stuff, but you are right, it looks like this is not a greetd issue, so I guess you can close the issue. Maybe this is just a bad bug in Dota2 or maybe in mesa. Thanks for your assistance and for the free software that you provide. I appreciate it. :)

Jan 23 21:28:00 rooiratel-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] ERROR ring gfx timeout, signaled seq=1365332, emitted seq=1365334
Jan 23 21:28:00 rooiratel-pc kernel: [drm:amdgpu_job_timedout [amdgpu]] ERROR Process information: process dota2 pid 17636 thread VKRenderThread pid 17890
Jan 23 21:28:00 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset begin!
Jan 23 21:28:00 rooiratel-pc kernel: amdgpu: cp is busy, skip halt cp
Jan 23 21:28:00 rooiratel-pc kernel: amdgpu: rlc is busy, skip halt rlc
Jan 23 21:28:00 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: BACO reset
Jan 23 21:28:01 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset succeeded, trying to resume
Jan 23 21:28:01 rooiratel-pc kernel: [drm] PCIE GART of 256M enabled (table at 0x000000F400500000).
Jan 23 21:28:01 rooiratel-pc kernel: [drm] VRAM is lost due to GPU reset!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] UVD and UVD ENC initialized successfully.
Jan 23 21:28:01 rooiratel-pc kernel: [drm] VCE initialized successfully.
Jan 23 21:28:01 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow start
Jan 23 21:28:01 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: recover vram bo from shadow done
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: amdgpu 0000:01:00.0: amdgpu: GPU reset(2) succeeded!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc kernel: [drm] Skip scheduling IBs!
Jan 23 21:28:01 rooiratel-pc greetd[551]: pam_unix(greetd:session): session closed for user rooiratel
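For anyone reading a similar log, the repeated "Skip scheduling IBs!" lines tend to bury the few lines that matter. A minimal sketch of a filter that keeps only the hang, the reset, and the session close is shown here; the function name gpu_reset_summary and the choice of patterns are assumptions made for this example, not something from greetd, sway, or amdgpu.

```shell
#!/bin/sh
# Hypothetical helper (name and patterns are assumptions for illustration):
# reads journal text on stdin and keeps only the lines that mark the GPU job
# timeout, the reset itself, the VRAM loss, and the end of the login session.
gpu_reset_summary() {
    grep -E 'amdgpu_job_timedout|GPU reset|VRAM is lost|session closed'
}

# Typical use, assuming journalctl is available and the event happened
# during the current boot:
#   journalctl -b | gpu_reset_summary
```

The filter only narrows the log down; it does not say which process ended the session.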

~kennylevinsen 1 year, 11 months ago

sway is most likely dying from the GPU reset. Make sure you're running sway 1.8, or try master of sway and wlroots.

If you want to collect sway's output, start sway as systemd-cat --identifier sway -- sway. This is best done by starting a wrapper script instead of starting sway directly - e.g., make /usr/local/bin/sway-run, make it executable and put this in it:

#!/bin/sh
# any environment variables you'd like
exec systemd-cat --identifier=sway -- sway "$@"

Then you will be able to read the output with journalctl -t sway.
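One detail in a wrapper script like this is how arguments are forwarded to sway. The sketch below, using hypothetical helper functions that are not part of sway or greetd, shows the difference between forwarding with an unquoted $@ and a quoted "$@": the quoted form keeps an argument containing spaces as a single word.

```shell
#!/bin/sh
# Hypothetical helper: prints how many arguments it received.
count_args() {
    echo "$#"
}

# Unquoted: the shell re-splits each argument on whitespace.
forward_unquoted() {
    count_args $@
}

# Quoted: each original argument is passed through as one word.
forward_quoted() {
    count_args "$@"
}

forward_unquoted "--config" "/tmp/my sway/config"   # prints 3
forward_quoted   "--config" "/tmp/my sway/config"   # prints 2
```

For a wrapper that is only ever started without arguments the two forms behave the same, so this matters mainly if options or paths are passed through it.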

Stephan Snyman 1 year, 11 months ago

Thanks, I'll give that a try sometime later in the week. Hopefully I can find something that I can submit as a bug report to sway.

Stephan Snyman 1 year, 11 months ago

I have set up my sway startup like you mentioned above, and can successfully read the output of journalctl -t sway.

But now of course dota hasn't crashed since. I haven't updated any of my system packages yet, and I don't think there has been a dota update. But at least I am ready now in case it does happen again.

Stephan Snyman 1 year, 10 months ago

Okay it finally happened again. On the 25th of February.

I have attached a file with all the output of journalctl -t sway since I made the changes you suggested.

Doesn't seem like it gives any more info than before.

And looking at line 87 of sway's common/ipc-client.c doesn't give me any more info that I can do something about.


struct ipc_response *ipc_recv_response(int socketfd) {
    char data[IPC_HEADER_SIZE];

    size_t total = 0;
    while (total < IPC_HEADER_SIZE) {
        ssize_t received = recv(socketfd, data + total, IPC_HEADER_SIZE - total, 0);
        if (received <= 0) {
            sway_abort("Unable to receive IPC response");  // line 87
        }
        total += received;
    }

~kennylevinsen 1 year, 10 months ago

There is no attachment, so I cannot see what you are referring to.

> And looking at line 87 of sway's common/ipc-client.c

IPC errors are uninteresting, as they are just clients that fail because sway died. Note that the output from sway will also contain the output of any application it started.

Stephan Snyman 1 year, 10 months ago

I just double checked my sent folder, and the last email I sent in this thread definitely had a .txt file attached. I guess sr.ht can't handle attachments in emails. I would have thought that it would either automatically copy it to something like paste.sr.ht, or at least be there in a normal email client.

Anyway here is a link to the contents of the file: https://pastebin.com/Vqp3FL4u

That's everything in the logs since I made the changes to my sway startup, until the 25th Feb.

It's just the boring IPC errors like you mentioned. So now I'm not sure where to look.

~kennylevinsen 1 year, 10 months ago

Adding -d would make sway enable debug output, which should make logs a bit more interesting.

Assuming there is no report of a greetd crash (see journalctl -u greetd - it would mention a panic in that case), then this is a sway or amdgpu issue and I'd recommend asking in #sway on IRC.
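A minimal sketch of that greetd check follows. The helper name mentions_panic is an assumption for this example, and the sketch only looks for the word "panic" in whatever text it is given; it does not interpret the log beyond that.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the text on stdin
# mentions a panic, case-insensitively; fails otherwise.
mentions_panic() {
    grep -qi 'panic'
}

# Typical use, assuming greetd runs as the systemd unit "greetd":
#   if journalctl -u greetd -b | mentions_panic; then
#       echo "greetd reported a panic"
#   else
#       echo "no greetd panic in this boot's log"
#   fi
```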

Stephan Snyman 1 year, 10 months ago

Okay I added -d and the debug logs are working.

The crash happened again today. I have put the debug logs for today on a pastebin site since last time the bug tracker deleted my email attachments.

Logs: https://pastebin.com/EukN02kp
