~scoopta/wlrobs#26: 
Average time to render frame much higher with dmabuf capture than pipewire / xdg capture

Opening up the stats page on OBS (Docks on top left > Stats) with the dmabuf source enabled and visible, my "average time to render frame" seems to be relatively high, at ~6ms.

One question, is this capture method supposed to be more efficient than pipewire? I have an Nvidia GPU if that changes anything. If I instead use the xdg-desktop-portal-hyprland / pipewire capture method, the avg time is about 0.8ms, a considerable decrease.

Is this just a flaw of the wlrobs plugin? Or is there something wrong? On my laptop the avg time with dmabuf is about 16ms which causes frameskips at some points.

How could I help you diagnose this issue? I'd like to use your plugin for screen capture as I have an issue with pipewire screen cap where it's not exactly 60fps. More info here: https://github.com/hyprwm/xdg-desktop-portal-hyprland/issues/201

Status
REPORTED
Submitter
~fazzi
Assigned to
No-one
Submitted
1 year, 3 days ago
Updated
11 months ago
Labels
No labels applied.

~scoopta 1 year, 2 days ago

Hmmmmm, interesting. So this is happening for me on sway at 1080p...I have not actually noticed this/been told about this before. I'm doing some digging into this

~scoopta 1 year, 2 days ago

Ok...so I did some code profiling and basically ALL of the time is in compositor land. I'm not entirely sure why or if there's anything I can actually do to fix this. Basically from the time I call wl_display_roundtrip to jump into compositor land to the time my _frame callback gets run by the compositor the ENTIRE frame capture time reported by OBS has elapsed. Everything else is basically <1ms. Unless I'm using the dmabuf protocol incorrectly I don't actually know how to fix this. I even restructured my frame caps to not do a calloc every frame because that was kinda a dumb design to start with but there was no improvement.

As far as pipewire cap goes I'm not familiar enough with the architecture but I have a suspicion that unlike my code it isn't waiting for a new frame if one isn't available and so instead of hanging the OBS rendering pipeline it just resubmits the currently available frame. But this is a wild outlandish guess of mine that could be very wrong, I really just don't know. If I'm right that might be why it's rendering with 0.8ms but still lagging for you.
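To make the guess above concrete, here is a minimal Python sketch of the two strategies being contrasted: a source that stalls the render call until a new frame arrives, and one that hands back whatever frame it last received. This is an illustration only and is not taken from PipeWire, OBS, or wlrobs; the class names and the `wait_ms` parameter are hypothetical.

```python
class BlockingSource:
    """Render stalls until the compositor delivers a new frame (assumed behaviour)."""

    def __init__(self, wait_ms):
        self.wait_ms = wait_ms  # hypothetical stand-in for time spent waiting on the compositor
        self.frame_id = 0

    def render(self):
        # Every call yields a fresh frame, but pays the full wait.
        self.frame_id += 1
        return self.frame_id, self.wait_ms


class LatestFrameSource:
    """Render never waits; it returns the most recent frame, even if already shown."""

    def __init__(self):
        self.latest = 0

    def on_new_frame(self):
        # Called by the producer whenever a new frame becomes available.
        self.latest += 1

    def render(self):
        # No stall: a frame may be resubmitted if nothing new has arrived.
        return self.latest, 0.0


blocking = BlockingSource(wait_ms=6.0)
latest = LatestFrameSource()
latest.on_new_frame()
print(blocking.render(), blocking.render())  # fresh frame each call, 6.0 ms stall each
print(latest.render(), latest.render())      # same frame twice, no stall
```

If the guess holds, the second strategy would report a low render time while still showing duplicated frames.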

~scoopta 1 year, 2 days ago

One additional interesting behavior to note is the render time seems to change at random between runs but is always consistent within a run. Like if I start OBS sometimes it's 7.7ms, other times it's 13ms. The time it takes is always different yet always consistent until I restart OBS. I'm beginning to wonder if what's happening is OBS is asking for a frame every 16.67ms and the compositor is waiting until vsync to deliver a frame and so the frame is always some number of ms offset from whenever OBS starts asking for frames? The fact that dropping the FPS to 30 in OBS makes the frame render time <1ms supports this hypothesis. So does the fact that attempting to capture at >60FPS causes constant frame drops and an exact 16.7ms frame draw time. I wish I had a 120hz monitor but my suspicion at this point is the compositor vsync is eating the entire frame render time and high render times are just a side effect of that.
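The hypothesis above can be explored with a toy model. The sketch below is Python written for illustration, not code from wlrobs or OBS. It assumes (1) a capture call blocks until the first vblank after it starts, (2) OBS starts each frame on its own tick or, if the previous frame ran late, immediately afterwards, and (3) the phase offset between OBS's ticks and the vblanks is fixed at startup. `simulate_waits` and its parameters are hypothetical names.

```python
def simulate_waits(fps, refresh_hz, phase_us, frames):
    """Return the per-frame wait (in microseconds) under the toy model.

    Assumption: vblanks occur at phase_us + n * period_us, and a capture
    started at time t completes at the first vblank strictly after t.
    """
    interval_us = round(1_000_000 / fps)        # OBS frame interval
    period_us = round(1_000_000 / refresh_hz)   # compositor refresh period
    waits = []
    prev_end = 0
    for k in range(frames):
        # Start on schedule, or right after the previous frame if it ran late.
        start = max(k * interval_us, prev_end)
        # Index of the first vblank strictly after `start`.
        n = (start - phase_us) // period_us + 1
        end = phase_us + n * period_us
        waits.append(end - start)
        prev_end = end
    return waits


# Same rate as the display: the wait equals the startup phase on every frame.
print(set(simulate_waits(60, 60, phase_us=7_700, frames=50)))
print(set(simulate_waits(60, 60, phase_us=13_000, frames=50)))
# Slightly faster than the display: the wait drifts up to a full refresh period.
print(simulate_waits(62, 60, phase_us=7_700, frames=100)[-1])
```

Under these assumptions, 60 FPS on a 60Hz display gives a wait equal to the startup phase on every frame, and 62 FPS drifts upward until every frame waits a full refresh period of about 16.7ms. The model does not reproduce the <1ms reading reported at 30 FPS, so it is at best part of the picture.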

~fazzi 1 year, 2 days ago*

I have a 170Hz monitor, I could test for you if you'd like (if I know how lol). This entire time I've been capturing at 60fps. Would you like me to try capturing at 170?

Let me know what info you need and how you want me to capture it. I'll try my best to help you out 👍

Edit: about the pipewire capture then, I don't think it would be possible for it to send a duplicate frame due to the fact my refresh rate is so high. I know this isn't related to your plugin, but does this seem correct?

Another edit lol: Recording with the obs-vkcapture plugin is highly efficient as well, and seems to give me silky smooth 60fps capture of my games with a render time of <1ms. How would this be possible?

~scoopta 1 year, 2 days ago

hmmmm, so just to be clear your frame time is around 6ms at 60FPS on a 170hz display? That's...very interesting and kinda wrecks my hypothesis. What happens if you try to capture at >60FPS? For me if I set OBS to integer FPS mode and capture at even 62 FPS I start dropping frames with a perfect 16.7ms frame draw time (the time of a 60hz refresh). 60 gives me the frame time that caused you to open this issue. 50 FPS (PAL) gave <1ms. It's very interesting how this only starts to happen around my refresh rate...but you have a different one. Rambling aside:

  • Can you try capturing at 62FPS(since that's the FPS I decided to try with) and let me know what you get?
  • Can you try capturing at a significantly higher FPS, if you want to try 170 you CAN...but I don't want to go so high that we start hitting other unknown performance issues. Maybe like 75? If that works fine see how high you can go before you drop frames
  • What is your laptop's refresh rate?
  • If needed I can push a branch with ms timing prints which was the rough way I found where the slowness was and you can build/run that but let's start with the above for now

~fazzi 1 year, 2 days ago

> so just to be clear your frame time is around 6ms at 60FPS on a 170hz display?

yep.

> Can you try capturing at 62FPS(since that's the FPS I decided to try with) and let me know what you get?

Sure. recording at 62fps, the time is still hovering around 5-7ms. Didn't seem to make much of a difference, maybe slightly lower, can't tell.

> Can you try capturing at a significantly higher FPS...

Recording at 75fps and again time is about 5-7ms, doesn't seem to be having much of an effect on my system. Testing at 170, OBS didn't allow me to set 170 as an integer value, so I set it to fractional and did 170/1. Again, average time stays the same, however now it's causing relatively frequent dropped frames. This is probably because 170fps equates to a ~5.88ms frametime and my average time frequently goes over that, to about 6.5ms.
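The frame-budget arithmetic in the paragraph above can be checked with a few lines of Python. This is only a sketch of the calculation; the helper names `frame_budget_ms` and `over_budget` are made up for illustration, and the 6.5ms figure is the render time quoted in the comment.

```python
def frame_budget_ms(fps):
    """Time available per frame, in milliseconds, at a given capture rate."""
    return 1000.0 / fps


def over_budget(render_ms, fps):
    """True if a render of render_ms cannot finish within one frame interval."""
    return render_ms > frame_budget_ms(fps)


for fps in (60, 75, 170):
    budget = frame_budget_ms(fps)
    print(f"{fps} FPS: budget {budget:.2f} ms, "
          f"6.5 ms render over budget: {over_budget(6.5, fps)}")
```

A 6.5ms render fits comfortably at 60 and 75 FPS but exceeds the budget at 170 FPS, which is consistent with drops appearing only at that setting.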

> What is your laptop's refresh rate?

My laptop refresh rate is 60Hz. So in the same boat as you with that. Frame drops around 60fps and over.

One interesting thing to note is for me when I have the preview on and the properties window open, the average time to render is about double, at 10-12ms or so. Does this indicate it's not a Vsync issue?

~scoopta 1 year, 2 days ago

> Sure. recording at 62fps, the time is still hovering around 5-7ms. Didn't seem to make much of a difference, maybe slightly lower, can't tell.

hmmmm...that makes the vsync theory sound plausible again lol. At least as far as not being able to capture >refresh rate. My system is no slouch so the differing behaviors at 62FPS don't make a lot of sense otherwise. I do wonder why your frame time is so high though. My frame time was quite low at 50FPS and below so if it were all vsync then your frame time shouldn't be high due to your obscene refresh rate.

> Recording at 75fps and again time is about 5-7ms, doesn't seem to be having much of an effect on my system. Testing at 170, OBS didn't allow me to set 170 in integer value, so I set it to fractional and did 170/1. Again, average time stays the same, however now it's causing relatively frequent dropped frames. This is probably because 170fps will equate to ~ 5.88ms frametime and my average time goes over that frequently to about 6.5ms.

I'm thinking I'll push a branch with timing prints just to see if my theory holds about all the time being stuck in compositor land.

> One interesting thing to note is for me when I have the preview on and the properties window open, the average time to render is about double, at 10-12ms or so. Does this indicate it's not a Vsync issue?

I now really wish I had opened the properties window during my testing last night. I did not and unfortunately I can't test this thoroughly while at work but it's something I will check tonight.

~fazzi 1 year, 2 days ago

> your frame time shouldn't be high due to your obscene refresh rate

hahaha, nowadays 170Hz isn't too obscene, obscene to me is those 360Hz, 500Hz, etc monitors!

> I'll push a branch with timing prints

Sounds good to me, take your time, no need to rush :)

> unfortunately I can't test this thoroughly while at work

I wouldn't expect you to test this let alone reply to my comments at work, so thanks for that. Again, please take your time. This isn't urgent at all.

~scoopta a year ago

Can you try the dmabuf_perf_test bookmark? You can clone it with hg clone -r dmabuf_perf_test https://hg.sr.ht/~scoopta/wlrobs or if you already have a local copy of the repo hg pull -ur dmabuf_perf_test. Please run obs from a terminal, you will get A LOT of prints(several per frame). That being said even running this I can capture at 60fps easily so please try that. You don't need to provide all the prints, just a handful is adequate. On my local machine all the time is between wl_rt and START _frame which is compositor time I don't know how to reduce. The prints are in (ms * 10) so a print of 150 is 15.0ms.

~fazzi 11 months ago

Hey there - here are my results. Just a small snippet:

START _frame 54
END _frame 54
object 54
ready 56
wl_rt 0
START _frame 48
END _frame 48
object 48
ready 50
wl_rt 0
START _frame 37
END _frame 37
object 37
ready 39
wl_rt 0
START _frame 72
END _frame 72
object 72
ready 74
wl_rt 0
START _frame 24
END _frame 24
object 24
ready 26
wl_rt 0
START _frame 47
END _frame 47
object 47
ready 49
wl_rt 0
START _frame 33
END _frame 33
object 33
ready 35
wl_rt 0
START _frame 59
END _frame 59
object 59
ready 61
wl_rt 0
START _frame 20
END _frame 20
object 20
ready 22

seems to be about 2-7ms or so. This is at 60fps btw on Hyprland. But yes you are right, the rest of the pipeline is very short, only a few small fractions of a millisecond.
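For anyone wanting to summarise a longer capture of these prints, here is a small Python sketch that pulls out the `START _frame` values and converts them from the (ms * 10) unit to milliseconds. It assumes each `START _frame` value is the time elapsed since the preceding `wl_rt 0` print, which is how the output above reads but is not confirmed in the thread. The function name and the `SAMPLE` constant are made up for illustration; the numbers are the ones from the snippet above.

```python
import re
from statistics import mean

# START _frame lines copied from the snippet above; other prints are ignored anyway.
SAMPLE = """\
START _frame 54
wl_rt 0
START _frame 48
wl_rt 0
START _frame 37
wl_rt 0
START _frame 72
wl_rt 0
START _frame 24
wl_rt 0
START _frame 47
wl_rt 0
START _frame 33
wl_rt 0
START _frame 59
wl_rt 0
START _frame 20
"""


def start_frame_waits_ms(log_text):
    """Extract START _frame timings and convert tenths of a ms to ms."""
    waits = []
    for line in log_text.splitlines():
        match = re.fullmatch(r"START _frame (\d+)", line.strip())
        if match:
            waits.append(int(match.group(1)) / 10.0)
    return waits


waits = start_frame_waits_ms(SAMPLE)
print(f"frames: {len(waits)}, min: {min(waits)} ms, "
      f"max: {max(waits)} ms, mean: {mean(waits):.2f} ms")
```

On this sample the waits range from 2.0ms to 7.2ms, in line with the "2-7ms" estimate.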

~scoopta 11 months ago

yeah, your results are the same as mine. The wl_rt print is right before I hand off to the compositor to do the capture and then START _frame is printed AS SOON as my callback gets run, literally before any other code is run...and that's where all the time is. I'll double check my usage of the dmabuf protocol and make sure I'm not missing something or doing something dumb but it doesn't appear like my code is being slow.
