Opening up the stats page on OBS (Docks on top left > Stats) with the dmabuf source enabled and visible, my "average time to render frame" seems to be relatively high, at ~6ms.
One question: is this capture method supposed to be more efficient than pipewire? I have an Nvidia GPU if that changes anything. If I instead use the xdg-desktop-portal-hyprland / pipewire capture method, the avg time is about 0.8ms, a considerable decrease.
Is this just a flaw of the wlrobs plugin? Or is there something wrong? On my laptop the avg time with dmabuf is about 16ms which causes frameskips at some points.
How could I help you diagnose this issue? I'd like to use your plugin for screen capture as I have an issue with pipewire screen cap where it's not exactly 60fps. More info here: https://github.com/hyprwm/xdg-desktop-portal-hyprland/issues/201
Hmmmmm, interesting. So this is happening for me on sway at 1080p too...I have not actually noticed this or been told about it before. I'm doing some digging into this.
Ok...so I did some code profiling and basically ALL of the time is in compositor land. I'm not entirely sure why or if there's anything I can actually do to fix this. Basically, from the time I call `wl_display_roundtrip` to jump into compositor land to the time my `_frame` callback gets run by the compositor, the ENTIRE frame capture time reported by OBS has elapsed. Everything else is basically <1ms. Unless I'm using the dmabuf protocol incorrectly I don't actually know how to fix this. I even restructured my frame caps to not do a `calloc` every frame, because that was kinda a dumb design to start with, but there was no improvement.
As far as pipewire cap goes, I'm not familiar enough with the architecture, but I have a suspicion that, unlike my code, it isn't waiting for a new frame if one isn't available, and so instead of hanging the OBS rendering pipeline it just resubmits the currently available frame. But this is a wild outlandish guess of mine that could be very wrong, I really just don't know. If I'm right, that might be why it's rendering with 0.8ms but still lagging for you.
One additional interesting behavior to note is that the render time seems to change at random between runs but is always consistent within a run. Like if I start OBS sometimes it's 7.7ms, other times it's 13ms. The time is different on each launch yet stays the same until I restart OBS. I'm beginning to wonder if what's happening is OBS is asking for a frame every 16.67ms and the compositor is waiting until vsync to deliver a frame, and so the frame is always some number of ms offset from whenever OBS starts asking for frames? The fact that dropping the FPS to 30 in OBS makes the frame render time <1ms supports this hypothesis. As does the fact that attempting to capture at >60FPS causes constant frame drops with an exact 16.7ms frame draw time. I wish I had a 120Hz monitor, but my suspicion at this point is the compositor vsync is eating the entire frame render time and high render times are just a side effect of that.
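A rough way to picture the phase-offset idea above is a toy model, not a description of how OBS or any compositor actually schedules work. It assumes the compositor only hands out a frame on a vblank boundary and that OBS asks at a fixed period starting at some arbitrary phase. The function names and the 9.0ms / 3.667ms start phases are made up for illustration. Under those assumptions, a request period equal to the refresh period gives a wait that differs between launches but never changes within one. The model does not explain the <1ms reading at 30 FPS.

```python
# Toy model only: every name and number here is an illustrative assumption.
VBLANK_US = 16_667  # one 60 Hz refresh period, in microseconds


def wait_until_next_vblank(request_time_us: int, vblank_us: int = VBLANK_US) -> int:
    """Time a request made at request_time_us blocks until the next vblank."""
    return (-request_time_us) % vblank_us


def simulate(start_phase_us: int, request_period_us: int, frames: int = 5) -> list[int]:
    """Waits seen by a client that requests a frame every request_period_us."""
    return [
        wait_until_next_vblank(start_phase_us + i * request_period_us)
        for i in range(frames)
    ]


if __name__ == "__main__":
    # Request period equals the refresh period: the wait is fixed by the start phase.
    print(simulate(start_phase_us=9_000, request_period_us=VBLANK_US))
    print(simulate(start_phase_us=3_667, request_period_us=VBLANK_US))
    # Roughly 62 requests per second against 60 Hz: the wait drifts frame to frame.
    print(simulate(start_phase_us=9_000, request_period_us=16_129))
```

With the two made-up start phases the constant waits come out to 7.667ms and 13.0ms, which is the kind of "random per launch, stable per run" pattern described above.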
I have a 170Hz monitor, I could test for you if you'd like (if I know how lol). This entire time I've been capturing at 60fps. Would you like me to try capturing at 170?
Let me know what info you need and how you want me to capture it. I'll try my best to help you out 👍
Edit: about the pipewire capture then, I don't think it would be possible for it to send a duplicate frame due to the fact my refresh rate is so high. I know this isn't related to your plugin, but does this seem correct?
Another edit lol: Recording with the obs-vkcapture plugin is highly efficient as well, and seems to give me silky smooth 60fps capture of my games with a render time of <1ms. How would this be possible?
hmmmm, so just to be clear your frame time is around 6ms at 60FPS on a 170Hz display? That's...very interesting and kinda wrecks my hypothesis. What happens if you try to capture at >60FPS? For me, if I set OBS to integer FPS mode and capture at even 62 FPS, I start dropping frames with a perfect 16.7ms frame draw time (the time of a 60Hz refresh). 60 gives me the frame time that caused you to open this issue. 50 FPS (PAL) gave <1ms. It's very interesting how this only starts to happen around my refresh rate...but you have a different one. Rambling aside:
- Can you try capturing at 62FPS(since that's the FPS I decided to try with) and let me know what you get?
- Can you try capturing at a significantly higher FPS? If you want to try 170 you CAN...but I don't want to go so high that we start hitting other unknown performance issues. Maybe like 75? If that works fine, see how high you can go before you drop frames
- What is your laptop's refresh rate?
- If needed I can push a branch with ms timing prints which was the rough way I found where the slowness was and you can build/run that but let's start with the above for now
> so just to be clear your frame time is around 6ms at 60FPS on a 170hz display?
Yep.
> Can you try capturing at 62FPS(since that's the FPS I decided to try with) and let me know what you get?
Sure. Recording at 62fps, the time is still hovering around 5-7ms. Didn't seem to make much of a difference, maybe slightly lower, can't tell.
> Can you try capturing at a significantly higher FPS...
Recording at 75fps, the time is again about 5-7ms, so it doesn't seem to have much of an effect on my system. Testing at 170: OBS didn't allow me to set 170 as an integer value, so I set it to fractional and did 170/1. Again, the average time stays the same, however now it's causing relatively frequent dropped frames. This is probably because 170fps equates to a ~5.88ms frametime and my average time frequently goes over that, to about 6.5ms.
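The frame-budget arithmetic in the 170fps test can be checked with a couple of lines. This is only a sketch; `frame_budget_ms` and `fits_in_budget` are hypothetical helper names, not anything from OBS or wlrobs.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame at the given output rate."""
    return 1000.0 / fps


def fits_in_budget(render_ms: float, fps: float) -> bool:
    """True if a frame that takes render_ms fits inside one output interval."""
    return render_ms <= frame_budget_ms(fps)


if __name__ == "__main__":
    for fps in (60, 62, 75, 170):
        print(fps, round(frame_budget_ms(fps), 2), fits_in_budget(6.5, fps))
```

A 6.5ms render fits comfortably at 60, 62 and 75fps but not at 170fps, where the budget is about 5.88ms. That is consistent with drops only showing up in the 170fps test.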
> What is your laptop's refresh rate?
My laptop refresh rate is 60Hz. So I'm in the same boat as you with that: frame drops at around 60fps and over.
One interesting thing to note is that for me, when I have the preview on and the properties window open, the average time to render is about double, at 10-12ms or so. Does this indicate it's not a vsync issue?
> Sure. recording at 62fps, the time is still hovering around 5-7ms. Didn't seem to make much of a difference, maybe slightly lower, can't tell.
hmmmm...that makes the vsync theory sound plausible again lol. At least as far as not being able to capture at more than the refresh rate. My system is no slouch, so the differing behaviors at 62FPS don't make a lot of sense otherwise. I do wonder why your frame time is so high though. My frame time was quite low at 50FPS and below, so if it were all vsync then your frame time shouldn't be high, given your obscene refresh rate.
> Recording at 75fps and again time is about 5-7ms, doesn't seem to be having much of an effect on my system. Testing at 170, OBS didn't allow me to set 170 in integer value, so I set it to fractional and did 170/1. Again, average time stays the same, however now its causing relatively frequent dropped frames. This is probably because 170fps will equate to ~ 5.88ms frametime and my average time goes over that frequently to about 6.5ms.
I'm thinking I'll push a branch with timing prints just to see if my theory holds about all the time being stuck in compositor land.
> One interesting thing to note is for me when I have the preview on and the properties window open, the average time to render is about double, at 10-12ms or so. Does this indicate its not a Vsync issue?
I now really wish I had opened the properties window during my testing last night. I did not, and unfortunately I can't test this thoroughly while at work, but it's something I will check tonight.
> your frame time shouldn't be high due to your obscene refresh rate
hahaha, nowadays 170Hz isn't too obscene, obscene to me is those 360Hz, 500Hz, etc monitors!
> I'll push a branch with timing prints
Sounds good to me, take your time, no need to rush :)
> unfortunately I can't test this thoroughly while at work
I wouldn't expect you to test this let alone reply to my comments at work, so thanks for that. Again, please take your time. This isn't urgent at all.
Can you try the `dmabuf_perf_test` bookmark? You can clone it with `hg clone -r dmabuf_perf_test https://hg.sr.ht/~scoopta/wlrobs` or, if you already have a local copy of the repo, `hg pull -ur dmabuf_perf_test`. Please run obs from a terminal; you will get A LOT of prints (several per frame). That being said, even running this I can capture at 60fps easily, so please try that. You don't need to provide all the prints, just a handful is adequate. On my local machine all the time is between `wl_rt` and `START _frame`, which is compositor time I don't know how to reduce. The prints are in (ms * 10), so a print of 150 is 15.0ms.
Hey there - here are my results. Just a small snippet:
START _frame 54 END _frame 54 object 54 ready 56
wl_rt 0 START _frame 48 END _frame 48 object 48 ready 50
wl_rt 0 START _frame 37 END _frame 37 object 37 ready 39
wl_rt 0 START _frame 72 END _frame 72 object 72 ready 74
wl_rt 0 START _frame 24 END _frame 24 object 24 ready 26
wl_rt 0 START _frame 47 END _frame 47 object 47 ready 49
wl_rt 0 START _frame 33 END _frame 33 object 33 ready 35
wl_rt 0 START _frame 59 END _frame 59 object 59 ready 61
wl_rt 0 START _frame 20 END _frame 20 object 20 ready 22
The wait seems to be about 2-7ms or so. This is at 60fps btw, on Hyprland. But yes, you are right, the rest of the pipeline is very short, only a few small fractions of a millisecond.
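For anyone wanting to summarise a longer capture of these prints, here is a small sketch that pulls the numbers out of the snippet above. It assumes, from the (ms * 10) note and from `wl_rt` always printing 0, that each value is tenths of a millisecond elapsed since the `wl_rt` print; that reading is an inference, not something the branch documents. `SAMPLE`, `start_frame_delays_ms` and `callback_ticks` are hypothetical names.

```python
import re

# The prints posted above, copied as-is (layout does not matter to the regexes).
SAMPLE = (
    "START _frame 54 END _frame 54 object 54 ready 56 "
    "wl_rt 0 START _frame 48 END _frame 48 object 48 ready 50 "
    "wl_rt 0 START _frame 37 END _frame 37 object 37 ready 39 "
    "wl_rt 0 START _frame 72 END _frame 72 object 72 ready 74 "
    "wl_rt 0 START _frame 24 END _frame 24 object 24 ready 26 "
    "wl_rt 0 START _frame 47 END _frame 47 object 47 ready 49 "
    "wl_rt 0 START _frame 33 END _frame 33 object 33 ready 35 "
    "wl_rt 0 START _frame 59 END _frame 59 object 59 ready 61 "
    "wl_rt 0 START _frame 20 END _frame 20 object 20 ready 22"
)


def start_frame_delays_ms(log: str) -> list[float]:
    """Delay before the callback starts, assuming values are ms * 10."""
    return [int(v) / 10 for v in re.findall(r"START _frame\s+(\d+)", log)]


def callback_ticks(log: str) -> list[int]:
    """Ticks (tenths of a ms) between 'START _frame' and 'ready' for each frame."""
    pattern = r"START _frame\s+(\d+)\s+END _frame\s+\d+\s+object\s+\d+\s+ready\s+(\d+)"
    return [int(ready) - int(start) for start, ready in re.findall(pattern, log)]


if __name__ == "__main__":
    delays = start_frame_delays_ms(SAMPLE)
    print("wait before callback (ms):", min(delays), "to", max(delays))
    print("callback work (ticks of 0.1 ms):", callback_ticks(SAMPLE))
```

On this snippet the wait before the callback ranges from 2.0ms to 7.2ms, while the work between `START _frame` and `ready` is 2 ticks (0.2ms) for every frame. That matches the reading that nearly all of the time passes before the callback runs.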
yeah, your results are the same as mine. The wl_rt print is right before I hand off to the compositor to do the capture and then START _frame is printed AS SOON as my callback gets run, literally before any other code is run...and that's where all the time is. I'll double check my usage of the dmabuf protocol and make sure I'm not missing something or doing something dumb but it doesn't appear like my code is being slow.