This question might pertain only to the Raspberry Pi 400 because of its constrained hardware, but I believe the issue is software/driver based. Maybe some geniuses here have the answers?
I ask because I believe unnecessary memory copies are filling up the memory bandwidth and limiting OpenGL performance inside X11 windows.
SETUP:
RPi 400 with Debian Buster and Mesa 21.2 and vc4-kms-v3d as dtoverlay, 1080p monitor.
TEST (using glxgears):
For me the frame rate drops as I make the window larger, going from 1300 FPS at the default size down to 52 FPS when maximized. At the same time CPU usage drops to 4-6%. Shrinking the window below the default size increases both the frame rate and the CPU usage until the CPU nears 50%. Beyond this point no improvement in FPS can be seen.
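To make the test above easier to compare across window sizes, here is a small Python sketch that pulls the FPS figures out of glxgears' "N frames in S seconds = X FPS" lines and converts them to pixels drawn per second. The function names are my own. The 300x300 default window size and the assumption that a maximized window covers the full 1920x1080 are guesses on my part, and the sample text is built from the numbers above, not a real log.

```python
import re

# Matches glxgears status lines such as:
#   "6500 frames in 5.0 seconds = 1300.000 FPS"
FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_fps(output):
    """Return every FPS value found in captured glxgears output."""
    return [float(m.group(3)) for m in FPS_LINE.finditer(output)]


def pixel_throughput(width, height, fps):
    """Pixels presented per second for a window of the given size."""
    return width * height * fps


if __name__ == "__main__":
    # Synthetic sample built from the figures quoted in this post.
    sample = (
        "6500 frames in 5.0 seconds = 1300.000 FPS\n"
        "260 frames in 5.0 seconds = 52.000 FPS\n"
    )
    small_fps, large_fps = parse_fps(sample)

    # Assumed window sizes: 300x300 default, 1920x1080 when maximized.
    print(pixel_throughput(300, 300, small_fps) / 1e6, "Mpixel/s")
    print(pixel_throughput(1920, 1080, large_fps) / 1e6, "Mpixel/s")
```

If those window sizes are roughly right, both cases land near 110 Mpixel/s, which would fit a per-pixel cost (fill or copy) that dominates once the window gets large, rather than a fixed per-frame cost.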
MY CONCLUSION:
This, together with profiling by other users showing that the swap takes a lot of time on the Pi, sounds to me like a memory bandwidth limit is being hit. But since 1080p at 60 FPS equals <0.5 GB/s and memory copy tests show 4-5 GB/s for the Pi, there must be a lot of memory copies happening somewhere for each swap.
One guess I had: the kernel (5.10) uses the VC4 drivers when targeting the Raspberry Pi 400 even though it has VC 6 hardware, the VC4 drivers lack the correct OpenGL support, and this is what X uses, so there might be some bogus copying going on between the v3d driver in Mesa and the vc4 support in the kernel. But this is speculation, and I am not a C++ or Linux kernel guy.
Any suggestions on how to verify the issue, detect where the bandwidth is going, or perhaps how to work around it are super welcome!
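For reference, the arithmetic behind the <0.5 GB/s figure can be sketched in Python as follows. It assumes a 32-bit (4 bytes per pixel) framebuffer, which I have not verified for this setup, and the function names are my own. It also estimates how many full-frame copies per swap it would take to use up the measured 4-5 GB/s copy rate at the observed 52 FPS.

```python
BYTES_PER_PIXEL = 4  # assumption: 32-bit framebuffer such as XRGB8888


def frame_bytes(width, height, bytes_per_pixel=BYTES_PER_PIXEL):
    """Size of one full frame in bytes."""
    return width * height * bytes_per_pixel


def stream_rate_gb_s(width, height, fps):
    """Data rate in GB/s (10^9 bytes) if each frame is moved exactly once."""
    return frame_bytes(width, height) * fps / 1e9


def copies_per_frame_to_saturate(width, height, fps, copy_rate_gb_s):
    """Full-frame copies per swap that would consume the given copy rate."""
    return copy_rate_gb_s / stream_rate_gb_s(width, height, fps)


if __name__ == "__main__":
    # One pass over a 1080p frame at 60 FPS: just under 0.5 GB/s.
    print(stream_rate_gb_s(1920, 1080, 60))
    # Copies per swap needed to hit 4 and 5 GB/s at the observed 52 FPS.
    print(copies_per_frame_to_saturate(1920, 1080, 52, 4.0))
    print(copies_per_frame_to_saturate(1920, 1080, 52, 5.0))
```

Under these assumptions it would take roughly 9 to 12 full-frame copies per swap to hit the copy rate, so either there really are that many copies or some of them run much slower than a plain memory copy.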
Thanks,
J