Broadcom Open-Sources VideoCore IV 3D Graphics Stack


  • #61
    Originally posted by ssvb View Post
    Zero copy is always better than one. Without compositing enabled in the X11 window manager and when hardware overlays are available to be controlled by the DDX driver, this is already in use - http://ssvb.github.io/2013/02/01/new...dx-driver.html. The window decorations do not matter because they are rendered by the X server just like for any other window (DRI2 buffers are not involved). The rectangular area drawn by a GLES application is living in a hardware overlay, with scanout configured directly from the current DRI2 buffer, alternating buffers on vblank to avoid tearing. We need to do a copy to the framebuffer (or to the window backing pixmap) only when somebody really wants to read from there. http://en.wikipedia.org/wiki/Lazy_evaluation for the win
    For comparison, on Weston on rpi, you will always get as close to zero-copy EGL clients as imaginable. The only copy that might happen is done by the firmware in secret, if it deems that the element scenegraph is too complex. For a full-screen app like a game, with only few additional elements like a mouse cursor, I believe the firmware should avoid the secret copy.

    It does not matter whether Weston has to composite (show something else at the same time), or whether the GL rendered window is partially obscured or not, or how the applications are coded as long as they are native Wayland apps (no need to be rpi-specific apps). On Wayland, there is no case where we would need to do an additional lazy copy because something wants to read something.


    Btw. how do you guarantee, that the decorations and your DRI2 buffer stay in sync wrt. size while resizing the window? So that you don't accidentally show a picture where they are at disagreeing sizes?
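
For readers following along, here is a minimal sketch of the lazy-copy idea from the quoted post: the GLES window is scanned out straight from the current buffer, and a copy into the window backing pixmap happens only when something actually reads from it. This is a toy model, not the real DDX driver code; the class and method names (`OverlayWindow`, `flip_on_vblank`, `read_pixels`) are made up for illustration.

```python
class OverlayWindow:
    """Toy model of a GLES window scanned out from a hardware overlay.

    All names here are hypothetical; this is not the fbturbo DDX code.
    """

    def __init__(self):
        self.buffers = [None, None]  # two DRI2-style buffers
        self.front = 0               # index of the buffer being scanned out
        self.backing_pixmap = None   # what an X11 reader would see
        self.stale = False           # True when the pixmap lags the overlay
        self.copies = 0              # number of real copies performed

    def render(self, frame):
        # The client draws into the back buffer; nothing is copied.
        self.buffers[1 - self.front] = frame

    def flip_on_vblank(self):
        # Scanout switches to the freshly drawn buffer; still no copy.
        self.front = 1 - self.front
        self.stale = True

    def read_pixels(self):
        # Only a reader forces the copy into the backing pixmap.
        if self.stale:
            self.backing_pixmap = self.buffers[self.front]
            self.copies += 1
            self.stale = False
        return self.backing_pixmap


window = OverlayWindow()
for frame in range(100):
    window.render(frame)
    window.flip_on_vblank()
latest = window.read_pixels()  # the single copy happens here
```

In this model a hundred frames are shown with no copy at all, and the one read at the end costs one copy. The Wayland case described above would correspond to never needing `read_pixels` in the first place.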

    Comment


    • #62
      Originally posted by ssvb View Post
      Yes, with poor X server DDX drivers you can easily hit scenarios when the performance gets killed. The solution is not to use poor drivers. Admittedly this is rather difficult. Especially on ARM hardware, where certain performance killing anti-patterns are surprisingly popular
      Well, technically you are right.. but really the only good example (from performance standpoint) of DDX driver is intel SNA, and really no one but intel can afford that massive DDX driver investment. Go compare the # of LoC of SNA to intel's mesa driver some day!

      So yes, for some of the things that I am saying are not possible, what I should actually be saying is that they are not *practical*

      (Possibly once glamor is in better shape, that will be the way forward for mobile DDX drivers.. or rather, nearly *all* DDX drivers.. at least then the massive driver investment to handle all the different x11 render paths can be done once and shared across all drivers. Still doesn't help with the overlay situation, though.)

      Originally posted by ssvb View Post
      Zero copy is always better than one. Without compositing enabled in the X11 window manager and when hardware overlays are available to be controlled by the DDX driver, this is already in use - http://ssvb.github.io/2013/02/01/new...dx-driver.html. The window decorations do not matter because they are rendered by the X server just like for any other window (DRI2 buffers are not involved). The rectangular area drawn by a GLES application is living in a hardware overlay, with scanout configured directly from the current DRI2 buffer, alternating buffers on vblank to avoid tearing. We need to do a copy to the framebuffer (or to the window backing pixmap) only when somebody really wants to read from there. http://en.wikipedia.org/wiki/Lazy_evaluation for the win
      As long as the GLES applications don't rely on the EGL_NATIVE_RENDERABLE feature, everyone should be happy. There are some shortcomings in the current implementation though, but nothing really unsolvable.
      yes.. I've seen that. It is a really cute hack. But it will never be possible to make it perfect. (Moving windows around, stacking order, multiple gl apps, $random_users_favorite_windowmanager, etc.) The best you'll be able to do is gracefully fall back to a slow path.

      Weston otoh can easily do the same thing, with no hacks. And once atomic modeset is upstream in kernel, it will be able to do it pixel-perfect. This is why I'm so pro-wayland. Yes, there are things that given enough time/effort/compromise/etc can be hacked into x11. But why, when wayland lets you do it cleanly/easily?

      Originally posted by ssvb View Post
      With the x11 compositing window manager and redirected windows, everything surely gets more complicated. But in theory the overhead of dealing with window decorations should be not so dramatic as an extra buffer copy per frame, see https://github.com/ssvb/xf86-video-fbturbo/issues/3. However this has not been really implemented yet, so I could be overlooking something.
      IIRC, compiz has (or at least used to have) an option to choose window decorations in same texture vs different textures. The latter would avoid the copy. There are some artifacts w/ wobbly windows if you do this (but then you can also just disable wobbly windows). Last time I checked compiz defaulted to decorations in same texture (ie. copy).
      Last edited by robclark; 06 March 2014, 09:07 AM.

      Comment


      • #63
        Originally posted by robclark View Post
        Well, technically you are right.. but really the only good example (from performance standpoint) of DDX driver is intel SNA, and really no one but intel can afford that massive DDX driver investment. Go compare the # of LoC of SNA to intel's mesa driver some day!
        Yes, the intel SNA driver is superb.
        But, AFAIK, it's only one person in charge of it - Chris Wilson.
        Is it too naive to think that for other companies than intel this investment is too big?

        Then again, it might be hard to get developers as skilled as Chris,
        who really drives that thing with passion* trying to squeeze out every bit that's possible...
        [*] At least that's my impression following his git history.

        Comment


        • #64
          Originally posted by entropy View Post
          Yes, the intel SNA driver is superb.
          But, AFAIK, it's only one person in charge of it - Chris Wilson.
          Is it too naive to think that for other companies than intel this investment is too big?

          Then again, it might be hard to get developers as skilled as Chris,
          who really drives that thing with passion* trying to squeeze out every bit that's possible...
          [*] At least that's my impression following his git history.
          Yeah, Chris does most of the work on SNA, but intel has a big enough team that they can afford one person to focus most of their time on it

          (and of course, Chris probably counts as more than one person :-P)

          Comment


          • #65
            Originally posted by Philip View Post
            In theory your kernel driver can maintain a queue of jobs and the ISR can immediately feed it a new one. The released code looks pretty dumb though - the kernel driver's ISR just wakes up a userspace thread that's waiting in an ioctl, so it'll be affected by random scheduler latency.
            Oh, actually I was being dumb - it really uses the 'optimized' v3d_opt.c driver and the BRCM_V3D_OPT codepath in the userland code, so it's doing more work in the kernel than I thought. (I guess the non-BRCM_V3D_OPT path was the earliest attempt to port the driver from the VPU to the ARM with minimal changes, and the BRCM_V3D_OPT path was added later to make it faster, and the non-BRCM_V3D_OPT path bitrotted. The result is a bit of a mess...)
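
To illustrate the "queue of jobs fed from the ISR" idea in the quoted post, here is a minimal sketch. It only models the control flow: when the completion interrupt fires, the handler starts the next queued job right away instead of waking a userspace thread and waiting for the scheduler. The names (`JobQueue`, `submit`, `on_interrupt`) are hypothetical and are not taken from v3d_opt.c or any Broadcom code.

```python
from collections import deque


class JobQueue:
    """Toy model of a kernel-side GPU job queue (hypothetical names)."""

    def __init__(self):
        self.pending = deque()  # jobs waiting for the hardware
        self.running = None     # job currently on the hardware, if any
        self.completed = []     # jobs finished, in completion order

    def submit(self, job):
        # Userspace hands over a job; start it at once if the GPU is idle.
        self.pending.append(job)
        if self.running is None:
            self._start_next()

    def _start_next(self):
        # Kick the next pending job, or go idle if nothing is queued.
        self.running = self.pending.popleft() if self.pending else None

    def on_interrupt(self):
        # "ISR": the hardware signalled that the running job is done.
        if self.running is None:
            return  # spurious interrupt, nothing to retire
        self.completed.append(self.running)
        self._start_next()  # no round trip through userspace


queue = JobQueue()
for name in ("bin", "render", "blit"):
    queue.submit(name)
queue.on_interrupt()  # "bin" retires, "render" starts immediately
```

The point of the design is that the gap between one job finishing and the next starting depends only on the interrupt handler, not on when a sleeping userspace thread happens to get scheduled.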

            Comment


            • #66
              This is a good start but without architecture/ISA docs (or compilers/assemblers/etc) for the separate VideoCore processor on the Raspberry Pi, this is not as useful as it looks IMO.
              Also, with all the secret init stuff used for boot (e.g. RAM initialization) it's still nowhere near possible to go "closed-source free" on the Pi.

              I doubt this release will convince the likes of ARM, Qualcomm or Imagination to open up driver code (for e.g. ARM Mali, Qualcomm Adreno or Imagination PowerVR) though.

              Comment


              • #67
                Originally posted by jonwil View Post
                This is a good start but without architecture/ISA docs (or compilers/assemblers/etc) for the separate VideoCore processor on the Raspberry Pi, this is not as useful as it looks IMO.
                Also, with all the secret init stuff used for boot (e.g. RAM initialization) it's still nowhere near possible to go "closed-source free" on the Pi.
                sure, it doesn't do much for unlocking the rest of the functions of the videocore. But it sounds like an ARM side GPU driver is possible, so this is actually extremely useful.

                And just playing devil's advocate here, but I wonder how many people have the src code to their BIOS on their laptop/desktop? Not saying that we can all pack up and go home now, wrt. r-pi. The work on r/e the videocore must go on. And it seems like they even have found a few nice hints here and there in the code drop.

                But please don't underestimate the significance of opening up the GPU docs.

                Originally posted by jonwil View Post
                I doubt this release will convince the likes of ARM, Qualcomm or Imagination to open up driver code (for e.g. ARM Mali, Qualcomm Adreno or Imagination PowerVR) though.
                and why would the fact that there is still a closed bootloader really matter to the other GPU vendors?

                Comment


                • #68
                  Any news on the progress?
                  I couldn't find anything...

                  Comment


                  • #69
                    Originally posted by entropy View Post
                    Any news on the progress?
                    I couldn't find anything...
                    I haven't really been following the great race (i.e. to port existing code). I did chat w/ one of the folks thinking about the longer term solution of how to have a proper/secure upstream driver (regarding some of the memory protection challenges w/ the r-pi hw architecture), so it does seem to me that the folks working on this are thinking about the right things for a proper upstream solution. That makes me happy :-)

                    (I was a bit worried that the one-big-lump-sum thing would be counter-productive in the long run. I hope this is not the case, because the folks working on something which will actually be accepted upstream really deserve their share of the purse.)

                    Comment


                    • #70
                      Originally posted by robclark View Post
                      I haven't really been following the great race (i.e. to port existing code). I did chat w/ one of the folks thinking about the longer term solution of how to have a proper/secure upstream driver (regarding some of the memory protection challenges w/ the r-pi hw architecture), so it does seem to me that the folks working on this are thinking about the right things for a proper upstream solution. That makes me happy :-)

                      (I was a bit worried that the one-big-lump-sum thing would be counter-productive in the long run. I hope this is not the case, because the folks working on something which will actually be accepted upstream really deserve their share of the purse.)
                      That sounds promising, indeed.

                      Thanks, Rob.

                      Comment
