Thread: Cairo 1.12.4 Brings Worthwhile Changes

  1. #1
    Join Date
    Jan 2007
    Posts
    14,793

    Phoronix: Cairo 1.12.4 Brings Worthwhile Changes

    Taking a break from his crazy activity on the Intel driver and SNA acceleration architecture, Chris Wilson released Cairo 1.12.4 today. This release has some worthwhile changes and new features that make it worth the upgrade...

    http://www.phoronix.com/vr.php?view=MTIwMDE

  2. #2
    Join Date
    Jan 2009
    Posts
    1,402

    I've seen the image backend of cairo used often as the baseline for performance comparisons but I still don't know what it is.

  3. #3
    Join Date
    Oct 2008
    Posts
    3,129

    Quote: Originally Posted by liam
    I've seen the image backend of cairo used often as the baseline for performance comparisons but I still don't know what it is.

    It's standard software rendering AFAIK (no XRender or GL acceleration, etc.).

  4. #4
    Join Date
    Jan 2009
    Posts
    1,402

    Quote: Originally Posted by smitty3268
    It's standard software rendering AFAIK (no XRender or GL acceleration, etc.).

    Hmm, I thought xlib was software (hence the zero copy to GPU), while the XRender backend would have created the image on the GPU to start with?

  5. #5
    Join Date
    Jul 2007
    Posts
    404

    The xlib backend rasterizes on the CPU, but uses the GPU (via XRender and EXA) for filling, copying, and compositing (which tend to be more frequent operations). I think it may also use XRender for path rendering after tessellating to trapezoids (which isn't hw accelerated by any driver I know of now, but could be in theory).

  6. #6
    Join Date
    Jan 2009
    Posts
    1,402

    Quote: Originally Posted by TechMage89
    The xlib backend rasterizes on the CPU, but uses the GPU (via XRender and EXA) for filling, copying, and compositing (which tend to be more frequent operations). I think it may also use XRender for path rendering after tessellating to trapezoids (which isn't hw accelerated by any driver I know of now, but could be in theory).

    That sounds right, but what is the difference with the image backend?

  7. #7

    Quote: Originally Posted by liam
    That sounds right, but what is the difference with the image backend?

    cairo-xlib tessellates the high-level paths from the user into trapezoids and sends those to the X server. The DDX then rasterises the trapezoids into a mask and composites that onto the destination. Both Nvidia and glamor use trapezoid shaders to avoid rasterising with the CPU, SNA uses the same high-speed scanline rasteriser as cairo-image (both try to eliminate the intermediate mask), and EXA uses the slow pixman trapezoid rasterisation routines and the extra compositing step. (For -intel the CPU is faster at generating the RLE opacity mask and sending it as geometry to the GPU than the current GPUs are at executing the branch-heavy trapezoid shader. The ultimate question is whether we can tolerate using MSAA and have sufficiently fast GPUs...)
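
    As a rough illustration of the two-pass scheme described above (rasterise a trapezoid into a mask, then composite the mask onto the destination), here is a minimal Python sketch. It is not code from cairo, pixman, or any driver: the names `rasterise_trapezoid` and `composite_over` are hypothetical, the trapezoid is simplified to a top, a bottom, and two side edges, and coverage is estimated by plain supersampling rather than by whatever a real implementation does.

```python
def rasterise_trapezoid(width, height, top, bottom, left, right, samples=4):
    """Pass 1: return a height x width coverage mask with values in 0..1.

    left and right are (x_at_top, x_at_bottom) pairs for the two side edges.
    Coverage is estimated by point-sampling a samples x samples grid per
    pixel; this is an assumption made for brevity, not pixman's method.
    """
    def edge_x(edge, y):
        # Linear interpolation of an edge's x position at height y.
        t = (y - top) / (bottom - top)
        return edge[0] + t * (edge[1] - edge[0])

    mask = [[0.0] * width for _ in range(height)]
    for py in range(height):
        for px in range(width):
            hits = 0
            for sy in range(samples):
                y = py + (sy + 0.5) / samples
                if not (top <= y < bottom):
                    continue
                xl, xr = edge_x(left, y), edge_x(right, y)
                for sx in range(samples):
                    x = px + (sx + 0.5) / samples
                    if xl <= x < xr:
                        hits += 1
            mask[py][px] = hits / (samples * samples)
    return mask


def composite_over(dest, mask, source):
    """Pass 2: blend an opaque solid source value onto dest through the mask."""
    for row_d, row_m in zip(dest, mask):
        for i, a in enumerate(row_m):
            row_d[i] = source * a + row_d[i] * (1.0 - a)
    return dest
```

    The intermediate `mask` is the extra buffer, and `composite_over` the extra pass, that the post says cairo-image and SNA try to eliminate.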

    cairo-image rasterises directly from the general complex polygon computed for the path (convert the curves into straight lines, convolve with a pen, etc.). This essentially folds the two passes performed by cairo-xlib into one and eliminates the very computationally expensive Bentley-Ottmann routine for tessellating trapezoids. On the downside, cairo-image only uses a single core (and no GPU offload) for its rasterisation. Also, more work can be done for cairo-image to process the path without requiring an intermediate polygonisation (e.g. walk splines within the scanline rasteriser, use a hairline renderer for thin pens, compute offset curves, etc.).
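
    The one-pass idea can be sketched in the same spirit: flatten a curve into straight lines, then scan the resulting polygon row by row and blend straight into the destination, with no trapezoids and no full-surface mask. This is a simplified sketch under stated assumptions, not cairo's actual rasteriser: `flatten_cubic` and `fill_polygon` are hypothetical names, the curve is flattened with uniform steps, filling uses the even-odd rule, and the stroking step ("convolve with a pen") is left out.

```python
import math


def flatten_cubic(p0, p1, p2, p3, segments=16):
    """Approximate a cubic Bezier curve with straight segments (uniform t)."""
    pts = []
    for i in range(segments + 1):
        t = i / segments
        u = 1.0 - t
        a, b, c, d = u * u * u, 3 * u * u * t, 3 * u * t * t, t * t * t
        pts.append((a * p0[0] + b * p1[0] + c * p2[0] + d * p3[0],
                    a * p0[1] + b * p1[1] + c * p2[1] + d * p3[1]))
    return pts


def fill_polygon(dest, points, source=1.0, sub=4):
    """Scanline-fill a closed polygon (even-odd rule) directly into dest."""
    height, width = len(dest), len(dest[0])
    n = len(points)
    edges = [(points[i], points[(i + 1) % n]) for i in range(n)]
    for py in range(height):
        cover = [0.0] * width  # one row of coverage, not a whole mask
        for s in range(sub):
            y = py + (s + 0.5) / sub  # sub-scanline sample position
            xs = []
            for (x0, y0), (x1, y1) in edges:
                # Half-open test skips horizontal edges and avoids
                # counting a shared vertex twice.
                if (y0 <= y < y1) or (y1 <= y < y0):
                    xs.append(x0 + (y - y0) * (x1 - x0) / (y1 - y0))
            xs.sort()
            # Even-odd rule: fill between alternate pairs of crossings.
            for xa, xb in zip(xs[0::2], xs[1::2]):
                first = max(0, math.floor(xa))
                last = min(width, math.ceil(xb))
                for px in range(first, last):
                    overlap = min(xb, px + 1) - max(xa, px)
                    if overlap > 0:
                        cover[px] += overlap / sub
        for px, a in enumerate(cover):
            if a > 0:
                a = min(a, 1.0)
                dest[py][px] = source * a + dest[py][px] * (1.0 - a)
    return dest
```

    A closed shape can be built by flattening each curve and concatenating the points before calling `fill_polygon`; the only temporary storage is a single row of coverage values.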

    The next step to speed up cairo-xlib would be to eliminate the trapezoids and send paths directly to X: fix the protocol to be more useful for cairo, which would coincidentally also enable separate render threads within cairo. Nvidia could then couple up their driver to use their existing NV_path acceleration, and I would do something similar for SNA (as usual, look at the early experiments in cairo-drm) if the GPU was not the bottleneck.

  8. #8
    Join Date
    Dec 2011
    Posts
    2,048

    Wayland

    But does it work on Wayland?

  9. #9
    Join Date
    Jan 2009
    Posts
    1,402

    Wow, I just came across this response.

    Quote: Originally Posted by ickle
    cairo-xlib tessellates the high-level paths from the user into trapezoids and sends those to the X server. The DDX then rasterises the trapezoids into a mask and composites that onto the destination. Both Nvidia and glamor use trapezoid shaders to avoid rasterising with the CPU, SNA uses the same high-speed scanline rasteriser as cairo-image (both try to eliminate the intermediate mask), and EXA uses the slow pixman trapezoid rasterisation routines and the extra compositing step. (For -intel the CPU is faster at generating the RLE opacity mask and sending it as geometry to the GPU than the current GPUs are at executing the branch-heavy trapezoid shader. The ultimate question is whether we can tolerate using MSAA and have sufficiently fast GPUs...)

    cairo-image rasterises directly from the general complex polygon computed for the path (convert the curves into straight lines, convolve with a pen, etc.). This essentially folds the two passes performed by cairo-xlib into one and eliminates the very computationally expensive Bentley-Ottmann routine for tessellating trapezoids. On the downside, cairo-image only uses a single core (and no GPU offload) for its rasterisation. Also, more work can be done for cairo-image to process the path without requiring an intermediate polygonisation (e.g. walk splines within the scanline rasteriser, use a hairline renderer for thin pens, compute offset curves, etc.).

    The next step to speed up cairo-xlib would be to eliminate the trapezoids and send paths directly to X: fix the protocol to be more useful for cairo, which would coincidentally also enable separate render threads within cairo. Nvidia could then couple up their driver to use their existing NV_path acceleration, and I would do something similar for SNA (as usual, look at the early experiments in cairo-drm) if the GPU was not the bottleneck.

    Thanks so much for the clear and detailed explanation!
    Do you happen to know how Microsoft has managed to accelerate 2D operations so effectively with the GPU? As you point out, the branch-heavy code seems as if it would be a problem for them as well (I'm assuming they don't use the CPU for that).

    Best/Liam
    Last edited by liam; 12-29-2012 at 06:34 PM.
