Multi-Threading Cairo-Image For Better Performance
Chris Wilson of Intel's Open-Source Technology Center has shared some early performance results from an experimental multi-threaded version of Cairo's software image back-end.
These early threaded cairo-image results were shared in a blog post by Chris Wilson entitled "You want threads, why not Zoidberg?" Øyvind Kolås, of Intel's OTC office in London, worked on the vector renderer O for Cairo. It is an experimental rasterizer, but it is showing promise, as noted by Chris.
"To gain the most improvement from adding threads to cairo, you need to design a rasteriser and usage model with threading in mind. One such design is the vector renderer O by Øyvind Kolås. Despite being an experiment, it does show quite a bit of promise, but in its raw form just throwing threads at the problem does not beat using the SIMD compositing routines provided by pixman. However, it did raise the question whether we can make improvements to the existing image backend without impacting upon its immediate mode nature and so could be used by existing applications without alteration. To preserve the existing semantics, we can break up the individual composite and scan conversion operations into small pieces and feed those to a pool of threads, and then wait for the threads to complete before returning back to the application. As such we then never run the threads for very long, and risk that the overhead in thread management outweighs any benefit from splitting the operation over multiple cores."
After benchmarking the threaded cairo-image backend against UXA with the Intel driver and against his experimental SNA acceleration architecture for the Intel driver, Chris Wilson concludes, "For the cases that are almost entirely GPU bound (for example the firefox-fishbowl, -fishtank, -paintball, -particles), we have virtually eliminated all the previous advantage that the GPU held. In a notable couple of cases, we have improved the image backend to outperform SNA, and for all cases now the threaded image backend beats UXA. However, as can be seen there is still plenty of room for improvement of the image backend, and we can’t let the hardware acceleration be merely equal to a software rasteriser..."
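The approach Wilson describes (split one composite operation into small pieces, hand them to a pool of threads, and wait for all of them before returning) can be illustrated with a rough sketch. This is not Cairo or pixman code: the function names, the row-band splitting, the `band_height` parameter, and the simple constant-alpha blend are all assumptions chosen for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def composite_band(dst, src, alpha, y0, y1):
    """Blend rows y0..y1-1 of src over dst in place (one 'small piece').

    Hypothetical helper: a constant-alpha blend on 8-bit grey values,
    standing in for a real compositing routine.
    """
    for y in range(y0, y1):
        drow, srow = dst[y], src[y]
        for x in range(len(drow)):
            drow[x] = (srow[x] * alpha + drow[x] * (255 - alpha)) // 255


def composite_threaded(pool, dst, src, alpha, band_height=16):
    """Split one composite into row bands, run them on the pool, then block.

    Bands never overlap, so workers need no locking. The call returns only
    after every band has finished, so the caller still sees a synchronous,
    immediate-mode operation.
    """
    height = len(dst)
    futures = [
        pool.submit(composite_band, dst, src, alpha,
                    y0, min(y0 + band_height, height))
        for y0 in range(0, height, band_height)
    ]
    for future in futures:
        future.result()  # wait for the band; re-raises any worker error


if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=4) as pool:
        dst = [[0] * 8 for _ in range(40)]
        src = [[255] * 8 for _ in range(40)]
        composite_threaded(pool, dst, src, alpha=128)
        print(dst[0][0], dst[39][7])
```

The sketch shows only the control flow. In CPython the global interpreter lock keeps these pure-Python loops from running in parallel, so it will not reproduce any speedup; the trade-off Wilson mentions, where thread management overhead can outweigh the gain on short operations, corresponds here to picking a `band_height` that is too small.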
Read more in Chris Wilson's blog post.