Multi-Threading Cairo-Image For Better Performance

Posted by Michael Larabel on January 26, 2013

Chris Wilson of Intel's Open-Source Technology Center has shared some early performance results from an experimental multi-threaded software rasterizer for Cairo's image back-end.

These early threaded cairo-image results were shared in a blog post by Chris Wilson entitled "You want threads, why not Zoidberg?" Øyvind Kolås, out of Intel's OTC office in London, worked on the vector renderer O for Cairo. It is an experimental rasterizer but is showing promise, as noted by Chris:

"To gain the most improvement from adding threads to cairo, you need to design a rasteriser and usage model with threading in mind. One such design is the vector renderer O by Øyvind Kolås. Despite being an experiment, it does show quite a bit of promise, but in its raw form just throwing threads at the problem does not beat using the SIMD compositing routines provided by pixman. However, it did raise the question whether we can make improvements to the existing image backend without impacting upon its immediate mode nature and so could be used by existing applications without alteration. To preserve the existing semantics, we can break up the individual composite and scan conversion operations into small pieces and feed those to a pool of threads, and then wait for the threads to complete before returning back to the application. As such we then never run the threads for very long, and risk that the overhead in thread management outweighs any benefit from splitting the operation over multiple cores."

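The split-and-wait approach described in that passage can be sketched in a few lines. This is only an illustration of the structure, not Cairo's or pixman's actual code: the names `composite`, `_blend_rows`, and `band_height`, the four-worker pool, and the constant-alpha blend are all assumptions made for the example. Note also that pure-Python threads will not show a real speed-up because of CPython's global interpreter lock; the point is only the control flow of cutting one operation into bands and blocking until all of them finish.

```python
from concurrent.futures import ThreadPoolExecutor, wait

# Hypothetical shared pool, created once so each call avoids thread start-up cost.
_POOL = ThreadPoolExecutor(max_workers=4)


def _blend_rows(dst, src, alpha, y0, y1):
    """Blend rows [y0, y1) of src onto dst in place (constant-alpha mix)."""
    for y in range(y0, y1):
        dst_row, src_row = dst[y], src[y]
        for x in range(len(dst_row)):
            dst_row[x] = round(src_row[x] * alpha + dst_row[x] * (1.0 - alpha))


def composite(dst, src, alpha, band_height=16):
    """Composite src onto dst, returning only after every band is finished."""
    height = len(dst)
    # Cut the single operation into horizontal bands, one task per band.
    futures = [
        _POOL.submit(_blend_rows, dst, src, alpha, y0, min(y0 + band_height, height))
        for y0 in range(0, height, band_height)
    ]
    # Block until all bands are done, so the caller still sees immediate-mode behaviour.
    wait(futures)
    for future in futures:
        future.result()  # re-raise any exception that occurred in a worker


if __name__ == "__main__":
    width, height = 64, 100
    dst = [[0] * width for _ in range(height)]
    src = [[200] * width for _ in range(height)]
    composite(dst, src, 0.5)
    print(dst[0][0], dst[height - 1][width - 1])
```

The choice of `band_height` reflects the trade-off Chris mentions: smaller bands spread the work more evenly across cores, but each band adds scheduling overhead that can outweigh the gain when the operation itself is short.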
After benchmarking the threaded cairo-image back-end against UXA and his experimental SNA acceleration architecture for the Intel driver, Chris concludes, "For the cases that are almost entirely GPU bound (for example the firefox-fishbowl, -fishtank, -paintball, -particles), we have virtually eliminated all the previous advantage that the GPU held. In a notable couple of cases, we have improved the image backend to outperform SNA, and for all cases now the threaded image backend beats UXA. However, as can be seen there is still plenty of room for improvement of the image backend, and we can’t let the hardware acceleration be merely equal to a software rasteriser..."

Read more in Chris Wilson's blog post.
