On another note related to the slides, I'd like to see what they did for threading GL. I'm assuming they're using the newer GLX_ARB_create_context extension to make shared contexts, pre-assign them to any threads doing rendering, and then synchronize draw calls to the display screen in the main thread. They might be doing something more funky and creative though.
Generally the steps I go through these days to get multi-threaded GL rendering to work go something like:
1) Create dummy context to get access to extensions
2) Create real context for device using ARB_create_context extension (requires GL 3, basically)
3) Kill dummy context
4) Create a shared context for the main display window (shared with the context from step 2; see the first sketch after this list)
5) Create a per-thread context cache, generally just a TLS variable
6) Create a context pool to accelerate step 8 in the common case
7) Create several shared GL contexts to store in the context pool
8) Create a function that checks whether a context is bound to the current thread and, if not, pulls one off the context pool and binds it; if the pool is empty, signal the main thread and block until it creates a new shared context we can bind (see the pool sketch after this list)
9) Ensure that all threads that are ending return their cached context (if they have one) to the pool
10) Write letters to Khronos asking them to just give us explicit device, surface, and context objects like they promised for Longs Peak
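For concreteness, here is a minimal sketch of steps 1 through 4 using WGL (the GLX path is analogous via glXCreateContextAttribsARB). Error handling is omitted, the throwaway STATIC window is a quick hack rather than anything blessed, and the helper names are mine, not from any real API; the typedef and constants are the usual wglext.h definitions, inlined to keep the sketch self-contained:

```cpp
#include <windows.h>
#include <GL/gl.h>

// The usual wglext.h definitions, inlined for self-containment.
typedef HGLRC (WINAPI *PFNWGLCREATECONTEXTATTRIBSARBPROC)(HDC, HGLRC, const int*);
#define WGL_CONTEXT_MAJOR_VERSION_ARB 0x2091
#define WGL_CONTEXT_MINOR_VERSION_ARB 0x2092

static PFNWGLCREATECONTEXTATTRIBSARBPROC wglCreateContextAttribsARB;

// Steps 1 and 3: spin up a throwaway legacy context just long enough for
// wglGetProcAddress to work, grab the extension pointer, then kill it.
static void LoadCreateContextExtension()
{
    HWND wnd = CreateWindowA("STATIC", "", WS_OVERLAPPED, 0, 0, 1, 1,
                             NULL, NULL, GetModuleHandle(NULL), NULL);
    HDC dc = GetDC(wnd);
    PIXELFORMATDESCRIPTOR pfd = { sizeof(pfd), 1,
        PFD_DRAW_TO_WINDOW | PFD_SUPPORT_OPENGL, PFD_TYPE_RGBA, 32 };
    SetPixelFormat(dc, ChoosePixelFormat(dc, &pfd), &pfd);
    HGLRC rc = wglCreateContext(dc);
    wglMakeCurrent(dc, rc);

    wglCreateContextAttribsARB = (PFNWGLCREATECONTEXTATTRIBSARBPROC)
        wglGetProcAddress("wglCreateContextAttribsARB");

    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(rc);
    ReleaseDC(wnd, dc);
    DestroyWindow(wnd);
}

static const int kContextAttribs[] = {
    WGL_CONTEXT_MAJOR_VERSION_ARB, 3,
    WGL_CONTEXT_MINOR_VERSION_ARB, 0,
    0
};

// Step 2: the "device" context that owns the lifetime of all GL objects.
static HGLRC CreateDeviceContext(HDC dc)
{
    return wglCreateContextAttribsARB(dc, NULL, kContextAttribs);
}

// Steps 4 and 7: shared contexts for the display window and the pool,
// all sharing objects with the device context.
static HGLRC CreateSharedContext(HDC dc, HGLRC deviceContext)
{
    return wglCreateContextAttribsARB(dc, deviceContext, kContextAttribs);
}
```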
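Steps 5 through 9 then reduce to a small pool plus a thread-local cache. This is a sketch in C++11 terms; RequestNewSharedContext is a hypothetical hook that would post a request to the main thread, which creates another shared context (via CreateSharedContext above) and hands it back through Release:

```cpp
#include <windows.h>
#include <mutex>
#include <condition_variable>
#include <vector>

struct ContextPool {
    std::mutex              lock;
    std::condition_variable available;
    std::vector<HGLRC>      contexts;  // steps 6/7: pre-created shared contexts
    HDC                     displayDC; // DC the worker contexts bind against

    // Hypothetical hook: ask the main thread to create another shared
    // context and return it to the pool via Release().
    void RequestNewSharedContext() {}

    HGLRC Acquire() {
        std::unique_lock<std::mutex> hold(lock);
        if (contexts.empty())
            RequestNewSharedContext();                   // step 8, slow path
        available.wait(hold, [&] { return !contexts.empty(); });
        HGLRC rc = contexts.back();
        contexts.pop_back();
        return rc;
    }

    void Release(HGLRC rc) {                             // step 9
        {
            std::lock_guard<std::mutex> hold(lock);
            contexts.push_back(rc);
        }
        available.notify_one();
    }
};

static ContextPool g_pool;

// Step 5: the per-thread context cache.
static thread_local HGLRC t_context = NULL;

// Step 8, fast path: call at the top of any GL work on a worker thread.
void EnsureThreadContext()
{
    if (t_context == NULL) {
        t_context = g_pool.Acquire();
        wglMakeCurrent(g_pool.displayDC, t_context);
    }
}

// Step 9: call when a worker thread shuts down.
void ReturnThreadContext()
{
    if (t_context != NULL) {
        wglMakeCurrent(NULL, NULL);
        g_pool.Release(t_context);
        t_context = NULL;
    }
}
```

The invariant the TLS cache buys you is the one the API actually cares about: a context may only be current on one thread at a time, and this way each worker owns exactly one.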
The point of the separate context in step 4 is that you sometimes need to destroy and recreate your main window. Since OpenGL oh-so-wonderfully ties your device context (which controls the lifetime of your GPU objects) and the output window into a single object, there's no way to recreate an output window without also destroying all your textures, shaders, buffers, etc. The only way out is to create two shared contexts, which is a relatively new feature and not yet supported everywhere (Mesa is only just getting support for it in 8.1, iirc). Again, I think the D3D approach is obviously superior here: separate objects for the device and the swap chain, each explicitly managed by the developer.
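To make the step-4 payoff concrete, a hypothetical window-recreation routine under that scheme looks something like this; CreateMainWindow and SetupPixelFormat are placeholder helpers, and CreateSharedContext is from the first sketch above:

```cpp
#include <windows.h>

extern HGLRC g_deviceContext;   // step 2: owns the GL object lifetimes
HWND  g_window;
HDC   g_displayDC;
HGLRC g_displayContext;         // step 4: the window-facing shared context

HWND  CreateMainWindow();                            // hypothetical helper
void  SetupPixelFormat(HDC dc);                      // hypothetical helper
HGLRC CreateSharedContext(HDC dc, HGLRC shareWith);  // from the first sketch

void RecreateDisplayWindow()
{
    // Only the window-facing shared context dies with the window; the
    // device context, and every texture/shader/buffer created against
    // it, stays alive.
    wglMakeCurrent(NULL, NULL);
    wglDeleteContext(g_displayContext);
    ReleaseDC(g_window, g_displayDC);
    DestroyWindow(g_window);

    g_window = CreateMainWindow();
    g_displayDC = GetDC(g_window);
    SetupPixelFormat(g_displayDC);
    g_displayContext = CreateSharedContext(g_displayDC, g_deviceContext);
}
```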