OpenGL ES 3.0 Will Be Here This Summer


  • phoronix
    started a topic OpenGL ES 3.0 Will Be Here This Summer

    Phoronix: OpenGL ES 3.0 Will Be Here This Summer

    While OpenGL ES 3.0 has been speculated about for months, the specification will be formally released by the Khronos Group this summer...

    http://www.phoronix.com/vr.php?view=MTEwNzk

  • efikkan
    replied
    I have found that in most cases games are not hitting a vram bandwidth bottleneck. This is relatively easy to check by overclocking the memory bus and measuring the impact on overall performance. Hint: check the relation between processing performance and memory bandwidth on the previous-generation GPUs from nVidia.
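
    The overclocking check described above can be turned into a single number. What follows is only a rough sketch: the function name and the sample figures are hypothetical, not measurements from any real card. It divides the relative frame-rate gain by the relative memory-clock gain; a result near 1 suggests the workload is bandwidth-bound, a result near 0 suggests it is not.

```python
def bandwidth_scaling(base_fps, oc_fps, base_mem_mhz, oc_mem_mhz):
    """Fraction of a memory overclock that shows up as extra frame rate."""
    mem_gain = oc_mem_mhz / base_mem_mhz - 1.0
    if mem_gain <= 0:
        raise ValueError("overclocked memory clock must exceed the baseline")
    fps_gain = oc_fps / base_fps - 1.0
    return fps_gain / mem_gain


# Hypothetical example: a 10% memory overclock yields a 1% frame-rate gain.
ratio = bandwidth_scaling(base_fps=60.0, oc_fps=60.6,
                          base_mem_mhz=1000.0, oc_mem_mhz=1100.0)
print(f"scaling ratio: {ratio:.2f}")
```

    A single run proves little; frame-rate noise can easily exceed a 1% difference, so several runs per clock setting would be needed before drawing a conclusion.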

    BTW: To me the core profile could have been removed. There is no performance gain in it. One mobile profile (ES) and a full profile should be enough.
    Last edited by efikkan; 05-31-2012, 03:17 PM.


  • entropy
    replied
    Originally posted by efikkan
    OpenGL implementations for both Windows and Linux support multiple contexts and context sharing, like the features of Direct3D. The drivers are, however, single-threaded, so OpenGL has no disadvantage here. Some news was posted some time ago regarding an nVidia patent for multiple streams to the GPU, so this might show up in future generations (post Maxwell). Anyway, how is the lack of "proper" multithreading slowing down the performance of applications? Do you really think 8 CPU cores could boost the performance of your GPU? When executing calls to the GPU, either in OpenGL or Direct3D, there are two types of calls: load and render. Load calls are bandwidth limited, so multiple threads executing load calls would not help the performance. For a single viewport, multiple render threads would not help much either. Modern-style, optimized SM4+ code should be GPU performance limited, so multiple threads would not give any speedup here. So you are thinking, I want one thread for rendering and one for loading. Ideally then your performance speedup would equal the amount of GPU idle time saved. Depending on usage, this might give around 10-20% theoretical speedup. But it has one condition: the data your loading thread is modifying cannot be in use by the rendering thread. However, GK110 and CUDA5 introduce direct streaming to the GPU without the CPU. Hopefully this would be included in future OpenCL specifications.
    I shouldn't write on elanthis' behalf.
    My technical knowledge concerning this topic is rather limited.

    Nevertheless, this is what elanthis wrote in the forum.

    http://phoronix.com/forums/showthrea...004#post259004

    Originally posted by elanthis
    The core code is likely identical in terms of acceleration.

    The differences between GL and D3D for performance stem from GL's state model, the difficulty of threading it properly, and the extra checks that have to be run on many object uses because of the highly mutable object model. The latter at least is being slowly fixed with each successive version of GL (e.g., ARB_texture_storage), but has a long way to go. The threading problem cannot be fixed without literally scrapping and redesigning the API, as the fundamental problem is that the API expects and requires magic global hidden state (which can be thread-local, but that is not free), and in the short term requires scrapping and redesigning WGL and GLX (the changes from GL3 made it much better, but still far from perfect). The GL state model is just utter shit and needs to be shot in the face six times with a high-powered rifle; there's no fixing it, simply throwing it away and starting over. The API is simply trash, and even Khronos knows that fact (hence the Longs Peak fiasco). They just aren't willing to do anything about it; they introduce things like Core profile that break back-compat in little minor ways that barely affects anything at all while refusing to just introduce a revised API that breaks things in larger but actually useful ways.

    The biggest problem with GL as an app developer is that -- on Windows -- the drivers are simply buggy and unstable. I still run into frequent driver crashes or just crazy performance problems that are simply bugs. The problems usually get fixed (though a few really bad long-term bugs haven't been fixed even after two years on NVIDIA's drivers) eventually, but the releases that fix one set of bugs inevitably just cause more.

    Don't even get me started on what a horrifically bad shading language GLSL is, either. It's only just becoming sane with GLSL 4.20, which means you can't actually use any of its features since most of us need to target GL 3.1 hardware (Intel) or GL 3.2 operating systems (OS X) or just stick to GLSL|ES 2.0 (iOS, Android, NativeClient).
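
    To make the "magic global hidden state" complaint in the quote above concrete, here is a toy model. It is not real OpenGL or Direct3D code; every class and function name is hypothetical and only mimics the bind-to-edit pattern. It shows how a helper that rebinds silently redirects a later call, and how an API with explicit object handles avoids that.

```python
class BindToEditContext:
    """Toy bind-to-edit API: calls act on whatever object is currently bound."""

    def __init__(self):
        self.bound = None      # hidden "current object" state
        self.textures = {}

    def bind(self, name):
        self.textures.setdefault(name, {})
        self.bound = name

    def set_param(self, key, value):
        # The target is implicit: it depends on the most recent bind() call.
        self.textures[self.bound][key] = value


class Texture:
    """Toy object-style API: the target of every call is explicit."""

    def __init__(self):
        self.params = {}

    def set_param(self, key, value):
        self.params[key] = value


def helper_that_rebinds(ctx):
    # A library routine that changes the binding and does not restore it.
    ctx.bind("lightmap")
    ctx.set_param("filter", "nearest")


ctx = BindToEditContext()
ctx.bind("diffuse")
helper_that_rebinds(ctx)
ctx.set_param("filter", "linear")   # meant for "diffuse", lands on "lightmap"

diffuse, lightmap = Texture(), Texture()
lightmap.set_param("filter", "nearest")
diffuse.set_param("filter", "linear")  # no hidden state to get wrong
```

    In the first half, "diffuse" never receives its filter setting because the helper left "lightmap" bound. With several threads the hidden binding would also have to be kept per thread, which is the cost the quote alludes to.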


  • sylware
    replied
    Originally posted by efikkan
    OpenGL implementations for both Windows and Linux support multiple contexts and context sharing, like the features of Direct3D. The drivers are, however, single-threaded, so OpenGL has no disadvantage here. Some news was posted some time ago regarding an nVidia patent for multiple streams to the GPU, so this might show up in future generations (post Maxwell). Anyway, how is the lack of "proper" multithreading slowing down the performance of applications? Do you really think 8 CPU cores could boost the performance of your GPU? When executing calls to the GPU, either in OpenGL or Direct3D, there are two types of calls: load and render. Load calls are bandwidth limited, so multiple threads executing load calls would not help the performance. For a single viewport, multiple render threads would not help much either. Modern-style, optimized SM4+ code should be GPU performance limited, so multiple threads would not give any speedup here. So you are thinking, I want one thread for rendering and one for loading. Ideally then your performance speedup would equal the amount of GPU idle time saved. Depending on usage, this might give around 10-20% theoretical speedup. But it has one condition: the data your loading thread is modifying cannot be in use by the rendering thread. However, GK110 and CUDA5 introduce direct streaming to the GPU without the CPU. Hopefully this would be included in future OpenCL specifications.
    Well... I understood that you must be very careful with the render task and the load task. Indeed, you must keep the vram bandwidth and the caches for the render task for it to be the most efficient. Loading without the CPU means DMA, with shader programs and/or discrete DMA engines... but the PCI-E controller on the GPU board would perform memory write requests and so disturb the scarce vram bandwidth and caches of the render task.


  • efikkan
    replied
    Originally posted by entropy
    IIRC, elanthis mentioned several times the limitations of OpenGL (being a state machine) concerning proper multithreading.
    OpenGL implementations for both Windows and Linux support multiple contexts and context sharing, like the features of Direct3D. The drivers are, however, single-threaded, so OpenGL has no disadvantage here. Some news was posted some time ago regarding an nVidia patent for multiple streams to the GPU, so this might show up in future generations (post Maxwell). Anyway, how is the lack of "proper" multithreading slowing down the performance of applications? Do you really think 8 CPU cores could boost the performance of your GPU? When executing calls to the GPU, either in OpenGL or Direct3D, there are two types of calls: load and render. Load calls are bandwidth limited, so multiple threads executing load calls would not help the performance. For a single viewport, multiple render threads would not help much either. Modern-style, optimized SM4+ code should be GPU performance limited, so multiple threads would not give any speedup here. So you are thinking, I want one thread for rendering and one for loading. Ideally then your performance speedup would equal the amount of GPU idle time saved. Depending on usage, this might give around 10-20% theoretical speedup. But it has one condition: the data your loading thread is modifying cannot be in use by the rendering thread. However, GK110 and CUDA5 introduce direct streaming to the GPU without the CPU. Hopefully this would be included in future OpenCL specifications.
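
    The load/render overlap argument above can be checked with some quick arithmetic. This is a back-of-the-envelope sketch under a simplifying assumption that is not stated in the post: the GPU sits idle for a fixed fraction of each frame while waiting on loads, and a separate loading thread hides that idle time completely. The function names are made up for illustration.

```python
def overlap_speedup(idle_fraction):
    """Frame-rate speedup if the given idle fraction of a frame is fully hidden."""
    if not 0.0 <= idle_fraction < 1.0:
        raise ValueError("idle_fraction must be in [0, 1)")
    return 1.0 / (1.0 - idle_fraction)


def idle_fraction_for(speedup):
    """Idle fraction of a frame that must be hidden to reach a given speedup."""
    if speedup < 1.0:
        raise ValueError("speedup must be at least 1.0")
    return 1.0 - 1.0 / speedup


for target in (1.10, 1.20):
    needed = idle_fraction_for(target)
    print(f"{target:.2f}x speedup needs {needed:.1%} of the frame idle")
```

    Under this model the speedup is 1 / (1 - idle fraction), so it is slightly larger than the idle fraction itself, and it is an upper bound: any data shared between the loading and rendering threads forces synchronization that eats into it.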


  • smitty3268
    replied
    Originally posted by efikkan
    How is the OpenGL API supposedly busted? (and why is Direct3D not?)

    Gallium3D adds another abstraction level with overhead and feature limitations. The API (OpenGL/Direct3D) is supposed to be the abstraction between hardware and the application; adding another level is just simply stupid.
    Elanthis has mentioned before he wants a more object-oriented API. Think C++ instead of C.

    I agree that Gallium doesn't really do anything here, though - maybe it's a quick way to prototype changes, but ultimately nothing is ever going to get adopted unless NVidia and AMD buy in, and they've shown pretty conclusively, I think, that they aren't willing to change GL radically. I don't see how the chances of them adopting a completely new replacement could possibly be any better.


  • log0
    replied
    Originally posted by efikkan
    How is the OpenGL API supposedly busted? (and why is Direct3D not?)

    Gallium3D adds another abstraction level with overhead and feature limitations. The API (OpenGL/Direct3D) is supposed to be the abstraction between hardware and the application; adding another level is just simply stupid.
    And who is going to write the abstraction for each driver? You?

    Gallium is just common/helper stuff shared by drivers, like OpenGL state tracker and resource management. It won't be as fast as a custom-tailored implementation. But it is not like there are 200+ developers to work on one in the first place.


  • entropy
    replied
    Originally posted by efikkan
    How is the OpenGL API supposedly busted? (and why is Direct3D not?)

    Gallium3D adds another abstraction level with overhead and feature limitations. The API (OpenGL/Direct3D) is supposed to be the abstraction between hardware and the application; adding another level is just simply stupid.
    IIRC, elanthis mentioned several times the limitations of OpenGL (being a state machine) concerning proper multithreading.


  • efikkan
    replied
    Originally posted by elanthis
    No. Because the OpenGL API is busted at the very core of its API design.

    You cannot incrementally fix it. You need to toss it out and start over again. Gallium3D makes this relatively easy for the community to pull off, I might note. *nudge*
    How is the OpenGL API supposedly busted? (and why is Direct3D not?)

    Gallium3D adds another abstraction level with overhead and feature limitations. The API (OpenGL/Direct3D) is supposed to be the abstraction between hardware and the application; adding another level is just simply stupid.


  • Prescience500
    replied
    Originally posted by elanthis
    No. Because the OpenGL API is busted at the very core of its API design.

    You cannot incrementally fix it. You need to toss it out and start over again. Gallium3D makes this relatively easy for the community to pull off, I might note. *nudge*
    That I knew about OpenGL. I was under the impression that OpenGL ES was made almost from scratch, though, to be better optimized. I guess, from what you said, that's not the case.
