Gallium3D OpenGL 4.1 State Tracker Redux


  • Gallium3D OpenGL 4.1 State Tracker Redux

    Phoronix: Gallium3D OpenGL 4.1 State Tracker Redux

    There was a Gallium3D OpenGL 4.1 State Tracker proposed for this year's Google Summer of Code to benefit X.Org / Mesa. As this state tracker was going to be written from scratch and without any dependence on Mesa itself, the consensus among the core developers was that the work was simply too ambitious for a lone student developer to complete over the course of a summer. A new proposal has now been drafted by Denis Steckelmacher, the Belgian student developer interested in open-source OpenGL 4.1 support...

  • #2
    Michael, you make it sound like working on a hw driver is a bad thing.

    Marek has a point there, but I can't fully agree: features do matter.
    For example, I'm a bit tired of the jagged edges I get even with Compiz, not to mention anything more serious. I'd really appreciate some MSAA in r600g!
    Another perennial problem is video decoding, but we all know that by now.

    Denis could still continue Zack's work on the OpenCL state tracker, Clover. Having a fast and reliable architecture for GPGPU work would be a nice addition for HPC, and desktops would make good use of it soon, I think.

    Just saying...


    • #3
      People should work on what they want to work on. It's Open Source, not a company. He only gets paid if he completes the work, so if he doesn't pull through, it's his loss and nobody else's.

      That opinion aside...

      Personally, of all things at this point, I'd rather just see a whole new damn graphics API built over Gallium 3D. Fuck OpenGL and DirectX both.

      Give me the cleanliness of DirectX 10/11's API with the portability of an OS-neutral C API, using a cleaned up version of Cg's shader language and semantics, and a usable utility library like DirectX's for shader/texture loading and the like.


      /* create and populate an attribute buffer (example using convenience set data command) */
      ngl3Buffer* buffer = ngl3BufferCreate(ctx, sizeof(Vertex) * nVertices, NGL_BUFFER_STREAM, NGL_BUFFER_WRITE);
      ngl3BufferSetData(buffer, aVertices, sizeof(Vertex) * nVertices);
      /* create and populate an index buffer (example using explicit lock, copy, unlock commands) */
      ngl3Buffer* indices = ngl3BufferCreate(ctx, sizeof(uint16_t) * nIndices, NGL_BUFFER_STREAM, NGL_BUFFER_WRITE);
      void* map = ngl3BufferLock(indices, NGL_BUFFER_WRITE);
      memcpy(map, aIndices, sizeof(uint16_t) * nIndices);
      ngl3BufferUnlock(indices, map);
      /* create and populate a constant buffer */
      ngl3Buffer* constants = ngl3BufferCreate(ctx, sizeof(float) * 16, NGL_BUFFER_ONEFRAME, NGL_BUFFER_WRITE);
      ngl3BufferSetData(constants, gProjectionMatrix, sizeof(float) * 16);
      /* define an attribute layout */
      ngl3Layout* layout = ngl3LayoutCreate(ctx, 3);
      ngl3LayoutAddSemantic(layout, NGL_FLOAT3, "POS0", buffer);
      ngl3LayoutAddSemantic(layout, NGL_FLOAT3, "COLOR0", buffer);
      ngl3LayoutAddSemantic(layout, NGL_FLOAT2, "TEXCOORD0", buffer);
      /* define a constant buffer layout */
      ngl3Layout* clayout = ngl3LayoutCreate(ctx, 1);
      ngl3LayoutAddSlot(clayout, NGL_FLOAT4x4, 0, constants);
      /* define a render target */
      ngl3Target* target = ngl3TargetCreate(ctx);
      /* load shaders (use low-level API -- you should have a standard higher-level effects/technique API as well!) */
      ngl3Program* program = ngl3ProgramCreate();
      ngl3ProgramLoadFile(program, "foo.vs", NGL_VERTEX_3_0);
      ngl3ProgramLoadFile(program, "", NGL_GEOMETRY_3_0);
      ngl3ProgramLoadFile(program, "foo.fs", NGL_FRAGMENT_3_0);
      /* render primitives */
      ngl3Render* cmd = ngl3RenderCreate(ctx);
      ngl3RenderSetTarget(cmd, target);
      ngl3RenderSetConstants(cmd, constants);
      ngl3RenderSetAttributes(cmd, layout);
      ngl3RenderSetIndices(cmd, indices, nIndices);
      ngl3RenderSetProgram(cmd, program);
      ngl3RenderSetType(cmd, NGL_TRIANGLES);
      The objects can be reused as much as possible. They're typed, so none of that OpenGL numeric identifier what-was-this-object-again bullshit. Objects that make sense to be write-only are write-only (like layouts). Those small objects that you may create a lot of are just backed by a simple pool allocator.
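      The "simple pool allocator" backing those small objects could look something like this minimal C sketch (a free list threaded through the unused slots; every name here is made up for illustration, not part of any proposed ngl3 API):

      ```c
      #include <assert.h>
      #include <stddef.h>
      #include <stdlib.h>

      /* A fixed-size pool for small, frequently created objects
       * (e.g. render command objects). Freed slots are kept on a
       * free list threaded through the unused slots themselves. */
      typedef struct PoolSlot { struct PoolSlot *next; } PoolSlot;

      typedef struct {
          unsigned char *memory;
          PoolSlot *free_list;
          size_t slot_size;
      } Pool;

      static void pool_init(Pool *p, size_t slot_size, size_t count) {
          size_t i;
          if (slot_size < sizeof(PoolSlot))
              slot_size = sizeof(PoolSlot);
          p->slot_size = slot_size;
          p->memory = malloc(slot_size * count);
          p->free_list = NULL;
          for (i = 0; i < count; i++) {
              PoolSlot *slot = (PoolSlot *)(p->memory + i * slot_size);
              slot->next = p->free_list;
              p->free_list = slot;
          }
      }

      static void *pool_alloc(Pool *p) {
          PoolSlot *slot = p->free_list;
          if (!slot) return NULL;           /* pool exhausted */
          p->free_list = slot->next;
          return slot;
      }

      static void pool_free(Pool *p, void *ptr) {
          PoolSlot *slot = ptr;
          slot->next = p->free_list;
          p->free_list = slot;
      }

      int main(void) {
          Pool pool;
          pool_init(&pool, 64, 4);
          void *a = pool_alloc(&pool);
          void *b = pool_alloc(&pool);
          assert(a && b && a != b);
          pool_free(&pool, a);
          assert(pool_alloc(&pool) == a); /* most recently freed slot is reused */
          free(pool.memory);
          return 0;
      }
      ```

      Allocation and free are both a couple of pointer swaps, which is why cheap create/destroy of many tiny command objects is plausible.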

      No global state machine. You have to create objects using an explicit context. You can have as many contexts as you want, create different contexts for different threads, use one for rendering and another some other off-screen purposes, whatever you want.

      A render state command object instead of a shitload of redundant draw commands like OpenGL. Anyone actually writing an OpenGL renderer just wraps the stupid OpenGL draw commands behind a render state object anyhow, because sometimes you want to enable/disable various bits of render state and nobody wants to write a huge if (foo) DrawFoo(); else if (bar) DrawBar(); else ... every time they want to draw something. As a bonus, render command objects can be added to a render list which can be stored or passed off to another context later for actual rendering, which is what you need for multi-threaded rendering. (You create and populate buffers in the threads doing your scene composition and culling, those threads create render command lists and pass them back to the main thread, and the main thread then executes the render lists.)
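      The record-then-execute split described above can be sketched in a few lines of C (every type and function here is hypothetical; "execution" just walks the list where a real backend would issue driver calls):

      ```c
      #include <assert.h>
      #include <stddef.h>

      enum CmdType { CMD_CLEAR, CMD_DRAW };

      typedef struct Cmd {
          enum CmdType type;
          int first, count;          /* draw parameters, for CMD_DRAW */
          struct Cmd *next;
      } Cmd;

      typedef struct {
          Cmd *head, *tail;
      } CmdList;

      static void list_append(CmdList *list, Cmd *cmd) {
          cmd->next = NULL;
          if (list->tail) list->tail->next = cmd;
          else list->head = cmd;
          list->tail = cmd;
      }

      /* "Execution" here only counts commands; a real backend would
       * translate each one into driver calls on the main context. */
      static int list_execute(const CmdList *list) {
          int executed = 0;
          const Cmd *cmd;
          for (cmd = list->head; cmd; cmd = cmd->next)
              executed++;
          return executed;
      }

      int main(void) {
          /* A worker thread would build this list during culling... */
          CmdList list = { NULL, NULL };
          Cmd clear = { CMD_CLEAR, 0, 0, NULL };
          Cmd draw  = { CMD_DRAW, 0, 36, NULL };
          list_append(&list, &clear);
          list_append(&list, &draw);
          /* ...and the main thread replays it against the real context. */
          assert(list_execute(&list) == 2);
          return 0;
      }
      ```

      The key property is that recording touches no shared state, so any number of threads can build lists concurrently while only one thread ever talks to the hardware.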

      Everything uses a typed accessor to make the API simpler. Direct3D uses parameter structs, but you have to be careful to clear them and use them properly, while using proper objects with accessors removes the margin for error. This is not a significant source of overhead, as you are only calling a few dozen functions per pass at most; the function call overhead won't even show up on a profiler on any significant application (e.g., anything besides micro-benchmarks).
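      The difference between parameter structs and typed accessors can be shown with a small C sketch (hypothetical names; the point is that the init call establishes valid defaults, so an untouched field can never hold stack garbage):

      ```c
      #include <assert.h>

      /* Parameter-struct style (D3D-like): forget the memset and you
       * inherit whatever garbage happens to be on the stack. */
      typedef struct {
          int width, height, samples;
      } TargetDesc;

      /* Accessor style: creation establishes valid defaults and each
       * setter touches exactly one field. (Illustrative only.) */
      typedef struct {
          TargetDesc desc;
      } Target;

      static void target_init(Target *t) {
          t->desc.width = 1;
          t->desc.height = 1;
          t->desc.samples = 1;       /* always a sane default */
      }

      static void target_set_size(Target *t, int w, int h) {
          t->desc.width = w;
          t->desc.height = h;
      }

      int main(void) {
          Target t;
          target_init(&t);
          target_set_size(&t, 1280, 720);
          /* samples was never touched, but it is still well-defined */
          assert(t.desc.samples == 1);
          assert(t.desc.width == 1280 && t.desc.height == 720);
          return 0;
      }
      ```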

      All of these functions are actual well-defined extern functions in a header file, not function pointers loaded up by some magic crappy backend. Why? IntelliSense. Code completion engines in all popular code editors get confused by OpenGL. (Note that having a C++ API is even better in this case, as then the code completion lets a user see which methods exist for any particular object just by typing object->[ctrl-space]).

      The 3 in the API calls is just an example of how there should be versioning. Don't do things like OpenGL. The API that makes sense on DirectX 9 class hardware does not make sense for DirectX 11 class hardware, nor for GLES class hardware.

      Oh, I didn't add an example of textures, but just to clarify on that: be sure to separate texture and sampler objects (both OpenGL and DirectX got this wrong, then both half-ass fixed it), and to include OpenGL's support for deferred mipmap generation.
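      The texture/sampler separation can be sketched like so (hypothetical C types; the same image data gets bound with two independent filtering states, without duplicating the texture or mutating shared state):

      ```c
      #include <assert.h>

      typedef struct { int id; } Texture;                 /* the image data */
      typedef struct { int min_filter; int wrap; } Sampler; /* the filtering state */

      enum { FILTER_NEAREST, FILTER_LINEAR, WRAP_CLAMP, WRAP_REPEAT };

      /* a binding slot pairs one texture with one sampler */
      typedef struct {
          const Texture *texture;
          const Sampler *sampler;
      } TextureBinding;

      int main(void) {
          Texture diffuse = { 1 };
          Sampler crisp  = { FILTER_NEAREST, WRAP_CLAMP };
          Sampler smooth = { FILTER_LINEAR,  WRAP_REPEAT };

          /* the same image sampled two different ways in one pass */
          TextureBinding slot0 = { &diffuse, &crisp };
          TextureBinding slot1 = { &diffuse, &smooth };

          assert(slot0.texture == slot1.texture);
          assert(slot0.sampler != slot1.sampler);
          return 0;
      }
      ```

      With filtering state fused into the texture object (as classic OpenGL does), the example above would force you to either duplicate the texture or flip its parameters between draws.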

      Include standardized convenience APIs wherever it makes sense. High-level effects stuff makes sense. Loading data from memory, streams, and files makes sense (potentially using your own stream API, both because there are so many different stream APIs in low-level libraries that may need wrapping, like FILE* vs iostream and so on, and because MSVC's debug mode means you can't mix release-mode DLLs and debug-mode DLLs if they use stream-based APIs).

      Also note that the API is designed to be easily bindable by C++ or scripting languages, unlike OpenGL which is near impossible to bind (efficiently) due to the crazy-ass global state machine. A simple header-only C++ wrapper should be defined as part of the API standard so people on C++-friendly platforms don't have to gouge out their eyes.

      I have multiple professional game developers working on real engines begging for such an API (or just begging for non-DirectX platforms to disappear... but we don't want that). I'm willing to bet that, if implemented, documented, and covered by a test suite, such an API could get enough support to even get the big vendors to implement it in their drivers. (And in the short term, on Windows at least, such an API could just wrap DirectX -- wrapping OpenGL is near impossible though due to the global state machine semantics and the lack of multi-thread capabilities.)

      using ngl3;
      Buffer* buffer = ctx->CreateBuffer(...);
      // etc.
      Damn I want this API to exist.

      It's close to what my OpenGL wrapper looks like, but that is limited by all the limitations of OpenGL (including being stuck with GLSL or wrapping up the non-Open Cg framework).


      • #4
        Originally posted by elanthis View Post
        Personally, of all things at this point, I'd rather just see a whole new damn graphics API built over Gallium 3D. Fuck OpenGL and DirectX both.

        Implementing this on gallium3D would remove the need to wrap it to d3d on windows.


        • #5
          Know how to resolve the S3TC patent problem?

          Buy out S3!


          • #6
            Floating point textures are more interesting than S3TC (yes, both are important).

            Good luck buying out Microsoft, though.


            • #7
              Would a new API be able to circumvent the patent problems of OpenGL, and how big a task would it be for someone to create and maintain something like that?

              Also, isn't the floating point textures patent a pure software patent, valid only in the US? Someone (I think Dave Airlie) mentioned that S3TC is hw/sw related, so it's kind of different.


              • #8
                Originally posted by 89c51 View Post
                Would a new API be able to circumvent the patent problems of OpenGL, and how big a task would it be for someone to create and maintain something like that?
                A new API wouldn't solve anything, because the patents are in the algorithms themselves.

                As for the amount of work, it's doable if you layer it on top of OpenGL or D3D (that's what 3d engines do). It's impractical if you wish to communicate with the hardware directly (without going through GL/D3D).


                • #9
                  I can't imagine it would be too terribly difficult to write a new API on top of G3D. The question is, given the lack of proprietary support, could you get anyone to use it?


                  • #10
                    Originally posted by TechMage89 View Post
                    I can't imagine it would be too terribly difficult to write a new API on top of G3D. The question is, given the lack of proprietary support, could you get anyone to use it?
                    Most probably, yes... The only missing bits are good drivers and the fact that Gallium3D isn't widespread yet...

                    It wouldn't solve the patent issues, that's a different problem.


                    • #11
                      (last post on this topic, I'm wasting waaaaay too much time writing forum posts instead of getting stuff done.)

                      Originally posted by TechMage89 View Post
                      I can't imagine it would be too terribly difficult to write a new API on top of G3D. The question is, given the lack of proprietary support, could you get anyone to use it?
                      If the new API is well-designed, well-documented, covered by an extensive test suite, and has a working sample implementation with hardware acceleration, I bet you could get support from NVIDIA/AMD.

                      The big companies aren't any more fond of OpenGL than the developers are. It's a horrific API that adds a huge amount of overhead and complexity to driver implementations. Just the whole object naming (the integer name and generation/deletion crap) scheme in OpenGL is a gigantic pain in the ass, for everyone.

                      It would need to gain clout, yes. I can do my part with that, given the network of people I know since starting out in the game industry here in Seattle. As with everything in life, it's not about what you know but who you know...

                      It would be years before it'd be a serious contender, certainly. Years after it was a stable API, that is. Maybe even close to a decade.

                      That is NOT a reason to give up on the idea, though. Think of it this way: if the API exists, it might be 10 years before it's commonly used. If it doesn't exist, it won't ever be used, not even 100 years from now.

                      So the sooner it exists, the sooner it can get out there.

                      Once a working draft of the API is available, I can and will do what I can to get feedback from the guys at NVIDIA and AMD on it, as well as feedback from real game engine developers. Feedback to help improve any rough edges that would make it a problem to implement for the driver folks or to be used by real game engines.

                      In the short term, it can be wrapped around D3D relatively easily (the shader compiler would need to be able to target HLSL, and that's about the hardest part) and used in live code without hardware vendor support. You can't possibly wrap OpenGL, though, with all the features the new API needs; the OpenGL API is too poorly designed due to the magic global secret juju sauce state machine crap. Plus it's impossible to get even required SM 1.0 behavior on OpenGL until GLSL 4.10, because Khronos took so freaking long to add input/output semantics to GLSL, which means you would only be able to half-ass wrap OpenGL on recent high-end hardware and not wrap it at all on anything older (including GLES).

                      For an example of what the shader semantic stuff means, take a look at an HLSL/Cg shader. You see things like:

                      struct VertexInput {
                        float4 pos : POSITION;
                        float4 color : COLOR0;
                        float4 lighting : TEXCOORD0;
                      };
                      See, the hardware has a limited number of input/output registers. Many of these have pre-defined "semantics", or purposes. Many of those semantics are no longer relevant on modern hardware; they are a throwback to the days when there were built-in color attributes and texture coordinate attributes in fixed-function hardware pipelines. But some of those things are still important, like vertex position, because there are still fixed-function portions in modern hardware that rely on them (like early-Z filtering that runs before the fragment shader). So hardware has a number of input/output registers that are pre-allocated to fixed-function purposes, and general-purpose attributes won't be placed in those by default.

                      The above code in HLSL is using one of the texture coordinate registers to store lighting information. The shader doesn't need that texcoord register, so it's saving space by reusing it for another purpose. That means there are more registers for more data to be passed through (and attribute register pressure can get tight even on modern high-end cards), and some hardware will potentially just run faster if fewer user-semantic attribute registers are used.

                      You can do the same thing in GLSL just fine, technically. The problem is that GLSL doesn't use semantics attached to user-defined variables but instead uses global magic variable names. HLSL separates the semantic from the variable name, so the name you use in the rest of your shader is more logical. So the above code converted to GLSL would look something like:

                      attribute vec4 gl_TexCoord0; // not technically legal
                      So now the rest of your code is accessing gl_TexCoord0 for lighting information instead of using a logical name. That sucks. You can copy it to another name and use up a temp register, but that's wasteful.

                      The real problem shows up with effect composition systems. You have snippets of shader code that are pieced together to generate a final shader for a technique. These snippets just want to say "I use the lighting input variable" independent of which register it was passed in. With GLSL though, the variable name is directly dependent on the register used for passing it. So you can no longer have generic shader snippets independent of the interface used between the application and the shaders.

                      At best you can use the C-preprocessor support in GLSL (ugh) to do something like:

                      #define LightInfo gl_TexCoord0
                      But there are some downsides to that, and it just makes your code generator much more complex. You now need to both define variables in your inputs and define macros before all of that, and you need to add black magic to avoid trying to use the hard-coded names in any interfaces since you can't technically redefine built-ins.
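                      A sketch of what that code generator ends up doing (hypothetical C, assuming a simple string-pasting composer and the gl_TexCoord0 name from above; the GLSL fragment is illustrative only):

                      ```c
                      #include <assert.h>
                      #include <string.h>

                      int main(void) {
                          char shader[512] = "";
                          /* a generic snippet that only knows the logical name */
                          const char *snippet =
                              "vec4 shade() { return LightInfo * 0.5; }\n";

                          /* every logical-name -> register mapping becomes one
                           * more generated line the composer has to track */
                          strcat(shader, "#define LightInfo gl_TexCoord0\n");
                          strcat(shader, snippet);

                          assert(strstr(shader, "#define LightInfo") != NULL);
                          assert(strstr(shader, "LightInfo * 0.5") != NULL);
                          return 0;
                      }
                      ```

                      The snippet itself stays register-agnostic, but only because the generator now carries a mapping table and emits a macro preamble for every interface variable.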

                      GLSL 4.10 finally added semantics to the language... only they're kind of obtuse to use (purely numeric, no names), and of course they're only even available on OpenGL 4 class hardware due to being delivered with all the other GLSL changes for OpenGL 4.1 instead of being a language feature available since OpenGL 2 / GLSL 1. Le sigh.

                      And then there's other grammar issues with GLSL. Like the way you must put fragment and vertex and geometry shaders in different files, since the entry point for each is called main(). So that's kind of annoying, because the shaders are tightly tied together, and need to define the same input/outputs to work. Oh, and let's not forget how input/outputs are defined as global variables in GLSL instead of being in/out parameters to the entry point functions. So the same variable has to be defined differently between the two shader types, because it's an out global in the vertex shader and an in variable in the fragment shader. Compare to HLSL/Cg where you just define a structure for the whole set of inputs/outputs and simply return the struct from the vertex shader entry point function and take it as a parameter in the fragment shader function (this is of course optimized away, you're not actually copying chunks of data around; the point is the language semantics are clearer and easier to work with in composition systems).

                      And then you have to deal with different hardware profiles. I might want to write a shader that uses geometry shaders for extra oomph on DX10 class hardware but just skips that stuff on DX9 class hardware. In GLSL, that means writing more files. In something like Cg, you can just add an attribute to a function to denote the profile it's intended to work on, and you can overload functions by profile. So you can have a calculateLighting() function specified twice in the same file, one that relies on some geometry shader to be run first and one which doesn't, and everything Just Works(tm) the way you'd expect.

                      The only way to use GLSL sanely is to wrap it with a better language that just translates into GLSL. Which is exactly what Cg is for. But Cg is proprietary.

                      So ideally, a new API is a lot less about the actual API (which does still need work to get rid of the global state machine, add threading support, and use separate opaque struct types for separate object types) and a lot more about a new shading language. You need both, but the sins of the OpenGL API are a lot less than the sins of the GLSL language.

                      The first steps would probably just be to draft up the new shading language grammar, get a basic parser in place (modify the existing GLSL parser in Mesa, most likely), and then hook that up as an OpenGL extension as well as writing a library that can translate the new language into various flavors of GLSL.

                      Doing the new API is easy, but it's not useful without the shading language. The shading language is the hard part. Knock that out of the way (at least as a draft implementation) and the rest can fall into place a lot easier.

                      Sorry for all the text. This idea of a new API has been floating around a lot lately here in Redmond. OS X and the mobile space are getting pretty popular with game developers, but pretty much every single person I've talked to hates OpenGL with a passion. And no, it's not familiarity; many of the students I've talked to are taught OpenGL exclusively and end up preferring Direct3D anyway, because it's just flat out easier to use as soon as you get past the "draw a few colored triangles on the screen" point.

                      @BlackStar: there's no reason it's only feasible as a wrapper. That's half the point of Gallium3D. It separates the hardware-specific drivers from the state trackers (API implementations). Hell, there are already partial implementations of Direct3D 9 and 10 in the Gallium3D source tree, which have no reliance on OpenGL at all (and obviously no reliance on Direct3D, since it doesn't exist on Linux).

                      @89c51: no, the patents cannot be circumvented by a new API. Floating point patents may be US-only, but who cares? If it isn't allowed in the US, it's worthless to anyone but a handful of foreign hobbyists and dead-end small-fry companies. The US is way too huge a consumer market for anyone else to ignore.

                      It would take relatively little effort to build the draft for a new API. The hard parts are all taken care of by Gallium3D. The shader compiler is the hardest part of a new API, but as someone with a background in compilers and languages, I will note that it's really not that complex to build a basic compiler that just needs to target an existing well-tested intermediate representation like TGSI. Getting the API up to a 1.0 level with the proper research and support for the current popular hardware profiles (GLES, DX9c, DX10), and getting the necessary level of documentation and test suite coverage, would take much more time and more eyeballs (and the right kinds of eyeballs).

                      If I wasn't already committed to another project for the summer, I'd be tempted to go into GSoC myself (or just see if my university would fund it) and get the core foundation of the project knocked out. As it is, about all I'd have time for is drafting up a proposal API and shading language grammar... but I've already got a ton of other things on my plate for free-time projects, and I'm trying to figure out which ones to cut (which sadly is likely to be just about everything Linux-related), not add more of them.
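                      To make the "basic compiler targeting an existing IR" claim concrete, here is a toy C sketch that lowers a single multiply-add into TGSI-flavored text; the opcode spelling merely mimics TGSI, and none of this is real Mesa code:

                      ```c
                      #include <assert.h>
                      #include <string.h>

                      typedef struct {
                          char text[256];
                      } Program;

                      static void emit(Program *p, const char *instr) {
                          strcat(p->text, instr);
                          strcat(p->text, "\n");
                      }

                      /* lower "dst = a * b + c" into two IR instructions via one
                       * temporary register -- the kind of per-expression lowering
                       * a real shader compiler does after parsing */
                      static void lower_mad(Program *p) {
                          emit(p, "MUL TEMP[0], IN[0], IN[1]");
                          emit(p, "ADD OUT[0], TEMP[0], IN[2]");
                      }

                      int main(void) {
                          Program p = { "" };
                          lower_mad(&p);
                          assert(strstr(p.text, "MUL TEMP[0]") != NULL);
                          assert(strstr(p.text, "ADD OUT[0]") != NULL);
                          return 0;
                      }
                      ```

                      The point is only that once a well-tested IR and backend already exist, the front end reduces to parsing plus straightforward per-node lowering like this.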


                      • #12
                        I wish anyone good luck with it.

                        OpenGL 2.1 for legacy
                        OpenGL ES 1 and 2 for embedded and compatibility and embedded legacy
                        This new proposal for the future

                        Combined with Gallium3D, KMS and SVG this could be great for Linux.

                        I'd say screw OpenGL > 2.1 because nobody is going to use that anyway...


                        • #13
                          Originally posted by V!NCENT View Post
                          I wish anyone good luck with it.

                          OpenGL 2.1 for legacy
                          OpenGL ES 1 and 2 for embedded and compatibility and embedded legacy
                          This new proposal for the future

                          Combined with Gallium3D, KMS and SVG this could be great for Linux.

                          I'd say screw OpenGL > 2.1 because nobody is going to use that anyway...
                          I'm pretty sure the Unigine folk are using OpenGL 3.x/4.x.
                          I know that I'm using OpenGL 3.x.


                          • #14
                            Originally posted by mirv View Post
                            I'm pretty sure the Unigine folk are using OpenGL 3.x/4.x.
                            I know that I'm using OpenGL 3.x.
                            Well that moves the legacy count up to OpenGL 3.x


                            • #15
                              Originally posted by V!NCENT View Post
                              Well that moves the legacy count up to OpenGL 3.x
                               And there I agree. I think it's easier to write a 3.x renderer with 4.x alternatives if the drivers support it, though, than to maintain a completely separate 2.x fixed-function pipeline renderer.