Trinity APU memory layout?


  • #11
    Originally posted by bridgman
    With every passing hour I think of another possible way to interpret your question

    When you say "have to go through OpenGL", are you using OpenGL for most of the drawing (i.e. you want to go *around* OpenGL to update textures but then keep drawing with OpenGL and the updated textures), or are you asking if you can program the 3D hardware directly and do cool texture update things without using OpenGL at all?

    If the former, previous answer stands (you can do it but you need some API extensions in OpenGL to deal with things like establishing addressability and cache flushing to pick up the new contents from memory).

    If the latter, the answer is "yes", but you'd probably want to implement some kind of 3D API yourself anyways.
    I guess the real question is how much abstraction can be removed from every level of the stack to increase performance without burdening developers. As the old adage goes, clock cycles are cheaper than developer man-hours.

    I know D3D in its many incarnations has started to become very problematic performance-wise. I often ponder the wisdom of gallium3d doing so many code transformations etc. I guess what needs to come out is a way to get as close to the metal as can be achieved, which in my mind would mean building a kernel/compiler right into the hardware "as a hardware feature" and making it intelligent enough that drivers would become nearly obsolete.

    But that's just a pipe dream anyway.



    • #12
      Originally posted by bridgman
      With every passing hour I think of another possible way to interpret your question
      Indeed.

      I just typed four different responses and after each iteration, was less and less certain of what I saw and what I'm actually asking. I'm fairly confident that what was presented was the updating of surfaces via the modification of the discrete memory of the GPU.

      The "AMD_pinned_memory" extension looks close to what I saw, with the exception that the modifications are made to a page in system memory, and not discrete memory. If your extension coincidentally results in a four-orders-of-magnitude increase in resurfacing performance, then it is entirely possible that this is what I saw and the presenter misspoke or was unclear. If this is the case, you need to call marketing immediately, as this is ground-breakingly awesome.

      I'll see if I can fire off an e-mail to the presenter. I also need to ping corp-legal to see if there's an NDA between the two companies which I somehow inherit by virtue of my employment arrangement.

      F
      Last edited by russofris; 16 May 2012, 10:36 PM.



      • #13
        Other thoughts.

        Re: http://www.opengl.org/registry/specs...ned_memory.txt

        This really needs a touch-up by an English-speaking tech writer. Trivial fixes for stuff like "As an example, consider the following example".

        Even though the spec deals with (slower) main memory, the implications are really neat, and I'm surprised AMD hasn't released one of their half-naked-3D-woman tech demos to feature it. This is at least as big for textures/surfaces as tessellation is for polygons.

        F



        • #14


          Slides 22-24 also look related, and neat.

          Also looked at


          Last edited by russofris; 17 May 2012, 12:18 AM.



          • #15
            Originally posted by russofris
            Other thoughts.

            Re: http://www.opengl.org/registry/specs...ned_memory.txt

            This really needs a touch-up by an English-speaking tech writer. Trivial fixes for stuff like "As an example, consider the following example".

            Even though the spec deals with (slower) main memory, the implications are really neat, and I'm surprised AMD hasn't released one of their half-naked-3D-woman tech demos to feature it. This is at least as big for textures/surfaces as tessellation is for polygons.

            F
            It all depends on where the bottleneck is. If updating texture information over the PCIe bus is the bottleneck, then it's likely that pinned memory will really help out. I would actually like to test some of my own "megatexture" type stuff with that (32k*32k images updated on the fly with persistent effects such as weather and vegetation patterning across terrain), but it's a little far down on the list right now.
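
For a rough sense of scale on the 32k*32k figure, here is a back-of-the-envelope sketch. Everything numeric beyond the 32k*32k dimensions is an assumption and not something stated in the thread: 32k is taken as 32768, texels are uncompressed RGBA8 (4 bytes each), and 8 GB/s is used as the theoretical one-way peak of a PCIe 2.0 x16 link. The function names and the 256x256 tile size are made up for illustration.

```python
# Back-of-the-envelope sizing for a 32k*32k texture.
# Assumptions (not from the thread): 32k = 32768 texels per side,
# uncompressed RGBA8 (4 bytes per texel), and a theoretical peak of
# 8 GB/s one way for a PCIe 2.0 x16 link. Real throughput is lower.

BYTES_PER_TEXEL = 4         # assumed RGBA8
PCIE_BYTES_PER_SEC = 8e9    # assumed theoretical peak


def texture_bytes(width, height, bytes_per_texel=BYTES_PER_TEXEL):
    """Size in bytes of an uncompressed width x height image."""
    return width * height * bytes_per_texel


def transfer_seconds(num_bytes, bytes_per_sec=PCIE_BYTES_PER_SEC):
    """Idealized time to push num_bytes across the bus once."""
    return num_bytes / bytes_per_sec


if __name__ == "__main__":
    side = 32 * 1024
    full = texture_bytes(side, side)
    tile = texture_bytes(256, 256)  # hypothetical tile size
    print(f"full image: {full / 2**30:.1f} GiB, "
          f"{transfer_seconds(full):.2f} s per full upload")
    print(f"256x256 tile: {tile / 2**10:.0f} KiB, "
          f"{transfer_seconds(tile) * 1e6:.1f} us per upload")
```

Under those assumptions the full image is 4 GiB and a complete re-upload takes on the order of half a second even at theoretical peak, which is why on-the-fly updates at this size would have to touch small regions, and why avoiding a bus copy altogether looks attractive.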



            • #16
              According to Anandtech, pinned memory is bad (one extra memcpy), and with Trinity (IOMMUv2) it is no longer needed:

              IOMMU v2 is also supported by Trinity, giving supported discrete GPUs (e.g. Tahiti) access to the CPU's virtual memory. In Llano, you used to take data from disk, copy it to memory, then copy it from the CPU's address space to pinned memory that's accessible by the GPU, then the GPU gets it and brings it into its frame buffer. By having access to the CPU's virtual address space now the data goes from disk, to memory, then directly to the GPU's memory—you skip that intermediate mem to mem copy. Eventually we'll get to the point where there's truly one unified address space, but steps like these are what will get us there.
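
The two data paths in that quote can be laid out side by side. This is only a toy model: the stage names are paraphrased from the quote, and `hops` and `count_copies` are made-up helpers, not any real driver or API call.

```python
# Toy model of the two data paths described in the quote.
# Stage names paraphrase the quote; nothing here talks to real hardware.

LLANO_PATH = ["disk", "system memory", "pinned memory", "GPU frame buffer"]
TRINITY_PATH = ["disk", "system memory", "GPU memory"]


def hops(path):
    """Return the (source, destination) pairs along a path."""
    return list(zip(path, path[1:]))


def count_copies(path):
    """Number of transfers needed to move data along the path."""
    return len(hops(path))


if __name__ == "__main__":
    for name, path in (("Llano", LLANO_PATH),
                       ("Trinity + IOMMUv2", TRINITY_PATH)):
        print(f"{name}: {count_copies(path)} transfers: " + " -> ".join(path))
```

The only difference between the two lists is the intermediate memory-to-memory hop into pinned memory, which is the copy the quote says Trinity skips.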

