Will H.264 VA-API / VDPAU Finally Come To Gallium3D?


  • #31
    Originally posted by yesterday
    How the hell is MPEG-2 "free and open"? It is patented and licensed by MPEG-LA.

    MPEG-1 is basically MPEG-2 with lower resolution. It doesn't need hardware acceleration anyway.
    By the time WebM/H.264 acceleration is complete, these codecs won't need acceleration either. The future looks bright, indeed!



    • #32
      Originally posted by popper
      Did you put your current alpha/beta prototype code on GitHub, or do you intend to soon? And when do you expect your thesis will be done?

      There are all those other bits and pieces of OpenCL/CUDA code mentioned on the x264dev logs etc. too. It's not clear whether they cover other stuff besides your "I've managed to convert Subpixel Prediction (sixtap + bilinear), IDCT, Dequantization, and the VP8 Loop filter" routines yet, as no one has bothered to collect them up and list the GitHub etc. locations on a web page somewhere.
      I've been pushing to GitHub since I started work in late October:
      http://github.com/awatry/libvpx.opencl

      The bound copy of my thesis is due in 3 weeks, final draft 3/31 or 4/1, don't remember which.

      I've only had Nvidia hardware to test on since my Radeon 4770 doesn't support the byte_addressable_store extension (5000-series and up only), but it runs on my GF9400m and a GTX 480 in current Ubuntu just fine. It also works fine on AMD Stream CPU-based OpenCL. I've gotten it working in Mac OS using CPU CL, but there's a bug in the Mac GPU-based acceleration that kills it every time and I haven't had time to track it down yet.
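The extension gate described above can be sketched in a few lines. OpenCL reports a device's extensions as a single space-separated string (queried via clGetDeviceInfo with CL_DEVICE_EXTENSIONS), so checking for cl_khr_byte_addressable_store is just a token lookup. This is a hypothetical helper for illustration, not code from the thesis repository, and the example strings are made up rather than queried from real devices:

```python
def has_extension(extensions_string, name):
    """Check for an extension in a CL_DEVICE_EXTENSIONS-style string.

    OpenCL reports extensions as one space-separated string; a plain
    substring test can false-positive on prefixes, so split into
    tokens first and look for an exact match.
    """
    return name in extensions_string.split()

# Illustrative extension strings (not queried from live devices):
older_gpu = "cl_khr_icd cl_khr_global_int32_base_atomics"
newer_gpu = "cl_khr_icd cl_khr_byte_addressable_store cl_khr_fp64"

print(has_extension(older_gpu, "cl_khr_byte_addressable_store"))  # False
print(has_extension(newer_gpu, "cl_khr_byte_addressable_store"))  # True
```

Splitting on whitespace matters: a substring test would wrongly match a longer extension name that merely starts with the one being checked.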

      Like I said, I'm hoping to keep working on this after graduation, either as a hobby, or professionally if someone's willing to pay. I've gotten the OpenCL initialization framework in place, have all of the memory management taken care of, and have most of the major parts of the decoding available as CL kernels.

      The next step that needs to be done is increasing the parallelism, as I'm currently capping out at 336 threads max, and the common case is only a few dozen threads, not enough to even approach performance parity with the CPU-only paths. I've figured out a few ways to do that, especially in the loop filter (which accounts for 50% or so of the CPU-only execution time on a few of the 1080p videos I've profiled). The sub-pixel prediction/motion compensation and Dequantization/IDCT will take a bit more work to thread effectively, but I think it can be done.
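One common way to raise the thread count in a loop filter is wavefront scheduling; this is an assumption on my part, not necessarily the approach the poster has in mind. If filtering a macroblock depends only on its left and top neighbours, then every block on the same anti-diagonal (x + y = d) is independent, so each diagonal can be dispatched as one parallel batch instead of a handful of threads:

```python
def wavefronts(cols, rows):
    """Group a cols x rows macroblock grid into anti-diagonal wavefronts.

    With left/top dependencies only, all blocks where x + y == d are
    mutually independent, so each returned list is one parallel batch.
    """
    fronts = []
    for d in range(cols + rows - 1):
        fronts.append([(x, d - x)
                       for x in range(max(0, d - rows + 1), min(cols, d + 1))])
    return fronts

# A 1080p frame is 120 x 68 macroblocks of 16x16 pixels; the widest
# wavefront holds min(120, 68) = 68 blocks runnable at once.
fronts = wavefronts(120, 68)
print(max(len(f) for f in fronts))  # 68
```

The widest batches run 68 blocks concurrently, and filtering each block edge can itself be split across multiple work-items, multiplying the available parallelism further.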



      • #33
        Originally posted by pingufunkybeat
        Now we need Clover
        Why do you think I've been pushing the people who've been posting GSoC proposals in the forums here to try to finish off the Clover state tracker?

        I'm sick of using the binary Nvidia drivers on my desktop/laptop, and I'd love to be able to switch back to the OSS drivers.



        • #34
          If anyone is interested, or would pick up this GSoC project, I do have some very early vaapi state_tracker code. I've just got more important things to do, so I haven't touched it for a while. But whoever does the GSoC project could get it if he/she wants it.



          • #35
            Originally posted by tball
            If anyone is interested, or would pick up this GSoC project, I do have some very early vaapi state_tracker code. I've just got more important things to do, so I haven't touched it for a while. But whoever does the GSoC project could get it if he/she wants it.
            If you're going to offer this or any other code, it's always a good idea to put the direct GitHub URL in your post somewhere, and to put it on GitHub if you haven't already done so.

            Then someone might reference it and encourage uptake, and of course there's always an off-site backup if you lose your local hard drive with all that work on it.



            • #36
              Originally posted by popper
              ... and of course there's always an off-site backup if you lose your local hard drive with all that work on it.
              This... I feel a bit more comfortable knowing that I have a minimum of 7 identical copies of my thesis code spread across at least 5 physical locations.



              • #37
                Originally posted by Veerappan
                This... I feel a bit more comfortable knowing that I have a minimum of 7 identical copies of my thesis code spread across at least 5 physical locations.
                LOL, I thought you might.

                By the way, although it's of no direct use for the gfx code side, I noticed on one of Jason Garrett-Glaser's latest FFmpeg VP8 optimization patches that Diego Elio Pettenò (flameeyes) mentioned the pahole utility from acmel's dwarves, which is designed to find the cacheline boundaries in structures. I don't know if it's any good for the CPU side, but it's worth mentioning anyway, just in case.

                http://ffmpeg.org/pipermail/ffmpeg-d...ch/109377.html
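What pahole reports can be illustrated without DWARF debug info: compilers insert padding ("holes") to satisfy each member's alignment, and reordering members by descending alignment removes them. A minimal sketch using Python's ctypes, which follows the platform's native C struct layout rules; the struct names are invented for illustration:

```python
import ctypes

# pahole inspects DWARF debug info to show alignment holes that the
# compiler inserts; ctypes uses the same native layout rules, so the
# effect can be demonstrated without compiling anything.
class Unordered(ctypes.Structure):
    _fields_ = [("flag", ctypes.c_char),   # 1 byte + 3 bytes of padding
                ("count", ctypes.c_int),   # 4 bytes, must be 4-aligned
                ("done", ctypes.c_char)]   # 1 byte + 3 bytes tail padding

class Reordered(ctypes.Structure):
    _fields_ = [("count", ctypes.c_int),   # widest/most-aligned member first
                ("flag", ctypes.c_char),
                ("done", ctypes.c_char)]   # only 2 bytes tail padding remain

print(ctypes.sizeof(Unordered))  # 12 on typical platforms
print(ctypes.sizeof(Reordered))  # 8 on typical platforms
```

Packing hot structures this way also helps them span fewer cache lines, which is the angle that matters for a CPU decoder's inner loops.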



                • #38
                  Originally posted by popper
                  LOL, I thought you might.
                  I had a co-worker lose a drive last summer that wasn't backed up, and realized that other than a periodic external drive backup of my Mac and Windows partitions, I didn't have much of a system in place.

                  So now my desktop is running hardware RAID 1 with git checkouts in both Linux and Windows partitions, and my laptop has git checkouts of my stuff on all 3 of its operating systems (Win7, Mac, Linux). Both laptop and desktop are periodically backed up to external drives (separate drives for each system). Eventually, I'll probably store those drives in my desk at work, but for now they're on a shelf.

                  I've got a co-located server in another state, the github master repository, and a checkout on my work computer. My HTPC has a copy as well (also RAID 1), just to provide another machine to test on.

                  I know it's excessive, but I really don't want to try to use the "hard drive ate my homework" excuse. I knew people in undergrad who used that one, and it sounded lame even then.

                  As far as the cache-line software goes, it could come in handy for profiling the CPU decoder. The reference VP8 decoder does force alignment to certain boundaries on many of its structures, but I haven't seen any work on cache line boundary detection (it may have happened, I just haven't seen it).
