AMD's UVD2-based XvBA Finally Does Something On Linux


  • Hello,

    My hardware setup:
    P4 2.6 GHz HT (s478)
    ATI HD3450 AGP
    1 GB RAM (DDR1)

    I'm trying to play H264 movies with this setup, up to 1080p. The CPU simply isn't fast enough for this, so for the last three days I've been trying to make use of GPU acceleration.

    I was pretty green in this area, so I started by reading up. From what I've read, my card has UVD (not UVD2), but it works OK for some people.

    For example I found this solution:
    http://oscarbg.blogspot.com/2009/11/...i-backend.html

    So at this point I knew I had to use xvba + vaapi and some player that can use it, preferably good old mplayer.

    I compiled mplayer with vaapi support plus all the required stuff. When I run vainfo, it gives me this output:

    Code:
    libva: libva version 0.31.1-sds1
    Xlib:  extension "XFree86-DRI" missing on display ":0.0".
    libva: va_getDriverName() returns 0
    libva: Trying to open /usr/lib/va/drivers/fglrx_drv_video.so
    libva: va_openDriver() returns 0
    vainfo: VA API version: 0.31
    vainfo: Driver version: Splitted-Desktop Systems XvBA backend for VA API - 0.7.4
    vainfo: Supported profile and entrypoints
          VAProfileMPEG2Simple            : VAEntrypointIDCT
          VAProfileMPEG2Main              : VAEntrypointIDCT
          VAProfileH264High               : VAEntrypointVLD
          VAProfileVC1Advanced            : VAEntrypointVLD
    Looks ok to me.

    So I tried to run some H264 video files with compiled mplayer:

    mplayer -vo vaapi:gl -va vaapi video.m2ts
    mplayer -vo vaapi -va vaapi video.m2ts

    I tried all of this using Ubuntu 9.04, 9.10 and 10.04 plus newest Catalyst Drivers (10.9).

    The result?

    On Ubuntu 9.04 and 9.10, mplayer crashed the whole system after 1-2 seconds; I wasn't even able to see the picture (black screen only).

    On Ubuntu 10.04, mplayer ran for 1-2 seconds and then froze with a black screen. The system doesn't crash right away, so I'm able to kill mplayer, but it eventually crashes anyway.

    Of course mplayer gives me some output - I can see VA-API Acceleration is being used.

    When it's not used mplayer doesn't crash and movies are played normally (up to 720p).

    I've tried to read this topic (well, it's got 93 pages, and I've read about 50% so far) and googled for other answers, but no luck. Since I've already lost 3 days, for now I'd only like to know one thing: am I trying to do something that isn't possible with my hardware or software setup? Are all these success reports false, or is it just my bad luck? Maybe I have to use other versions of the software? Could anyone post the versions that work together? Perhaps an older Catalyst? :/

    I also tried to compile vlc for use with the --ffmpeg-hw option (./configure --enable-libva --without-kde-solid), but I got errors during ./compile:

    Code:
    ERROR   : avio.c: 55:  expected specifier-qualifier-list before 'URLContext'
    avio.c: In function 'OpenAvio':
    ERROR   : avio.c: 78:  implicit declaration of function 'av_register_all'
    ERROR   : avio.c: 96:  implicit declaration of function 'url_open'
    (..much more here, all with avio.c..)




      • Originally posted by gbeauche View Post
        Underlying XvBA context. You can't create an XvBA surface without a context. But this is an implementation detail you don't need, neither care, to know about.
        Ok...


        Originally posted by gbeauche View Post
        In summary, the files you provided exhibit the problem without uncommenting anything, i.e. as is included in the sources?
        Yes.


        Originally posted by gbeauche View Post
        On which GPU?
        RV620 LE [Radeon HD 3450]


        Originally posted by gbeauche View Post
        I am sorry but I get a crash instead:

        0x08049868 in av_release_buffer (avctx=0x8071430, pic=0x808c870) at va.c:326
        326 if(ctxt->surfaces[i].id == sid)
        (gdb) bt
        #0 0x08049868 in av_release_buffer (avctx=0x8071430, pic=0x808c870) at va.c:326
        #1 0x004b5e7f in MPV_common_end () from /usr/lib/i686/cmov/libavcodec.so.52
        #2 0xbffff8b8 in ?? ()
        Oh, sorry. Before posting, I removed some obsolete code, moved some parts into functions, and introduced errors in the process...

        In line 1468 (and 1496):
        Code:
        				free_vo(&ctxt);
        				destroy_surfaces(&ctxt);
        				close_avctxt(&ctxt);
        must be:
        Code:
        				close_avctxt(&ctxt);
        				destroy_surfaces(&ctxt);
        				free_vo(&ctxt);
        and line 945:
        Code:
        			int64_t p = ftell(f);
        			ret = ftell(f);
        			fseek(f, p, SEEK_SET);
        			break;
        the correct code is:

        Code:
        			int64_t p0 = ftell(f);
        			fseek(f, 0, SEEK_END);
        			ret = ftell(f);
        			fseek(f, p0, SEEK_SET);
        			break;
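For illustration only, the corrected seek logic can be written as a small standalone helper. This is a minimal sketch, not code from the test program: the name `stream_size` is hypothetical, and it uses `long` because that is what `ftell` returns (for files larger than `long` can hold, a platform-specific call such as POSIX `ftello` would be needed).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: return the size in bytes of an open stream,
 * or -1 on error. The current file position is saved first and
 * restored before returning, as in the corrected snippet. */
static long stream_size(FILE *f)
{
    assert(f != NULL);

    long pos = ftell(f);               /* remember where we are */
    if (pos < 0)
        return -1;

    if (fseek(f, 0, SEEK_END) != 0)    /* jump to the end of the file */
        return -1;

    long size = ftell(f);              /* offset of the end == size */

    if (fseek(f, pos, SEEK_SET) != 0)  /* go back to the saved position */
        return -1;

    return size;
}
```

The broken version skipped the `fseek(f, 0, SEEK_END)` step, so it reported the current position instead of the file size.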
        To provoke an error sooner, you can increase the number of surfaces in line 453 (mplayer uses a hardcoded 21 for h264).

        In line 1472, I inserted "destroy_wnd" and "init_wnd" yesterday. I have not yet extensively tested the difference with and without them, but the time until errors appear seems to have changed.
        So I also tested with only vaTerminate (and vaInitialize), but that made no difference.
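The teardown order fix for lines 1468 and 1496 can be illustrated with a self-contained sketch. Everything below is a hypothetical stand-in, not the real test program: `Ctx`, `ctx_init`, `close_codec` and `ctx_teardown` are made-up names, and the plain `int` array only stands in for the VA surfaces. The assumption, suggested by the backtrace (`av_release_buffer` called from `MPV_common_end`), is that closing the codec still looks at the surface array, so the surfaces must outlive it.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the program's state. */
typedef struct {
    int *surfaces;    /* stand-in for the VA surface array */
    int  n_surfaces;
    int  codec_open;
    int  vo_open;
} Ctx;

static int ctx_init(Ctx *c, int n)
{
    c->surfaces = calloc((size_t)n, sizeof *c->surfaces);
    if (c->surfaces == NULL)
        return -1;
    c->n_surfaces = n;
    c->codec_open = 1;
    c->vo_open = 1;
    return 0;
}

/* Closing the codec releases its buffers, which walks the surfaces. */
static void close_codec(Ctx *c)
{
    assert(c->surfaces != NULL);   /* surfaces must still exist here */
    for (int i = 0; i < c->n_surfaces; i++)
        c->surfaces[i] = 0;        /* mark surface i as released */
    c->codec_open = 0;
}

static void destroy_surfaces(Ctx *c)
{
    assert(!c->codec_open);        /* nothing may reference them now */
    free(c->surfaces);
    c->surfaces = NULL;
    c->n_surfaces = 0;
}

static void free_vo(Ctx *c)
{
    assert(c->surfaces == NULL);   /* output goes away last */
    c->vo_open = 0;
}

/* Same order as the corrected code: codec, then surfaces, then output. */
static void ctx_teardown(Ctx *c)
{
    close_codec(c);
    destroy_surfaces(c);
    free_vo(c);
}
```

Calling the three steps in the original (wrong) order would trip the first assertion in this sketch, which mirrors the crash at `va.c:326`.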

        Thomas



        • Originally posted by plast View Post
          My hardware setup:
          P4 2.6 GHz HT (s478)
          ATI HD3450 AGP
          1 GB RAM (DDR1)
          For the record, I've tested this setup under Win XP using MPC-HC and DXVA - working good, no hardware issue, CPU load ~ 1-10% even with 1080p.



          • Originally posted by tbshl-vdr View Post
            To provoke an error sooner, you can increase the number of surfaces in line 453 (mplayer uses a hardcoded 21 for h264).
            I still don't see any VA/XvBA surface leak. Rather, without even pressing "r", I see the RSS increasing. I used valgrind but got nothing, so I assume the leak occurs in the driver, through some other memory allocator.

            With the following video:
            http://www.splitted-desktop.com/~gbe...-%20Teaser.mp4
            memory usage remains constant and below 85 KB.

            BTW, you said this also occurred with SW decoding?

            In line 1472, I inserted "destroy_wnd" and "init_wnd" yesterday. I have not yet extensively tested the difference with and without them, but the time until errors appear seems to have changed.
            So I also tested with only vaTerminate (and vaInitialize), but that made no difference.
            When doing vaInitialize()/vaTerminate() several times, I see the program crashing in the end in ADL (AMD Display Library). Generally, you only need to call vaInitialize() / vaTerminate() once.



            • Hm, actually I do *not* get errors now (when using "destroy_wnd" and "init_wnd"), maybe because I rebooted (and powered off) the machine.
              (With mplayer, memory usage definitely increases a lot.)

              Anyway, with every ("r"-)restart, memory usage gets higher, although only by about 0.1% (also during normal playback, but there it seems to stabilize).

              But even if there are no errors with "destroy_wnd" and "init_wnd", there may still be a problem.
              I'm trying to implement this in xine-lib, for use with vdr. Every channel switch means recreating the surfaces, but window creation is not under xine-lib's control (maybe there is a way to do this, but I don't think it would be a good one).

              Originally posted by gbeauche View Post
              BTW, you said this also occurred with SW decoding?
              Yes, also with SW decoding.

              I updated source archive: http://www.vdr-portal.de/board/threa...496#post943496

              Thomas



              • xvba-video 0.7.5

                Hi,

                A new version of xvba-video, the XvBA backend to VA-API, is now available at:
                http://www.splitted-desktop.com/~gbe...ne/xvba-video/

                Version 0.7.5 - 05.Oct.2010
                * Add support for GL_TEXTURE_RECTANGLE_ARB textures
                * Add workaround for GLX rendering on Evergreen chips
                * Add vaPutSurface() low-quality scaling flag (VA_FILTER_SCALING_FAST)

                The second change works around XvBA/fglrx rendering bugs on Evergreen chips. This consumes extra memory and processing power. Note that behaviour probably differs depending on the Evergreen chip and/or driver version... Please tell me if that is the case, thanks.



                • It does not work for me with 10-9 driver and HD 5670.



                  • Originally posted by Kano View Post
                    It does not work for me with 10-9 driver and HD 5670.
                    Some things I want beyond the obvious:

                    $ dmesg | grep fglrx
                    (driver version)

                    $ lspci -xvv |grep -A 16 "VGA "
                    (GPU PCI id)

                    $ glxinfo
                    (GLX extensions, among others)

                    $ vaapi_vc1 --size 320x240
                    (a capture of the window only, so the resulting .png file should be of the same size). VC-1 sample looks simpler because I can see a pattern more easily...

                    Thanks.



                    • Originally posted by Kano View Post
                      It does not work for me with 10-9 driver and HD 5670.
                      I get the exact same problem you're seeing... Anyway, here's some of the program output:
                      Code:
                      libva: libva version 0.31.1
                      libva: va_getDriverName() returns 0
                      libva: Trying to open /usr/lib/dri/fglrx_drv_video.so
                      libva: va_openDriver() returns -1
                      libva: libva version 0.31.1
                      libva: va_getDriverName() returns 0
                      libva: Trying to open /usr/lib/dri/fglrx_drv_video.so
                      libva: va_openDriver() returns -1
                      system spec:
                      ubuntu 10.04 (amd64) with 2.6.36-rc6 kernel and patch to fglrx.
                      Catalyst 10.9.
                      LibVA 1.0.4-1 (from debian)
                      Radeon hd 5770 video card.
                      1920x1200 display
                      1280x1024 display (secondary)

                      also, here's some of the obvious you said ya didn't want:
                      Code:
                      01:00.0 VGA compatible controller: ATI Technologies Inc Juniper [Radeon HD 5700 Series]
                      	Subsystem: XFX Pine Group Inc. Device 2990
                      	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
                      	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
                      	Latency: 0, Cache Line Size: 64 bytes
                      	Interrupt: pin A routed to IRQ 42
                      	Region 0: Memory at d0000000 (64-bit, prefetchable) [size=256M]
                      	Region 2: Memory at fdcc0000 (64-bit, non-prefetchable) [size=128K]
                      	Region 4: I/O ports at ce00 [size=256]
                      	[virtual] Expansion ROM at fdc00000 [disabled] [size=128K]
                      	Capabilities: [50] Power Management version 3
                      		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                      		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
                      	Capabilities: [58] Express (v2) Legacy Endpoint, MSI 00
                      		DevCap:	MaxPayload 256 bytes, PhantFunc 0, Latency L0s <4us, L1 unlimited
                      			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                      		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                      Code:
                      [   10.584129] fglrx: module license 'Proprietary. (C) 2002 - ATI Technologies, Starnberg, GERMANY' taints kernel.
                      [   10.612597] [fglrx] Maximum main memory to use for locked dma buffers: 3672 MBytes.
                      [   10.612656] [fglrx]   vendor: 1002 device: 68b8 count: 1
                      [   10.612987] [fglrx] ioport: bar 4, base 0xce00, size: 0x100
                      [   10.613300] [fglrx] Kernel PAT support is enabled
                      [   10.613317] [fglrx] module loaded - fglrx 8.77.5 [Aug 25 2010] with 1 minors
                      [   17.767190] fglrx_pci 0000:01:00.0: irq 42 for MSI/MSI-X
                      [   17.767766] [fglrx] Firegl kernel thread PID: 1400
                      [   17.768033] [fglrx] IRQ 42 Enabled
                      [   18.518153] [fglrx] Gart USWC size:1200 M.
                      [   18.518156] [fglrx] Gart cacheable size:475 M.
                      [   18.518160] [fglrx] Reserved FB block: Shared offset:0, size:1000000 
                      [   18.518162] [fglrx] Reserved FB block: Unshared offset:f93f000, size:3c1000 
                      [   18.518164] [fglrx] Reserved FB block: Unshared offset:3fff4000, size:c000
