
RV600, OpenCL, ffmpeg and blender


  • danboid
    started a topic RV600, OpenCL, ffmpeg and blender



    My head is now close to bursting with API acronym overload, so I was hoping I could just describe to you guys what I want to see my Linux box do and ask how far off we are from seeing this happen.

    My laptop has a Mobility Radeon HD 2400 so of course I was very excited to see the recent open X driver code drop as well as the announcement of OpenCL as I'd really like to see ffmpeg and/or mencoder being able to harness my GPU to greatly accelerate video encoding and rendering.

    I understand the drivers are dev-only at the moment and we'll need to wait until at least the next big xorg release before mortals can get open source RV600 accel without rolling their own and hoping for the best, but how does OpenCL fit into this? I presume that OpenCL can work independently of X, seeing as it isn't just for accelerating graphics, so am I right in thinking that first someone needs to write an OpenCL driver for RV600? Is this already being worked on?

    Then of course someone needs to update ffmpeg so it can take advantage of OpenCL. Has this work already begun, or are there no OpenCL drivers finished yet?

    Then what about Blender? Is Blender just going to be running straight on top of Gallium, or Gallium and OpenCL, or?? I know Gallium isn't finished yet, so I would imagine no work has been done on Blender yet to get it playing nice with Gallium, right?
    Last edited by danboid; 01-02-2009, 12:46 PM.

  • Pfanne
    with all that parallel computing going on, is there any way to tell which process is eating how much of your gpu power?
    will it be possible to use a tool like top to read how much gpu time each process is taking up, and possibly limit it?
    that would be pretty interesting and useful


  • mtippett
    I thought I'd drop in on this thread to clarify a few things.

    From the Khronos web site:

    "OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. OpenCL provides a uniform programming environment for software developers to write efficient, portable code for high-performance compute servers, desktop computer systems and handheld devices using a diverse mix of multi-core CPUs, GPUs, Cell-type architectures and other parallel processors such as DSPs."
    The fact that OpenCL came out of the GPGPU world is no longer the real point. Each multi-processing component will have its particular OpenCL workload that it does best. General purpose CPUs are very good at algorithms that are partially, but not completely, parallelizable; GPUs are great with massively and easily parallel problem sets.
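    One rough way to put numbers on the "partially vs. massively parallel" distinction is Amdahl's law, which is not named in the post and is brought in here only as an illustration. The fractions and worker counts below are made up for the example and do not describe any particular CPU or GPU.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Ideal speedup when only `parallel_fraction` of a job can use `workers` units."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part runs at full cost; the parallel part is divided among workers.
    return 1.0 / (serial_fraction + parallel_fraction / workers)


if __name__ == "__main__":
    # Illustrative values only: a "few fat cores" device vs. a "many thin cores" device.
    for label, fraction in (("partially parallel", 0.50), ("massively parallel", 0.99)):
        for workers in (4, 256):
            speedup = amdahl_speedup(fraction, workers)
            print(f"{label:20s} workers={workers:3d} speedup={speedup:.2f}")
```

    Under this simplified model, a job that is only half parallel gains little from hundreds of workers, which is one way to read the point that such work tends to suit a handful of fast CPU cores.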

    CPUs are now beginning to accelerate into the multi-core space very quickly, so utilizing those cores is becoming critical as well.

    Looking forward, we will have *many*, *many* cores at many different locations in the system. The truly complete SW architecture will seamlessly and transparently migrate to the most performant multi-core environment for its problem domain. This is generally known as heterogeneous computing.

    For this to be possible, you need to abstract the assembly language from the compute language - this is what OpenCL does.
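    A toy sketch of that separation, in plain Python rather than OpenCL: the per-element "kernel" is written once with no knowledge of where it runs, and interchangeable backends decide how to execute it. All names here (`saxpy_kernel`, `run_serial`, `run_thread_pool`, `BACKENDS`) are hypothetical and only stand in for the idea; they are not part of any OpenCL API.

```python
from concurrent.futures import ThreadPoolExecutor


def saxpy_kernel(i, a, x, y):
    """Work for a single index, written without reference to any device."""
    return a * x[i] + y[i]


def run_serial(kernel, n, *args):
    """Backend 1: run every work item in order on one core."""
    return [kernel(i, *args) for i in range(n)]


def run_thread_pool(kernel, n, *args, workers=4):
    """Backend 2: hand the same work items to a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda i: kernel(i, *args), range(n)))


# The caller picks a backend; the kernel itself never changes.
BACKENDS = {"single-core": run_serial, "multi-core": run_thread_pool}


def run_everywhere(kernel, n, *args):
    """Run one kernel on every registered backend and collect the results."""
    return {name: backend(kernel, n, *args) for name, backend in BACKENDS.items()}


if __name__ == "__main__":
    results = run_everywhere(saxpy_kernel, 3, 2.0, [1.0, 2.0, 3.0], [10.0, 20.0, 30.0])
    for name, values in results.items():
        print(name, values)
```

    Python threads will not speed up this arithmetic; the sketch is about structure only. A real OpenCL runtime plays the role of the backends by compiling the kernel for whichever device is selected.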

    Ironically, you probably need the traditional "compute" domain to begin to migrate to OpenCL on CPUs before you get a large uptake in the GPU space. The ISV ecosystem of OpenCL-enabled applications will be critical to the long term compute space (remember that Microsoft also has a compute architecture with DirectCompute).

    Doing compute on a CPU is a lot easier and a lot less long term maintenance than on the GPU. The underlying ISA doesn't change very often for a CPU, but changes on a GPU very quickly.

    Changing topic back to the earlier parts of this thread, the OpenCL client will be a peer to the OpenGL and Video client drivers. In the context of Open Source, it makes most sense for it to be part of Mesa (and hence on Gallium3D). To achieve the same as AMD's CPU-based OpenCL, you just need Gallium3D's OpenCL state tracker + softpipe. Assuming that softpipe is multi-processor aware, you have the coarse-grained compute architecture already there.




  • popper
    a functional Python wrapper around OpenCL now exists ....

    good for your initial GPU scripting tests and quick CLI/GUI apps you might finally try and make for the general good of all...
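    For a feel of what such a script looks like, here is a minimal vector-add sketch. It assumes the wrapper in question is PyOpenCL (the post does not name it) and uses that package's present-day calls, which may differ from the 2009 release. It also assumes `numpy`, `pyopencl` and a working OpenCL runtime are installed; `add_reference` is a hypothetical helper included only so the result can be checked without a device.

```python
# OpenCL C source for the kernel; each work item adds one pair of elements.
KERNEL_SRC = """
__kernel void add(__global const float *a,
                  __global const float *b,
                  __global float *out)
{
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
"""


def add_reference(a, b):
    """Plain-Python version of the kernel, used to check the device result."""
    return [x + y for x, y in zip(a, b)]


def add_opencl(a, b):
    """Run the kernel on whatever OpenCL device the runtime offers (assumes PyOpenCL)."""
    import numpy as np
    import pyopencl as cl

    a_np = np.asarray(a, dtype=np.float32)
    b_np = np.asarray(b, dtype=np.float32)

    ctx = cl.create_some_context(interactive=False)  # picks a device without prompting
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    # Copy the inputs to device buffers and allocate space for the output.
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_np)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_np)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a_np.nbytes)

    program = cl.Program(ctx, KERNEL_SRC).build()
    program.add(queue, a_np.shape, None, a_buf, b_buf, out_buf)

    out_np = np.empty_like(a_np)
    cl.enqueue_copy(queue, out_np, out_buf)  # read the result back to the host
    return out_np.tolist()


if __name__ == "__main__":
    a, b = [1.0, 2.0, 3.0], [10.0, 20.0, 30.0]
    print("reference:", add_reference(a, b))
    try:
        print("opencl:   ", add_opencl(a, b))
    except ImportError:
        print("pyopencl/numpy not installed; skipped the device run")
```

    Whether the kernel lands on a GPU or on CPU cores depends entirely on which OpenCL driver the runtime exposes.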

    but it seems someone inside ATI/AMD has seen fit to make sure their OpenCL drivers DO NOT even try to make use of the ATI GPU(s) you paid for, Doh! given all the hype around OpenCL and the emphasis on the 'good and great' GPU speed it provides you...

    apparently the current 9.8? drivers ONLY try to make use of the multiple CPU cores you might have inside your PC, and even then it's got to be SSE 3.x or later, so there's no use putting an OpenCL-capable ATI GPU in that old AMD machine and expecting CL speeds for your CL scripted apps.
    you can of course put a cheap and cheerful NV OpenCL and/or CUDA GPU in there and use either of his Py CUDA/OpenCL wrappers.

    and that's a shame, given the world's ATI users NEED something good for once in GPU general (video) processing right now..

    what are these ATI devs and AMD executives thinking!
    after all this time, finally releasing a non-GPU OpenCL driver first!!!!! when their UVD-enabled but unused gfx chips need a boost against NV's CUDA ASIC.... and the 3rd parties' code drops for it SO badly.

    if you're not coding for it you're not defining it, seems apt here.

    but nonetheless this Python OpenCL wrapper still exists, so why not give it a try and see what interesting things and quick apps you can come up with.
    Last edited by popper; 09-08-2009, 01:17 AM.


  • lamikr
    While you check the 780G, I hope you can also verify the 790 at the same time. I myself have 2 x 780G and 1 x 790 motherboards, chosen for their potential speed combined with low power consumption.



  • popper
    if anyone inside or outside ATI/AMD is reviewing the HW assisted APIs, it might be fun to make sure "Dirac In MPEG-TS" is accounted for, and included for some good PR BTW.

    "Dirac In MPEG-TS
    A draft specification has been written for mapping Dirac into MPEG transport streams:

    There is currently no available MPEG2-TS muxing implementation that correctly supports Dirac video streams. Work is in progress to add this functionality to VLC, and should be available soon after VLC 0.9.0 is released. "
    Last edited by popper; 01-05-2009, 06:02 PM.


  • popper
    that implies we NEED, ASAP, a new AVIVO API subset of ATI's collective current/near-future APIs that's geared exclusively towards all things encoding/decoding and TS (Transport Stream) streaming-processing related.

    the world's broadcasters, HD BR/DVD vendors and even internet streaming are standardising on DVB-* AVC (and that oddity the USA is trying to make people use in the transition from their analogue switch-off) inside TS for digital distribution/processing, so it's crazy that we don't already have a concerted effort to put forward and get behind some workable, extendable HW-assisted video processing subset API ASAP.

    if that means a new simple plugin overlay open microkernel on gfx cards to translate between what's available now and any new API(s), plus simple wrappers to translate between old and new entry/exit points etc, so be it....

    most people don't really care how it's done, only that it is done, and that real usable progress is seen to take place in a timely manner, but perhaps that's OT for this thread.
    Last edited by popper; 01-05-2009, 05:26 PM.


  • bridgman
    Yep, XvMC is very old and was designed around MPEG2. There is discussion of extending it to include newer standards but I don't think there is much consensus on what the new API should be.

    I suspect the community is torn between (a) the fact that we really need a new API to replace XvMC, and (b) the fact that XvMC is already nicely integrated into the X protocol and the server so if the API *can* be made to work you can just add code to the existing driver and it's a lot less work to get something running (as opposed to setting up a new DRI client).
    Last edited by bridgman; 01-05-2009, 04:33 PM.


  • popper
    Originally posted by danboid
    Big Thanks to TechMage89 and Bridgman for elucidating the state of GPGPU support under Linux right now for me.

    It will indeed be interesting to see how ATI cards running OpenCL compare with a similar NV card running the same app ported to CUDA, and of course to compare OpenCL performance on similar-gen cards across platforms.

    Has ffmpeg been ported to CUDA yet? Are there any CUDA video encoders?

    What about opencl under Windows? Is it any further ahead?
    if you're looking for HW-assisted FFMPEG being ported to use OpenCL/OpenGL you're in for a very long wait. put simply, a working ATI OpenCL (or any other API) needs to be made to work and be available to all devs before any porting to it can take place on any OS platform.

    people need to step back and realise that if your API isn't seen to be patched, with parts of it showing up on the likes of the FFMPEG devs' lists from active users/devs, it doesn't to all intents and purposes exist today.

    patches for the binary "VDPAU" library (NVIDIA's hardware video decode API, capable of MPEG2, AVC/H.264, VC1 etc.) have just been posted to the FFMPEG list:

    Mon Jan 5 00:56:41 CET 2009
    "[FFmpeg-devel] [PATCH]VDPAU patch for h264 decoding, round 6"

    he's also submitted a second patch: "Attached patch is a first version of the patch to support MPEG1 and MPEG2 hardware decoding with VDPAU."

    Mon Jan 5 18:41:13 CET 2009
    "[FFmpeg-devel] [PATCH]VDPAU patch for MPEG1/2 decoding, round 1"

    it appears that XvMC exists in some form in the FFMPEG codebase, but I've not seen or heard of anyone working with it as yet, never mind using it in MPlayer/VLC and other end-user apps or advocating its use today. no patches or code review seem to be taking place in the Nov, Dec, Jan mailing list threads that I can see.... and as already said, if you're not seen to be making progress and patching at least subsets of the whole end-game API, you don't seem to exist....

    is it right that the XvMC API only has entry points for MPEG2 decoding? if so, that implies it's broken and needs fixing and extending to include easy processing of AVC (lossless), VC1, and Dirac lossless inside TS streams at the very least.
    Last edited by popper; 01-05-2009, 04:12 PM.


  • Kjella
    Originally posted by danboid
    I understand the drivers are dev-only at the moment and we'll need to wait until at least the next big xorg release before mortals can get open source RV600 accel without rolling their own and hoping for the best, but how does OpenCL fit into this? (...) Then of course someone needs to update ffmpeg (...) Then what about Blender? Is Blender just going to be running straight on top of Gallium, or Gallium and OpenCL, or??
    There's so much happening in all parts of the stack that it's giving me a headache, but here's how I've understood how it's supposed to be laid out:

    Today the Mesa driver is huge - it's pretty much the "kitchen sink" of all things graphics. From what I've understood it's being changed so:

    DRM (kernel) / DRI2 (xorg) handles direct rendering, but now relies on the following components:

    1) Generic GEM memory management in kernel
    2) Generic KMS mode setting in kernel

    Then for 3D, what is today the Mesa driver is split off into:

    1) Hardware-specific Gallium3D driver that exposes the lowest level of 3D functionality like universal shaders.
    2) Generic state tracker that takes higher-level languages like OpenGL, OpenCL, DirectX (possibly) and translates these into Gallium3D calls.
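    The two-layer split above can be sketched in a few lines of Python. This is only a model of the layering as described in this post: every class and method name (`PipeDriver`, `SoftwareDriver`, `ComputeTracker`, `GraphicsTracker`, `run_shader`) is hypothetical and none of it is the real Gallium3D interface.

```python
class PipeDriver:
    """Stand-in for the hardware-specific layer: it only knows how to run shaders."""

    def __init__(self):
        self.calls = 0  # how many shader runs were submitted to this driver

    def run_shader(self, shader, data):
        raise NotImplementedError


class SoftwareDriver(PipeDriver):
    """Plays the role of a CPU fallback driver: runs the shader element by element."""

    def run_shader(self, shader, data):
        self.calls += 1
        return [shader(value) for value in data]


class StateTracker:
    """Stand-in for the generic layer: translates one API into driver calls."""

    def __init__(self, driver):
        self.driver = driver


class ComputeTracker(StateTracker):
    """Plays the role of an OpenCL-style front end."""

    def enqueue(self, kernel, buffer):
        return self.driver.run_shader(kernel, buffer)


class GraphicsTracker(StateTracker):
    """Plays the role of an OpenGL-style front end."""

    def draw(self, vertices, scale):
        # A graphics request ends up as the same kind of shader call as compute.
        return self.driver.run_shader(lambda v: v * scale, vertices)


if __name__ == "__main__":
    driver = SoftwareDriver()
    compute = ComputeTracker(driver)
    graphics = GraphicsTracker(driver)
    print(compute.enqueue(lambda v: v + 1, [1, 2, 3]))
    print(graphics.draw([1.0, 2.0], 2.0))
    print("shader runs submitted to the driver:", driver.calls)
```

    In this model, supporting a new card means writing one more `PipeDriver` subclass, while supporting a new API means writing one more state tracker; neither side needs to know about the other's internals.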

    It is my understanding that Gallium3D is not directly usable by an application - it requires at least some light state tracker on top to keep track of objects in memory, but that you could make a fairly simple "pass-through" with native Gallium3D instructions. I think OpenCL should perform at near raw Gallium3D performance anyway, but then it has to comply with the OpenCL specification while Gallium3D is there to expose all hardware functionality.

    I would think that most applications would target OpenCL, which would become high-performance Gallium3D instructions (if it can't there's something wrong as the whole purpose of OpenCL is to run that kind of stuff and the purpose of Gallium3D to expose the hardware). Gallium3D would then in turn run this on actual shaders and pass it back up.

    Bridgman did raise an interesting point though, which I haven't seen discussed anywhere - if I run an OpenGL application and a DirectX application (there's been talk of porting WINE's DirectX emulation to Gallium3D) or some other combination something has to make sure different state trackers don't use the same shaders. It would be too silly if only one could run at a time, it'd be a little bit like the old sound server problem, only one could grab the output at a time.

    Also note that AMD really only needs to release enough specs to implement the hardware-specific bits, but that leaves a huge, huge job to the community in implementing accelerated 3D state tracker(s) for OpenGL, OpenCL, DirectX/WINE and so on.

    Also, I left out quite a few simpler accelerations, like 2D acceleration and textured video, that probably should, in time, be replaced with Gallium3D converters - modern cards don't have any separate 2D engine.

    Apart from everything else, there's also the question of hardware accelerated video, which could be done as generic OpenCL/Gallium3D instructions or by exposing custom hardware, which would go outside everything I've talked about here.

    In short, lots of things are happening, but it's probably a few years until this is all done; it's basically rewriting most of the X stack top to bottom.
