There May Still Be Hope For R600g Supporting XvMC, VDPAU


  • agd5f
    replied
    Originally posted by efiniti View Post
    Any links/updates on this or XvMC support? Just wondering.
    In theory g3dvl should work with any gallium driver (r300g or r600g, or nouveau for that matter).



  • efiniti
    replied
    Originally posted by 69acid69 View Post
    Well, and what about VDPAU for r300g?
    Any links/updates on this or XvMC support? Just wondering.



  • popper
    replied
    Originally posted by jrch2k8 View Post
    Well, I really want to join the x264 and ffmpeg lists, but it is really hard to kill two birds at the same time in practice. For now my focus is not video decoding, as weird as that sounds ...
    Oh, talking about being too green: I meant to mention that you should make sure to read the massive log http://akuvian.org/src/x264/freenode-x264dev.log.bz2 for the last few days and see what I mean about x264 not caring how green a person might be. It was fun watching Jason (Dark Shikari, he's on the left in case you're wondering, they need some hats LOL http://tmpgenc.pegasys-inc.com/en/press/10_1125.html ) give his two-hour "how to write assembly" crash course to the new GCI 2010 students http://www.google-melange.com/gci/pr...google/gci2010 and other interested people, and actually see them write new 10-bit x264 SIMD routines from scratch in a day or so.

    So you too can ask your questions, get to understand the x264 codebase and its optimisations, etc., by actually working on these things for fun, and take that over to your other optimisation work.



  • bridgman
    replied
    Originally posted by jrch2k8 View Post
    btw, if anyone from Mesa is reading: is there a chance for future developers to get a software-only library containing the TGSI implementation, like an SDK, so you can learn and test the good stuff without having to fight with Mesa until it is necessary?
    You're basically talking about a "Gallium3D state tracker" you can build without the rest of Mesa, right? My guess is that something like that already exists as fallout from the Gallium3D development process; it might be worth asking on the mesa-dev list.



  • agd5f
    replied
    One of the hardest parts of writing a graphics driver, and also one of the most likely to impact performance, is deciding where to place buffers and when (if ever) to migrate them. There are advantages and disadvantages to system memory vs. vram. Lots of factors come into play, however (how often the CPU needs to access the buffer, how long it will be used, the number of fetches in the shader program that uses it, etc.).
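
    For illustration, here is a minimal sketch of what such a placement decision can look like; the enum, struct, fields and thresholds are hypothetical and not taken from any real driver:

        /* Hypothetical placement heuristic, purely illustrative. */
        enum buf_domain { DOMAIN_SYSTEM_RAM, DOMAIN_VRAM };

        struct buf_usage_hints {
            unsigned cpu_accesses_per_frame;   /* how often the CPU maps the buffer   */
            unsigned gpu_fetches_per_frame;    /* shader/texture fetches by the GPU   */
            unsigned expected_lifetime_frames; /* how long the buffer will stay alive */
        };

        static enum buf_domain choose_placement(const struct buf_usage_hints *h)
        {
            /* CPU-heavy or short-lived buffers: keep them in system memory so
             * CPU maps stay cheap and a migration never has to be amortised. */
            if (h->cpu_accesses_per_frame > h->gpu_fetches_per_frame ||
                h->expected_lifetime_frames < 2)
                return DOMAIN_SYSTEM_RAM;

            /* Long-lived, GPU-fetch-heavy buffers: the bandwidth win from vram
             * outweighs the one-time cost of copying them over the bus. */
            return DOMAIN_VRAM;
        }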



  • popper
    replied
    Originally posted by agd5f View Post
    You can use the GPU to copy data into or out of vram. ...

    Yeah, I was just joking about the generic pipe code, of course, just keeping it simple and pointing out that it would work, but you would be mad to use such a thing in this case...

    Nice that you pointed out these options, thanks, and I hope they become far more widely known, but it's apparently hard to find ANY fast third-party code that implements them and/or explains their real-world use so far. Time will tell, I guess, unless there are already devs reading here who are testing/using these faster options and willing to share their code!



  • jrch2k8
    replied
    Originally posted by popper View Post
    jrch2k8, I assume that you didn't actually join x264dev and ffmpeg-devel and at least leave the client capturing the feeds in a large log buffer to look at later; it's a shame if so, as they both covered some basic GPU and related ground again. Try and look at any logs for the last two days or so and you will perhaps see interesting stuff.

    more before this
    "<spellbound> far-in-out_: I've uploaded the project. http://rd.gnus.org/cuda_motion_estimation.7z
    <spellbound> far-in-out_: It's two parts. A dll that implements the ME and has dependencies on the CUDA runtime. A GUI app that exercises the DLL and has dependencies on wxwidgets and ffmpeg.
    <spellbound> The GUI app can open any video that ffmpeg can handle, run it through the ME and display the motion vectors on top of the video image for visual verification.
    <spellbound> Note though, 1920x1080 video size is hard coded in there in some places, so it will crash on test videos of any other size...." etc
    Well, I really want to join the x264 and ffmpeg lists, but it is really hard to kill two birds at the same time in practice. For now my focus is not video decoding, as weird as that sounds; let me explain a bit. So far we have a somewhat working MC code left behind by Younes Manton, which is a bit messy but proved that it works, and I have a partial TGSI iDCT code which I'm not even sure will compile yet, let alone whether it will hardlock the GPU (Gallium is still a bit hard to work with because you need to recompile Mesa every time and have another Mesa ready to step in in case something nasty happens). So for the near future I need to better understand, in TGSI terms, how the GPU does things compared to my CPU knowledge.

    Now, once I have cleaned up the MC and iDCT code and converted it to TGSI_translate (I think it is way more readable than using ureg_XXXX), and once I understand Gallium well enough, that is when I believe I will spam the x264 IRC, focused on improving the existing code and starting to code the big-boy algorithms like CABAC and VP8 deblocking, etc., in TGSI. Because let's face it, on the CPU the code exists and works pretty damn well, so the issue is taking that code, in the best way possible, to TGSI/OpenCL/GLSL/Cg/etc. to exploit the GPU's power.

    I know it is a bit of a letdown, but in my experience you work faster by understanding the base up to the point where it becomes natural to you and then focusing on the bigger part, rather than trying to learn and translate both problems at the same time.

    Of course, up to a point I'm investigating video encoding too, but my focus is to feel at home programming in TGSI and then go to the big sharks of x264 to focus on a badass GPU implementation of the algorithms and on how to transform them into GPU-effective algorithms (yeah, many of the current CPU video algorithms are suspected not to perform too well on the GPU without some rethinking, especially algorithms that require too much branching, because, well, your GPU hates IF-ish code).
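
    To make the branching point concrete, here is a tiny, purely illustrative C sketch (not taken from any of the code discussed here) of the usual trick of replacing a per-pixel if with a branch-free select, which tends to map better onto wide SIMD GPU hardware:

        #include <stdint.h>

        /* Branchy clamp: fine on a CPU, but on a wide SIMD GPU a divergent
         * "if" can force both sides of the branch to be executed. */
        static uint8_t clamp_branchy(int v)
        {
            if (v < 0)
                return 0;
            if (v > 255)
                return 255;
            return (uint8_t)v;
        }

        /* Branch-free equivalent: the conditions become masks that blend the
         * results, so every lane runs the same instruction stream. */
        static uint8_t clamp_branchfree(int v)
        {
            int lo_mask = -(v < 0);                  /* all ones when v < 0   */
            int hi_mask = -(v > 255);                /* all ones when v > 255 */
            v &= ~lo_mask;                           /* select 0 when below   */
            v = (v & ~hi_mask) | (255 & hi_mask);    /* select 255 when above */
            return (uint8_t)v;
        }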

    Of course, this doesn't mean I'm not interested in talking with them or anything like that; it's just that I believe I'm still too green to properly discuss video matters with them from a GPU standpoint. But I will join the mailing list today, to keep all that useful information in my mail, waiting until I'm ready to put 100% into video coding in TGSI ;)

    btw, if anyone from Mesa is reading: is there a chance for future developers to get a software-only library containing the TGSI implementation, like an SDK, so you can learn and test the good stuff without having to fight with Mesa until it is necessary?



  • agd5f
    replied
    Originally posted by popper View Post
    Or the fact that while it's reasonably fast to get data TO the GPU, they don't provide a super fast way to get the processed data BACK to the CPU, which would be a very good thing for all encode/decode software and related apps...
    You can use the GPU to copy data into or out of vram. In either case, the only limitation is the speed of the bus. You can map system memory pages into the GPU's address space and render to or texture from system memory directly, just like vram. See the UploadToScreen and DownloadFromScreen hooks in the ddx EXA driver code for an example. The video decode driver just has to decide where it's optimal to place its buffers or when to migrate them.
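
    As a rough illustration of the shape of such a hook, here is a sketch of a DownloadFromScreen implementation that blits from vram into a CPU-visible GART buffer with the GPU and then copies out. The hook signature follows the X.Org EXA driver interface (exa.h); struct my_bo and the my_*() helpers are hypothetical stand-ins for a driver's buffer-object API, not real functions:

        #include <string.h>
        #include "exa.h"   /* X.Org EXA driver interface */

        /* Hypothetical buffer-object helpers, for illustration only. */
        struct my_bo;
        struct my_bo *my_alloc_gart_bo(int width_bytes, int height);
        Bool my_gpu_blit_from_pixmap(PixmapPtr src, int x, int y, int w, int h,
                                     struct my_bo *dst);
        void my_wait_for_idle(struct my_bo *bo);
        char *my_map_bo(struct my_bo *bo);
        int my_bo_pitch(struct my_bo *bo);
        void my_unmap_bo(struct my_bo *bo);
        void my_free_bo(struct my_bo *bo);

        static Bool
        MyDownloadFromScreen(PixmapPtr pSrc, int x, int y, int w, int h,
                             char *dst, int dst_pitch)
        {
            struct my_bo *tmp;    /* CPU-mappable buffer in GART/system memory */
            char *map;
            int line, bpp = pSrc->drawable.bitsPerPixel / 8;

            /* Ask the GPU to blit the region out of vram into the GART buffer,
             * so the CPU never has to read vram directly. */
            tmp = my_alloc_gart_bo(w * bpp, h);
            if (!tmp)
                return FALSE;     /* let EXA fall back to its software path */
            if (!my_gpu_blit_from_pixmap(pSrc, x, y, w, h, tmp)) {
                my_free_bo(tmp);
                return FALSE;
            }

            /* Wait for the blit, then copy line by line into the destination
             * buffer the X server handed us. */
            my_wait_for_idle(tmp);
            map = my_map_bo(tmp);
            for (line = 0; line < h; line++)
                memcpy(dst + line * dst_pitch,
                       map + line * my_bo_pitch(tmp),
                       w * bpp);
            my_unmap_bo(tmp);
            my_free_bo(tmp);
            return TRUE;
        }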



  • popper
    replied
    :0)

    See, I like to give some room for the talk here to advance (I already knew that, of course); perhaps more people will use it in interesting ways if they can get feedback and advice on its use somewhere.

    What's the SATD entry point? Oops, someone wasn't actually thinking about general video use when they added that, and perhaps some very interesting CPU-like commands? But still, there's always the next Evergreen revision; it's a start, and a GOOD one. Now, where are the super fast CPU feedback channels?



  • bridgman
    replied
    Originally posted by popper View Post
    There's still the BIG problem that Intel, NV, and AMD do not include simple things like the very useful SAD/SATD and related assembly instructions and actually place them INSIDE the SIMD section of their GPUs, for instance.
    Evergreen and up have a 4x4 Sum of Absolute Differences instruction in the SIMD engine. Check the ISA guide for:

    SAD_ACCUM_HI_UINT
    SAD_ACCUM_PREV_UINT
    SAD_ACCUM_UINT

    Have fun...
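
    As a rough, purely illustrative C reference for the idea behind a packed SAD-with-accumulator instruction (check the Evergreen ISA guide for the exact semantics of SAD_ACCUM_UINT and its HI/PREV variants):

        #include <stdint.h>
        #include <stdlib.h>

        /* Each operand packs four 8-bit pixels; the result adds the four
         * |a - b| terms onto a running accumulator. Illustration only, not
         * the documented instruction behaviour. */
        static uint32_t sad4_accum(uint32_t a, uint32_t b, uint32_t acc)
        {
            for (int i = 0; i < 4; i++) {
                int pa = (a >> (8 * i)) & 0xff;
                int pb = (b >> (8 * i)) & 0xff;
                acc += (uint32_t)abs(pa - pb);
            }
            return acc;
        }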

