The GM4500MHD (G45) chipset had H.264 hardware acceleration from the word go... on Windows, with the Windows driver. That was years ago!
What the heck is taking so long for it to get implemented in the Linux driver? The only reason I bought an Intel chipset was so that I could get full Linux support in X.org/Mesa/DRM etc. before other chipsets (like AMD/nVidia) even dreamt of it. Now I'm beginning to wonder if I would have been better off with nVidia and the proprietary driver.
Tell me about it... I've spent the last 3-4 months discovering how difficult it is to parallelize video decoding (with VP8). I've got a functional OpenCL VP8 decoder, but Functional != Fast.
Originally Posted by agd5f
That would be nice if they could extract some decoding power from older chipsets, too. People with older chipsets have older CPUs, so any improvement is very valuable.
I don't think the information in this news item is entirely correct. I clearly recall that a few months ago there was a driver and test report on http://intellinuxgraphics.org stating that VA-API is supported and working on the "GMA 4500MHD". I even remember checking the device hardware IDs on Wikipedia, because 0x2A42 and 0x2A43 were mentioned there as the working ones. I haven't tested this personally, and I haven't gone back to http://intellinuxgraphics.org to find that information again, but I have doubts about the correctness of the news, specifically about GM45 with device IDs 0x2A42 and 0x2A43.
MPEG-2 VLD is already implemented on GMA 4500MHD. H.264 support is being worked on. I think it was also mentioned that VC-1 won't be supported on those older chips.
Originally Posted by const
Why didn't you say something earlier, rather than waiting 3-4 months? Just read the x264 IRC log
Originally Posted by Veerappan
and realise that all the Gfx developers who have tried to date, including the Nvidia Pro dev, ran away for exactly that reason.
< Dark_Shikari> because all people who try are eaten by the cuda monster
2010-03-24 16:48:05 < Dark_Shikari> basically it needs to demonstrate the implementation of a highly-parallelizable ME algorithm, like hierarchical
Read the log to get the idea, then realise they still think it's usable in even a limited form for encoding (better to do something useful than let that Gfx sit there unused). Come up with a viable gfx algorithm or two, then provide even a simple prototype patch for x264 to start with and get feedback in their IRC dev channel without running away, then have that patch ported to the ffmpeg decoder as a beginning.
You may think it's an odd way to do it (read the log, learn from it, write a gfx patch to improve the x264 encoder, then have that ported to ffmpeg), but it has proven time and again to be the best option (ffmpeg will soon get the latest AVX assembly ported from x264 almost unchanged). They know the video specs inside out and contribute directly to ffmpeg too, as that is their preferred decode code base, and you can learn a lot to take away and use elsewhere.
Just a thought for you to consider anyway.
Originally Posted by popper
One key difference: most of what was in that IRC log (and the linked git repositories) is talking about ENcoding. My project has been for DEcoding. I've managed to convert Subpixel Prediction (sixtap + bilinear), IDCT, Dequantization, and the VP8 Loop filter, but I haven't had a chance to really rip apart and rebuild the algorithms from scratch in a parallel manner, so most of the CL code executes fairly serially (max of ~320 parallel threads, which isn't nearly enough). As for entropy decoding and detokenizing, those aren't really bottlenecks for VP8 in most videos (the highest I've seen for detokenizing was 20% of decoding time, and entropy decoding was normally <5%).
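To illustrate why a stage like sixtap subpixel prediction is a candidate for parallel execution, here is a minimal Python sketch (not the OpenCL code from the project). It assumes the half-pel tap values as listed in RFC 6386, which should be checked against the spec, and it clamps out-of-range reads to the edge pixel as a simplification. The names `HALF_PEL_TAPS`, `sixtap_pixel` and `sixtap_row` are made up for this example.

```python
# Half-pel six-tap kernel as listed in RFC 6386 (verify against the spec).
HALF_PEL_TAPS = (3, -16, 77, 77, -16, 3)


def clamp(value, low, high):
    """Limit value to the inclusive range [low, high]."""
    return max(low, min(high, value))


def sixtap_pixel(row, x, taps=HALF_PEL_TAPS):
    """Filter one output pixel; it reads only row[x-2] .. row[x+3]."""
    last = len(row) - 1
    acc = 0
    for k, tap in enumerate(taps):
        # Assumption: out-of-range reads are clamped to the edge pixel.
        acc += tap * row[clamp(x - 2 + k, 0, last)]
    # The taps sum to 128, so round, shift right by 7, clamp to 8 bits.
    return clamp((acc + 64) >> 7, 0, 255)


def sixtap_row(row, taps=HALF_PEL_TAPS):
    """Filter a whole row; every iteration is independent of the others."""
    return [sixtap_pixel(row, x, taps) for x in range(len(row))]
```

No output pixel depends on another output pixel, so in principle each call to `sixtap_pixel` could be its own work-item. For a single 16x16 block that is only 256 independent outputs per filter pass, so the work would have to be batched across many blocks to produce thousands of threads.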
As for why I didn't say anything earlier: I've been mentioning the project occasionally since last summer, and went through the thesis proposal process back in June-August after getting the ok from Jim Bankoski (former chief of On2) on the WebM Project mailing list. I've been working on it for a while, but didn't really start coding until a few months ago (Nov/Dec). And well, I could've mentioned it on the ffmpeg mailing list, but I figured it'd be best to go straight upstream to the source instead.
Edit: And honestly, I can kinda see why CUDA gets preferred by some developers over OpenCL.
320 threads isn't enough? To be useful it should run on HW having only 80 shaders (such as the current Bobcat), no?
Originally Posted by Veerappan
You normally want a lot more threads than shader cores to cover latency (memory accesses for texture fetches etc.).
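As a rough back-of-the-envelope sketch of that point: if a thread does some arithmetic and then stalls on a memory access, a core needs enough other threads queued up to fill the stall. The model below is a deliberate simplification, and the cycle counts in the usage note are invented for illustration, not measurements of any real GPU.

```python
import math


def threads_to_hide_latency(cores, latency_cycles, compute_cycles):
    """Threads needed so every core has work while other threads wait.

    Simplified model: each thread computes for compute_cycles, then
    waits latency_cycles on memory, and repeats.
    """
    per_core = math.ceil((latency_cycles + compute_cycles) / compute_cycles)
    return cores * per_core


def estimated_utilization(threads, cores, latency_cycles, compute_cycles):
    """Fraction of the cores kept busy under the same simplified model."""
    needed = threads_to_hide_latency(cores, latency_cycles, compute_cycles)
    return min(1.0, threads / needed)
```

With made-up figures of 400 cycles of memory latency and 20 cycles of arithmetic per access, `threads_to_hide_latency(80, 400, 20)` gives 1680 threads for 80 shader cores, so 320 threads would keep those cores busy less than a fifth of the time under this model.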