Also, the kind of acceleration you get this way (that is, at the motion compensation, or MC, level) is not very useful, especially with H.264 High Profile, where acceleration is most needed.
H.264 data in High Profile is normally encoded with CABAC. CABAC decoding needs a lot of processing power and cannot be parallelized well, because each decoded bin depends on the decoder state left behind by the previous one, so it isn't a viable candidate for offloading to shaders. At high bitrates (like on Blu-ray), if I remember correctly, CABAC usually becomes the most expensive decoding step. And you can't speed it up with MC acceleration at all.
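To illustrate the serial dependency, here is a toy adaptive binary arithmetic coder in Python. It is only a sketch and not the real CABAC engine: there is no context selection, no renormalization, and it uses exact fractions instead of the integer range tables from the spec. The names `encode` and `decode` are made up for this example. What it does share with CABAC is that every decoded bit needs the interval and the probability estimate produced by the previous bit, so the loop cannot be split across many threads.

```python
from fractions import Fraction


def encode(bits):
    """Toy adaptive binary arithmetic encoder (illustration only, not CABAC)."""
    low, width = Fraction(0), Fraction(1)
    c0, c1 = 1, 1  # adaptive counts of zeros and ones seen so far
    for b in bits:
        split = width * Fraction(c0, c0 + c1)  # share of the interval for a 0
        if b == 0:
            width = split
            c0 += 1
        else:
            low += split
            width -= split
            c1 += 1
    # Any number inside the final interval identifies the whole sequence.
    return low + width / 2


def decode(code, n):
    """Decode n bits. Each iteration depends on the state of the previous one."""
    low, width = Fraction(0), Fraction(1)
    c0, c1 = 1, 1
    out = []
    for _ in range(n):
        # 'split' depends on low, width, c0 and c1, all of which were
        # updated by the previous bit: a loop-carried dependency.
        split = width * Fraction(c0, c0 + c1)
        if code < low + split:
            out.append(0)
            width = split
            c0 += 1
        else:
            out.append(1)
            low += split
            width -= split
            c1 += 1
    return out


bits = [0, 1, 1, 0, 0, 0, 1, 0]
assert decode(encode(bits), len(bits)) == bits
```

You cannot start decoding bit k without having finished bit k-1, which is exactly the kind of workload a shader array is bad at.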
Plain and simple, in my opinion, only full (i.e. bitstream/VLD level) acceleration is worth implementing when it comes to H.264.
VC-1 is different, but it's not nearly as hard to decode as H.264 anyway.