Dav1d 0.9 Released With AVX2-Tuned 10b/12b Decode For Big Speed Boost
The hand-written, AVX2-tuned assembly code was sponsored by Facebook and Netflix to provide significantly better performance when decoding 10-bit and 12-bit AV1 content on modern Intel/AMD processors. AArch64 already had hand-tuned assembly for high bit depth decoding; now, thanks to the support of these two Internet giants, there is a faster 10b/12b decode path for AVX2-capable processors, which means Intel Haswell and newer or AMD Excavator and newer.
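For readers wondering whether their own machine falls into that AVX2-capable group, here is a minimal sketch for Linux. It assumes the kernel exposes CPU features on the `flags` lines of `/proc/cpuinfo`, as it does on x86; the helper names `cpu_has_flag` and `read_cpuinfo` are made up for this illustration and are not part of dav1d, which does its own runtime CPU detection.

```python
from pathlib import Path


def cpu_has_flag(cpuinfo_text: str, flag: str = "avx2") -> bool:
    """Return True if any 'flags' line in /proc/cpuinfo-style text lists the flag."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # x86 kernels list features under "flags"; other architectures
        # (e.g. AArch64 with "Features") simply will not match here.
        if sep and key.strip() == "flags" and flag in value.split():
            return True
    return False


def read_cpuinfo(path: str = "/proc/cpuinfo") -> str:
    """Read the cpuinfo file, or return an empty string if it does not exist."""
    p = Path(path)
    return p.read_text() if p.exists() else ""


if __name__ == "__main__":
    print("AVX2 available:", cpu_has_flag(read_cpuinfo()))
```

On a system without `/proc/cpuinfo`, or on a non-x86 CPU, this simply reports that AVX2 is not available.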
Dav1d 0.9 also adds an ARM64 NEON implementation of FilmGrain and a new API to signal events happening during the decoding process.
Dav1d 0.9 can be downloaded via VideoLAN's GitLab. I'll have some dav1d 0.9 benchmarks out shortly via the test profile.