Rav1e Begins Adding SSE4.1 Support, More x86 Assembly
The Rust-written "rav1e" AV1 video encoder continues to pursue better performance on recent Intel/AMD CPUs.
We recently reported on rav1e picking up SSSE3 and AArch64 NEON optimizations. This week brings more hand-written x86 Assembly (ported from the speedy dav1d decoder) as well as initial SSE4.1 support.
SSE4.1 has been available in CPUs for the past decade, since Intel's Penryn processors. It's about time rav1e supports SSE4.1, which should help improve performance on recent processors, at least until the encoder gets some serious Advanced Vector Extensions (AVX) optimizations.
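For those curious what SSE4.1 support can look like in Rust, below is a minimal sketch of runtime CPU feature detection with a scalar fallback. It is not taken from rav1e's source: the function names (`max_u32`, `max_u32_sse41`, `max_u32_scalar`) are hypothetical, and the example uses a Rust intrinsic, whereas the rav1e work reported here is hand-written Assembly. The only point it illustrates is dispatching to an SSE4.1 code path when the CPU reports the feature.

```rust
// Illustrative only: hypothetical helpers, not rav1e code.

// SSE4.1 path: unsigned 32-bit maximum over a slice, four lanes at a time.
#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
#[target_feature(enable = "sse4.1")]
unsafe fn max_u32_sse41(data: &[u32]) -> u32 {
    #[cfg(target_arch = "x86")]
    use std::arch::x86::*;
    #[cfg(target_arch = "x86_64")]
    use std::arch::x86_64::*;

    let chunks = data.chunks_exact(4);
    let tail = chunks.remainder();
    let mut lanes = [0u32; 4];
    unsafe {
        let mut acc = _mm_setzero_si128();
        for c in chunks {
            // Unaligned load of four u32 values.
            let v = _mm_loadu_si128(c.as_ptr() as *const __m128i);
            // PMAXUD: unsigned 32-bit max, introduced with SSE4.1.
            acc = _mm_max_epu32(acc, v);
        }
        _mm_storeu_si128(lanes.as_mut_ptr() as *mut __m128i, acc);
    }
    // Reduce the four lanes, then fold in the leftover elements.
    let mut best = lanes.iter().copied().max().unwrap_or(0);
    for &x in tail {
        best = best.max(x);
    }
    best
}

// Portable fallback used when SSE4.1 is unavailable.
fn max_u32_scalar(data: &[u32]) -> u32 {
    data.iter().copied().max().unwrap_or(0)
}

// Public entry point: picks the SSE4.1 path at runtime when the CPU has it.
pub fn max_u32(data: &[u32]) -> u32 {
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if is_x86_feature_detected!("sse4.1") {
            // Safe to call: the feature was just detected on this CPU.
            return unsafe { max_u32_sse41(data) };
        }
    }
    max_u32_scalar(data)
}
```

Callers only ever use `max_u32`, so a binary built this way still runs on CPUs without SSE4.1 and on non-x86 targets.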
Beyond the additional x86 Assembly and initial SSE4.1 support, this week's rav1e update includes a fix for possible infinite loops and adds validation of the frame size. More details and downloads for the newest rav1e are available via GitHub.