Opus 1.5 Audio Codec Able To Make Extensive Use Of Machine Learning

Written by Michael Larabel in Multimedia on 4 March 2024 at 01:19 PM EST.

Xiph.Org's Opus, the open-source format for lossy audio coding, has reached version 1.5, a big update that makes greater use of machine learning.

Opus 1.5 brings a "serious machine learning upgrade" per today's release announcement. The 1.5 demo page sums up the greater machine learning use as:

"This 1.5 release is unlike any of the previous ones. It brings many new features that can improve quality and the general audio experience. That is achieved through machine learning. Although Opus has included machine learning — and even deep learning — before (e.g. for speech/music detection), this is the first time it has used deep learning techniques to process or generate the signals themselves.

Instead of designing a new ML-based codec from scratch, we prefer to improve Opus in a fully-compatible way. That is an important design goal for ML in Opus. Not only does that ensure Opus keeps working on older/slower devices, but it also provides an easy upgrade path. Deploying a new codec can be a long, painful process. Compatibility means that older and newer versions of Opus can coexist, while still providing the benefits of the new version when available.

Deep learning also often gets associated with powerful GPUs, but in Opus, we have optimized everything such that it easily runs on most CPUs, including phones. We have been careful to avoid huge models (unlike LLMs with their hundreds of billions of parameters!). In the end, most users should not notice the extra cost, but people using older (5+ years) phones or microcontrollers might. For that reason, all new ML-based features are disabled by default in Opus 1.5. They require both a compile-time switch (for size reasons) and then a run-time switch (for CPU reasons)."

As the announcement notes, though, the new machine learning functionality is disabled by default: it has to be switched on at both build time and run time.
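To illustrate the two-level gating the announcement describes, here is a minimal sketch in C of a feature that needs both a compile-time switch and a run-time switch. It is not Opus code: the `DEMO_ENABLE_ML` macro, the `demo_decoder` type, and the `demo_decoder_*` functions are all hypothetical names invented for this example, and the real libopus build options and API may look quite different.

```c
#include <stdbool.h>

/*
 * Hypothetical build-time switch (NOT an actual Opus macro):
 * compile with -DDEMO_ENABLE_ML to include the ML code path at all.
 * Leaving it undefined keeps the binary small, which is the "size" concern.
 */

/* Hypothetical decoder state holding the run-time switch. */
typedef struct {
    bool ml_requested; /* run-time switch, off by default (the "CPU" concern) */
} demo_decoder;

/* Initialise the decoder with ML features off, mirroring the default-off policy. */
void demo_decoder_init(demo_decoder *dec) {
    dec->ml_requested = false;
}

/*
 * Ask for the ML feature to be turned on or off at run time.
 * Returns true if the request could be honoured. In a build without
 * DEMO_ENABLE_ML, enabling fails, while disabling trivially succeeds.
 */
bool demo_decoder_set_ml(demo_decoder *dec, bool on) {
#ifdef DEMO_ENABLE_ML
    dec->ml_requested = on;
    return true;
#else
    dec->ml_requested = false;
    return !on;
#endif
}

/* The ML path runs only if it was compiled in AND requested at run time. */
bool demo_decoder_ml_active(const demo_decoder *dec) {
#ifdef DEMO_ENABLE_ML
    return dec->ml_requested;
#else
    (void)dec; /* unused in builds without the ML path */
    return false;
#endif
}
```

In a default build of this sketch (without `-DDEMO_ENABLE_ML`), calling `demo_decoder_set_ml(&dec, true)` returns false and `demo_decoder_ml_active(&dec)` stays false. Only a build with the macro defined, followed by an explicit run-time request, activates the feature.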

Opus 1.5 also brings improved AVX2 optimizations, more ARM NEON optimizations, much better packet loss robustness, low-bitrate speech quality enhancements, and support for 4th and 5th order ambisonics.

Opus 1.5 downloads and more information are available via Opus-Codec.org.
