Google Posts Initial Code For Lyra Speech Codec


  • sandy8925
    replied
    Originally posted by bemerk View Post
    How well does it work with female voices? Studies show that they are currently discriminated against in Web conferencing because high frequencies are often cut out in the transmission, and females therefore have a lower chance to express their emotions properly.
    Yeah, I've noticed some headset mics don't record all the way up to 20,000 Hz, so people's voices might sound very different.


  • davidbepo
    replied
    Originally posted by Mathias View Post

    At these tiny bitrates, Opus is not very well suited for speech. The question is whether there is a reason to go that low, since the UDP/TCP overhead is on the same level as the Lyra bitstream...
    It works great down to 6 kb/s; any lower is just nonsense.
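
    To put rough numbers on the overhead point quoted above, here is a back-of-the-envelope sketch. It assumes IPv4 + UDP + RTP headers with no options or extensions, one codec frame per packet, and a hypothetical 40 ms packet interval; none of these values come from the thread, and `stream_rates` is an illustrative helper name.

```python
# Back-of-the-envelope header overhead for a low-bitrate voice stream.
# Assumptions (not from the thread): IPv4 + UDP + RTP headers, one codec
# frame per packet, and a hypothetical 40 ms packet interval.

IPV4_HEADER_BYTES = 20  # minimum IPv4 header, no options
UDP_HEADER_BYTES = 8    # fixed UDP header
RTP_HEADER_BYTES = 12   # fixed RTP header, no extensions


def stream_rates(codec_bps: int, packet_ms: int) -> tuple[float, float]:
    """Return (payload bytes per packet, header bits per second)."""
    packets_per_second = 1000 / packet_ms
    payload_bytes = codec_bps / 8 / packets_per_second
    header_bytes = IPV4_HEADER_BYTES + UDP_HEADER_BYTES + RTP_HEADER_BYTES
    header_bps = header_bytes * 8 * packets_per_second
    return payload_bytes, header_bps


if __name__ == "__main__":
    for codec_bps in (3000, 6000):
        payload, header_bps = stream_rates(codec_bps, packet_ms=40)
        print(f"{codec_bps} bps codec: {payload:.0f} payload bytes per packet, "
              f"{header_bps:.0f} bps of headers")
```

    Under these assumptions the headers alone cost more than the codec payload at either bitrate. Longer packets shrink that overhead, but only by adding latency.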


  • andreano
    replied
    Originally posted by bemerk View Post
    female voices = high frequencies
    The highest frequencies in speech are not the voices, but the unvoiced parts of consonants. That's why consonants are impossible to distinguish over telephone no matter who is talking.
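
    As a rough illustration of why the telephone loses that detail: a sampled signal cannot carry anything above half its sample rate (the Nyquist limit). The sample rates below are common telephony and audio values picked for illustration, not figures from this thread, and `nyquist_hz` is a hypothetical helper name.

```python
# The highest frequency a sampled signal can represent is half the sample rate.
# The sample rates are illustrative assumptions, not values from the thread.

SAMPLE_RATES_HZ = {
    "narrowband telephony": 8_000,
    "wideband voice": 16_000,
    "fullband audio": 48_000,
}


def nyquist_hz(sample_rate_hz: int) -> float:
    """Return the upper frequency limit for a given sample rate."""
    return sample_rate_hz / 2


if __name__ == "__main__":
    for label, rate in SAMPLE_RATES_HZ.items():
        print(f"{label}: {rate} Hz sampling carries up to {nyquist_hz(rate):.0f} Hz")
```

    With 8 kHz sampling nothing above 4 kHz survives, whoever is speaking.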

    If female voices are more often cut out by conferencing applications, depending on what exactly “cut out” means, I would hesitate to suspect the coding format.
    Last edited by andreano; 08 April 2021, 06:43 PM.


  • microcode
    replied
    Originally posted by bemerk View Post
    How well does it work with female voices? Studies show that they are currently discriminated against in Web conferencing because high frequencies are often cut out in the transmission, and females therefore have a lower chance to express their emotions properly.
    It's like the problems with early color film being not sensitive enough to photograph black people in ordinary lighting conditions; physics is just working against you.


  • elatllat
    replied
    Originally posted by bemerk View Post
    How well does it work with female voices? Studies show that they are currently discriminated against in Web conferencing because high frequencies are often cut out in the transmission, and females therefore have a lower chance to express their emotions properly.
    I have had so many girls trigger a number tone on non-VoIP calls that I have to wonder whether all the developers were chain smokers or married to one.

    One day we will have good TTS and STT and profile detection for super low bitrates.


  • hotaru
    replied
    Lyra can function at just 3 kbps.
    So it's not as good as Codec 2, which can function at just 450 bps?


  • Alex/AT
    replied
    Originally posted by lucasbekker View Post
    If it is just a single kernel, it shouldn't be too hard to replace it with some calls to OpenBLAS or something like that; it won't be as fast, though...
    It would possibly be a wasted effort, because you can expect to hit a patent or two on the way.


  • microcode
    replied
    Originally posted by juarezr View Post
    What I'm wondering is:
    1. Could the machine learning models or the audio assets used to create and train Lyra be used to further improve Opus?
    2. Could a parametric codec like Lyra, WaveNet, or LPCNet be mixed into the Opus combo to achieve even lower bitrates?
    I think the answer to the second one is maybe.

    As for the first, the audio assets may be useful for developing codecs generally, but the models probably wouldn't be directly useful for Opus enhancements.


  • microcode
    replied
    Originally posted by davidbepo View Post
    google reinventing the wheel again...
    opus is great for this, no need to make a new one
    Opus works at 3 kbps, but it's not spectacular, and you have to increase the frame size to get good quality at that rate, which adds latency and so cuts against its use for voice.


  • juarezr
    replied
    Originally posted by davidbepo View Post
    google reinventing the wheel again...
    opus is great for this, no need to make a new one
    Originally posted by Mathias View Post
    At these tiny bitrates, Opus is not very well suited for speech. The question is whether there is a reason to go that low, since the UDP/TCP overhead is on the same level as the Lyra bitstream...
    Good questions!

    Opus was itself developed as a "combo" of two other codecs:
    • Skype's SILK codec for voice at low bitrates
    • Xiph.Org’s CELT codec for music and other high-quality audio.
    I know that Lyra is a different kind of codec, based on speech synthesis instead of the traditional DCT/quantization/entropy coding scheme. It is similar to Jean-Marc Valin's LPCNet codec.

    What I'm wondering is:
    1. Could the machine learning models or the audio assets used to create and train Lyra be used to further improve Opus?
    2. Could a parametric codec like Lyra, WaveNet, or LPCNet be mixed into the Opus combo to achieve even lower bitrates?
