Mozilla Releases DeepSpeech 0.7 As Their Great Speech-To-Text Engine

  • Mozilla Releases DeepSpeech 0.7 As Their Great Speech-To-Text Engine

    Phoronix: Mozilla Releases DeepSpeech 0.7 As Their Great Speech-To-Text Engine

    One of the lesser known Mozilla software efforts is DeepSpeech as a speech-to-text engine built atop TensorFlow with CPU and GPU (CUDA) acceleration. Friday marked a new release of this DeepSpeech software that is yielding great results for converting spoken audio streams to text...

    http://www.phoronix.com/scan.php?pag...Speech-To-Text

  • #2
    This is the only useful open-source voice recognition engine that I know of to date...

    • #3
      I'd like to take this opportunity to point out the Mozilla Common Voice project, to which anyone can contribute. It provides free data for the DeepSpeech engine, making it possible to build free speech recognition software as well as quality speech synthesis.
      Last edited by Okki; 04-25-2020, 01:39 AM.

      • #4
        It is pretty scary what you can do with machine learning. You can feed a 5-second audio clip to machine learning software, clone that voice, and make it say whatever you want. So you can make it sound like the president, or your friend, or, if you have an enemy, like your enemy. Someone could make it sound like you and make it say that it admits to murdering someone.
        Last edited by uid313; 04-25-2020, 05:44 PM. Reason: Removed incorrect statement

        • #5
          Originally posted by uid313
          It is not a voice recognition engine. It is a text-to-speech engine.
          Best open an issue on their GitHub and tell them they've got it the wrong way round in the readme.md

          • #6
            Originally posted by uid313
            It is not a voice recognition engine. It is a text-to-speech engine.
            The Mozilla text-to-speech project is TTS.

            • #7
              FYI: Mycroft is also using DeepSpeech. Still waiting for my Mark II though.

              • #8
                Originally posted by uid313
                It is pretty scary what you can do with machine learning. You can feed a 5-second audio clip to machine learning software, clone that voice, and make it say whatever you want. So you can make it sound like the president, or your friend, or, if you have an enemy, like your enemy. Someone could make it sound like you and make it say that it admits to murdering someone.
                Same with photomontage. And like the older technologies, this new tech leaves artifacts that forensic experts can identify to tell whether a voice is fake or real. And the game of cat and mouse goes on and on.


                • #9
                  Originally posted by stfn
                  FYI: Mycroft is also using DeepSpeech. Still waiting for my Mark II though.
                  This project has always fascinated me. A true, offline, open-source digital assistant, without all the bullcrap from Amazon, Google and (spy)friends.

                  • #10
                    By the way, this is not CUDA-only; it also works fine with AMD's ROCm. It is easy to set up with their TensorFlow Docker container. It might also work with a recent upstream kernel; however, I had to use their DKMS driver to get my R470 going.
