Hacking Google Voice API in Linux

You have probably seen the voice-enabled input fields that arrived with the new Google Chrome release about a month ago. Yeah, it’s a cool way to enter text without typing for long seconds, with the added bonus of getting search results for “laughable clothes” when you say “fashionable clothes”. Seriously, I cannot see how this is useful, especially on desktop PCs.

But there’s a good guy on the internet who happily made good use of it. He wrote a shell script that records your voice and uses the Google Voice API to convert it to text. I will explain his hack so you can all make good use of it.

First we need a URL for the API, so we define the API variable:

API="http://www.google.com/speech-api/v1/recognize?lang=en"

Note the lang parameter at the end of the URL. Our script would be more flexible if it could handle multiple languages, so let’s put the language in a variable, or better, accept it as an argument 🙂

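# LANG is the shell's locale variable, so the language code gets its own variable.
if [ -z "$1" ]
  then
    printf 'No language supplied, using en\n'
    SPEECH_LANG="en"
  else
    printf 'using %s as language\n' "$1"
    SPEECH_LANG="$1"
fi
API="http://www.google.com/speech-api/v1/recognize?lang=$SPEECH_LANG"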
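The same defaulting can also be written with the shell's `${1:-en}` parameter expansion. This is only a minimal sketch: `build_api_url` is a helper name made up for illustration and is not part of the original script.

```shell
# build_api_url is a hypothetical helper (not in the original script).
# It prints the recognition URL for the language code in $1,
# falling back to "en" when no code is given.
build_api_url() {
    speech_lang="${1:-en}"
    printf 'http://www.google.com/speech-api/v1/recognize?lang=%s\n' "$speech_lang"
}

# Usage: pass the script's first argument straight through.
API=$(build_api_url "${1:-}")
echo "$API"
```

Wrapping the logic in a function keeps the default in one place, which helps if the script later grows more options.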

Now we need to send a sound file containing our voice to this URL. But it’s not that simple, of course. We need:

arecord to record our voice over the mic
flac to convert the file format
wget to interact with the api

Make sure these three tools are installed; if not, you can install them with your package manager, such as apt-get (arecord ships in the alsa-utils package). We convert the recording to FLAC because that is the format the API requires. Now let’s mix things together!
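Before recording anything, the script could check that the three tools are actually on the PATH. The sketch below is one possible way to do that; `missing_commands` is a hypothetical helper name, not something from the original script.

```shell
# missing_commands is a hypothetical helper (not in the original script).
# It prints the name of every listed command that cannot be found on PATH.
missing_commands() {
    for name in "$@"; do
        command -v "$name" >/dev/null 2>&1 || printf '%s\n' "$name"
    done
}

# Usage: warn about whichever of the three tools is absent.
missing=$(missing_commands arecord flac wget)
if [ -n "$missing" ]; then
    echo "Missing tools:" $missing
fi
```

`command -v` is the portable way to ask the shell whether a command exists, so the check should behave the same under sh and bash.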

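# Record 3 seconds from the mic at 16 kHz and encode the audio as out.flac
arecord -f cd -t wav -d 3 -r 16000 | flac - -f --best --sample-rate 16000 -o out.flac
# POST the FLAC file to the API and capture the response body
JSON=`wget -O - -o /dev/null --post-file out.flac --header="Content-Type: audio/x-flac; rate=16000" "$API"`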

As you can see, we’ve done well so far. The script receives the response in JSON format, so we need to parse it using sed and awk. I already wrote an article about sed here, which you may want to check out. This may look freaky, but it does the job:

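# Strip the braces, split the fields on commas, take the third field
# (the one holding the utterance), then drop the quotes.
UTTERANCE=`echo "$JSON"\
 |sed -e 's/[{}]//g'\
  |awk '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]; exit }'\
   |awk -F: 'NR==3 { print $3; exit }'\
    |sed -e 's/"//g'`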
echo "utterance: $UTTERANCE"
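To see what each stage of that pipeline does, it helps to run it on a fixed response. The sample JSON below is an assumption about what a v1 response looked like (a status, an id, then a hypotheses list); it was not captured from the live service. `parse_utterance` is a hypothetical wrapper name.

```shell
# parse_utterance is a hypothetical wrapper around the sed/awk pipeline.
# It prints the first utterance found in the JSON string passed as $1.
parse_utterance() {
    printf '%s\n' "$1" \
        | sed -e 's/[{}]//g' \
        | awk '{ n = split($0, a, ","); for (i = 1; i <= n; i++) print a[i]; exit }' \
        | awk -F: 'NR==3 { print $3; exit }' \
        | sed -e 's/"//g'
}

# Assumed sample response, used only for illustration.
SAMPLE='{"status":0,"id":"abc123","hypotheses":[{"utterance":"list directory","confidence":0.9}]}'
parse_utterance "$SAMPLE"
```

The approach depends on the utterance being the third comma-separated field and on the text containing no commas or colons, so it is fragile compared with a real JSON parser.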

Yeah, now our script echoes the text! That seems pretty geeky, but how can this be useful? Controlling our PC, maybe? Why not! To do that we define the strings that the script compares the final text against; if the text matches one of them, the script executes the corresponding command.

CMD_LIST_DIRECTORY="list directory"
CMD_WHOAMI="who am i"
if [ `echo "$UTTERANCE" | grep -ic "^$CMD_LIST_DIRECTORY$"` -gt 0 ]; then
     ls .
elif [ `echo "$UTTERANCE" | grep -ic "^$CMD_WHOAMI$"` -gt 0 ]; then
     whoami
fi

We can define any number of commands this way. I will be working on using arrays for this (maybe one of you can do it for us 🙂 ). You can find a complete script here if you are too lazy to save a new file :p
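One possible shape for the array version is sketched below. It assumes bash (plain sh has no arrays), and the names `PHRASES`, `COMMANDS` and `lookup_command` are made up for illustration.

```shell
#!/usr/bin/env bash
# Parallel arrays (hypothetical names): PHRASES[i] is the spoken phrase,
# COMMANDS[i] is the command to run when that phrase is recognized.
PHRASES=("list directory" "who am i")
COMMANDS=("ls ." "whoami")

# Print the command mapped to the utterance in $1 (case-insensitive).
# Returns 1 when no phrase matches.
lookup_command() {
    local spoken i
    spoken=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    for i in "${!PHRASES[@]}"; do
        if [ "$spoken" = "${PHRASES[$i]}" ]; then
            printf '%s\n' "${COMMANDS[$i]}"
            return 0
        fi
    done
    return 1
}

# Usage: run the matching command, if there is one.
UTTERANCE="Who am I"
if cmd=$(lookup_command "$UTTERANCE"); then
    $cmd   # left unquoted on purpose so "ls ." splits into command and argument
else
    echo "no command matches: $UTTERANCE"
fi
```

Adding a new voice command then means appending one entry to each array instead of writing another elif branch.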

Guess what, we just made good use of Google Voice API! I will leave you to test it, improve it and why not share it. Your comments are welcome.

8 thoughts on “Hacking Google Voice API in Linux”

Xarlos says:

October 22, 2013 at 19:29

Nice little tutorial. If possible, how would you extend this so that the voice is being read at all times and can stream the output? Having to type the command is perhaps circumventable?
Xarlos.

1. Jacer Omri says:
  
  October 22, 2013 at 19:54
  
  maybe attaching the script to a hotkey?
  
krishnaanaril says:

December 20, 2013 at 08:37

Cool tutorial. Gotta try it today itself…

Kaydarla says:

December 21, 2013 at 13:09

For parsing the JSON response from the API, ‘jq’ command line parser could be used – http://stedolan.github.io/jq/. Cheers.

femalefaust says:

January 16, 2014 at 00:25

you effin’ rock.

brianpatrickpoland says:

July 3, 2014 at 16:52

I am looking for some help on implementing this code, I can pay for lessons on Skype etc please email me or leave response to contact

brianpatrickpoland says:

July 3, 2014 at 16:53

Please contact me I need lessons on this in Skype will pay for tutoring