Somebody is listening...
Like Apple’s Siri or Google’s voice-activated assistant? Well, you can have a system like that on your PC, even on your Raspberry Pi. The basis for all of it is voice recognition.
Voice activating your Linux computer is no longer a matter of special software or long nights of compiling. Simply record a sound file, send it to Google and, voilà, here is your text. So I went ahead and gave it a try: a quick shell script records audio (arecord), pipes it into a FLAC converter (sox), pushes it to Google via wget and parses the result (here’s a complete package). While this approach worked, I soon discovered one detail that needed to be addressed:
How long should you record from the microphone? Let’s say 3 seconds. What if I start speaking a command after 2.5 seconds? No, that doesn’t work; it’s just too arbitrary. So I went ahead and wrote my own little ALSA-based voice recorder. It only starts recording once a configurable volume threshold has been reached. But there was a problem: while listening to the microphone and calculating the volume, my little recorder tool simply threw away the data packets it had already measured. By the time it switched into record mode, it had discarded the very beginning of the word(s), turning the spoken “computer” into a recording of “omputer”. A wrap-around buffer that always keeps the most recent packets solved that problem.
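To make the wrap-around buffer idea concrete, here is a minimal sketch in Python rather than the C of my actual recorder. It is not the mmalsa code; the function names, the chunk-based interface and the default values are all made up for illustration, and it assumes raw signed 16-bit little-endian audio delivered in small chunks.

```python
from collections import deque
import math
import struct


def rms(chunk: bytes) -> float:
    """Root-mean-square level of signed 16-bit little-endian samples."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, chunk[:count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def record_on_threshold(chunks, threshold=1000.0, preroll_chunks=4,
                        silence_chunks=8):
    """Return the first utterance found in an iterable of audio chunks.

    All parameter names and defaults are illustrative assumptions.
    """
    # Wrap-around buffer: once full, appending drops the oldest chunk.
    preroll = deque(maxlen=preroll_chunks)
    recorded = []
    quiet = 0
    recording = False

    for chunk in chunks:
        loud = rms(chunk) >= threshold
        if not recording:
            if loud:
                # Threshold reached: keep the chunks heard just before it,
                # so the start of the word is not lost.
                recording = True
                recorded.extend(preroll)
                recorded.append(chunk)
            else:
                preroll.append(chunk)
            continue

        recorded.append(chunk)
        quiet = 0 if loud else quiet + 1
        if quiet >= silence_chunks:
            break  # enough trailing silence: the utterance is over

    return b"".join(recorded)
```

The chunks could come from anything that yields raw PCM, for example by reading fixed-size blocks from the output of arecord -t raw -f S16_LE. The size of the pre-roll buffer decides how much audio from before the trigger survives, which is what brings the “c” of “computer” back.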
I added a PHP wrapper around the voice-command chain to avoid writing data to disk. The PHP script gets its FLAC audio from a popen() of my ALSA recorder piped into sox. It pushes the data to Google using plain old fsockopen() magic and parses the result via json_decode(). Once I have the recognized words, I am free to decide what to do with them.
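The last step, turning the JSON reply into an action, might look roughly like the sketch below. It is written in Python instead of my PHP, and it rests on assumptions: the reply is taken to hold a "hypotheses" list whose entries carry an "utterance" string, and the trigger phrases and handlers are invented examples. Check what the service actually returns before relying on any of these names.

```python
import json


def best_utterance(body):
    """Return the top transcript from a JSON reply, or None.

    The "hypotheses" and "utterance" keys are an assumed response shape.
    """
    try:
        reply = json.loads(body)
    except ValueError:
        return None  # not valid JSON
    if not isinstance(reply, dict):
        return None
    hypotheses = reply.get("hypotheses") or []
    if not hypotheses or not isinstance(hypotheses[0], dict):
        return None  # nothing was recognized
    text = hypotheses[0].get("utterance")
    if not isinstance(text, str):
        return None
    return text.strip().lower()


def dispatch(text, commands):
    """Call the first handler whose trigger phrase occurs in the text."""
    if not text:
        return None
    for phrase, handler in commands.items():
        if phrase in text:
            return handler()
    return None


if __name__ == "__main__":
    # Hypothetical handlers; real ones would switch a relay or start music.
    commands = {
        "lights on": lambda: "switching lights on",
        "fan": lambda: "starting the fan",
    }
    sample = '{"hypotheses": [{"utterance": "Charles lights on"}]}'
    print(dispatch(best_utterance(sample), commands))
```

Matching on substrings keeps it forgiving: “Charles, lights on please” still triggers the handler even when the recognizer adds or drops a word around the phrase.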
My little system is ready to help me in a few ways. It gives me (spoken) news and weather information and it plays music according to my wishes, all nice and dandy. But in order to make “Charles” (I named it Charles) really cool, I added a USB-driven relay card to my computer. Charles is now able to turn the lights on and off, he starts the fan in my office on my command and he even controls my jukebox.
So, if you want to fast-track your voice recognition, get the Raspberry Pi AUI Suite. It should work on all Linux boxes. You may also want to replace the arecord it requires with my little ALSA recorder hack as explained above.
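Get the mmalsa source code and simply compile it with the command below. The -lasound option comes after the source file because some linkers only resolve the ALSA symbols when the library is listed last:
gcc -o mmalsa mmalsa.c -lasound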
You may need to install “alsa-lib-devel” (or “libasound2-dev” on Debian-based systems such as Raspbian) in order to be able to compile. And now, go and have fun with your computer’s new ability to listen to you.