Voice control (using Raspberry Pi)

by Mirko

Done with 1 Thymio

Keywords: Voice, Raspberry

Construction projects: None

Proof-of-concept combining a Thymio II with a Raspberry Pi running Jasper (an open-source voice-control platform). Speech-to-text and text-to-speech are both performed on the Raspberry Pi itself (i.e. without a web API such as those from Google). As you can see in the video, the voice recognition still needs some fine-tuning to understand my accent…

You need a USB microphone (a webcam microphone can do the trick) and, optionally, a loudspeaker to hear the robot's feedback. Figuring out the audio input can be tricky, especially with the default STT tool, Pocketsphinx: your microphone has to support a sample rate of 16000 Hz. My very cheap webcam only supported *higher* rates, and downsampling worsened the audio quality, so I got another webcam, still one of the cheapest.
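To make the sample-rate requirement concrete, here is a minimal sketch (not part of the original project) that reads the rate stored in a WAV file's header and compares it with 16000 Hz. The function names and the in-memory test recording are made up for illustration. Note that it only tells you what the file reports, not what the microphone hardware natively supports.

```python
import io
import wave

REQUIRED_RATE = 16000  # sample rate (Hz) mentioned above for Pocketsphinx


def wav_sample_rate(source):
    """Return the sample rate (Hz) stored in a WAV header.

    `source` can be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        return wav.getframerate()


def is_usable_for_pocketsphinx(source, required_rate=REQUIRED_RATE):
    """True if the recording's header reports exactly the required rate."""
    return wav_sample_rate(source) == required_rate


def make_silent_wav(rate, seconds=0.1):
    """Build a short silent mono 16-bit WAV in memory.

    This only stands in for a real test recording from your microphone.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)   # mono
        wav.setsampwidth(2)   # 16-bit samples
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(rate * seconds))
    buf.seek(0)
    return buf
```

With a real recording you would pass the file path instead, for example `is_usable_for_pocketsphinx("test.wav")`, where `test.wav` is a short clip recorded from the microphone you plan to use.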

More details:
- Thymio II
- Raspberry Pi 2 Model B running plain Raspbian Jessie (release date 2015-09-24), optionally with a power bank so the whole thing is autonomous
- Aseba (on the Raspberry Pi, works right away :-)
- Jasper (https://jasperproject.github.io/). Installing it was not straightforward, because the install instructions don't reflect the current versions of the Raspberry Pi and of some of the required or optional tools. I followed the manual installation instructions (Method 3) and first ran into a discrepancy in the ALSA configuration: there is no alsa-base.conf in the current Raspbian, but ALSA appears to work without any edits. Next, the apt-get packages for Pocketsphinx (speech-to-text) didn't seem to work with Jasper, so I had to compile it from source as described. The next hurdle was OpenFST, which was not installed as a dependency of Phonetisaurus and is no longer available at the location given. You can get the correct version directly from http://www.openfst.org/twiki/bin/view/FST/FstDownload and then have to compile it. The Phonetisaurus model is also no longer available at the documented location; I got it from https://www.dropbox.com/s/kfht75czdwucni1/g014b2b.tgz?dl=0.
Caveat: I did not manage to get the Google or Wit.ai STT options working (and didn't like the pricing of AT&T). I also failed to find, or to get working, some of the TTS options.
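Since several of these tools have to be compiled by hand, a quick check that their executables ended up on the PATH can save some head-scratching. The sketch below is only an illustration: the executable names are my assumptions about what each package installs and may differ with the version you build.

```python
import shutil

# Executable names are assumptions, not taken from the install instructions.
TOOLS = {
    "Pocketsphinx": "pocketsphinx_continuous",
    "OpenFST": "fstinfo",
    "Phonetisaurus": "phonetisaurus-g2p",
    "Aseba": "asebamedulla",
}


def missing_tools(tools=None):
    """Return the sorted names of tools whose executable is not on the PATH."""
    tools = TOOLS if tools is None else tools
    return sorted(name for name, exe in tools.items()
                  if shutil.which(exe) is None)


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found.")
```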

Jasper lets you write your own Python "modules", and it is pretty easy to start from an existing module and add the Thymio interaction, as in some of the other Thymio projects on this site (see example attached). Asebamedulla needs to be running in the background, of course (asebamedulla "ser:device=/dev/ttyACM0" &).
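For readers who have not seen a Jasper module yet, here is a rough sketch of what one driving the Thymio could look like. It is not the attached example. It assumes Jasper's usual module layout (a WORDS list plus isValid and handle functions), the D-Bus interface exposed by asebamedulla, and a node named "thymio-II". The command words and motor speeds are invented for illustration.

```python
import re

# Words Jasper should add to its vocabulary for this module.
WORDS = ["FORWARD", "BACK", "LEFT", "RIGHT", "STOP"]

# Hypothetical mapping from a spoken word to (left, right) motor targets.
SPEEDS = {
    "FORWARD": (200, 200),
    "BACK": (-200, -200),
    "LEFT": (-100, 100),
    "RIGHT": (100, -100),
    "STOP": (0, 0),
}

NODE = "thymio-II"  # assumed node name as seen through asebamedulla


def parse_command(text):
    """Return the first known command word found in text, or None."""
    for word in re.findall(r"[A-Za-z]+", text.upper()):
        if word in SPEEDS:
            return word
    return None


def get_network():
    """Connect to a running asebamedulla over D-Bus (needs python-dbus)."""
    import dbus  # imported lazily so the parsing code works without D-Bus
    bus = dbus.SessionBus()
    proxy = bus.get_object("ch.epfl.mobots.Aseba", "/")
    return dbus.Interface(proxy, dbus_interface="ch.epfl.mobots.AsebaNetwork")


def set_motors(network, left, right):
    """Write both motor targets on the Thymio."""
    network.SetVariable(NODE, "motor.left.target", [left])
    network.SetVariable(NODE, "motor.right.target", [right])


def isValid(text):
    """Jasper calls this to decide whether the module handles the input."""
    return parse_command(text) is not None


def handle(text, mic, profile):
    """Jasper calls this with the recognised text when isValid is true."""
    command = parse_command(text)
    if command is None:
        mic.say("Sorry, I did not understand that.")
        return
    left, right = SPEEDS[command]
    set_motors(get_network(), left, right)
    mic.say("Okay, " + command.lower())
```

Keeping the text parsing separate from the D-Bus calls means you can try the command matching on any machine, without a robot or asebamedulla attached.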

By the way, I'm confident this setup will also work for other languages, or at least French and German (that's what I'll be trying next, because my English accent evidently sucks ;-).

Download the code

Click here

Link to more info

https://youtu.be/hMgojv0vpyU

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License