Spell and speak
This example is based on the work of Antonio Roberts (aka hellocatfood), who made spel aend spik, a generative video based on the phonemes used by a speech synthesizer. The sources for the project are published as a git repository. This project uses several command-line tools: espeak, sox, python, ffmpeg, and imagemagick.
Preliminary results by the students of the course
Speak the Audio to a File
espeak -v fr "bonjour je suis une ordinateur. . . tu peux m'addresser via la ligne de command. . . merci" -w bonjour.wav
Pad the Audio
sox bonjour.wav padded.wav pad 3 3
Convert to Raw
sox padded.wav -b 8 -e unsigned-integer -c 1 -r 4000 -t raw rawfile
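The raw file written by sox has no header: it holds one unsigned byte per sample, in a single channel, at 4000 samples per second. The sketch below works through the resulting arithmetic for a video at 10 frames per second (the rate used by speak.py and ffmpeg later on). The function name `describe_raw` and the example size of 32000 bytes are assumptions made for illustration, not part of the original project.

```python
SAMPLE_RATE = 4000   # from "-r 4000" in the sox command
FRAME_RATE = 10      # video frames per second, as used by speak.py and "ffmpeg -r 10"


def describe_raw(num_bytes, sample_rate=SAMPLE_RATE, frame_rate=FRAME_RATE):
    """Return (duration in seconds, bytes per video frame, number of video frames).

    Assumes 8-bit mono audio, so one byte equals one sample.
    """
    bytes_per_frame = sample_rate // frame_rate
    duration = num_bytes / float(sample_rate)
    # The last slice may be shorter than a full frame, so round up.
    num_frames = -(-num_bytes // bytes_per_frame)
    return duration, bytes_per_frame, num_frames


# Hypothetical example: a rawfile of 32000 bytes
print(describe_raw(32000))  # → (8.0, 400, 80)
```

To apply this to a real file, pass `os.path.getsize("rawfile")` as `num_bytes`.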
speak.py
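# Pick a mouth image for each video frame from the loudness of the raw audio
import os

framerate = 10                     # video frames per second (must match ffmpeg -r)
chunk = 4000 // framerate          # samples per frame at 4000 Hz, one byte per sample

# Read as bytes; a bytearray yields integers 0..255 in both Python 2 and 3
dat = bytearray(open("rawfile", "rb").read())

# Peak level of each frame; 128 is silence in unsigned 8-bit audio
frames = []
for i in range(0, len(dat), chunk):
    samples = [b - 128 for b in dat[i:i + chunk]]
    frames.append(max(samples))

pics = ["close.png", "semi.png", "open.png"]
max_mouthOpen = len(pics) - 1

# Guard against a step of 0 when the audio is (almost) silent
step = max(1, max(frames) // (max_mouthOpen * 2))

for i in range(len(frames)):
    mouth = max(0, min(frames[i] // step, max_mouthOpen))
    if i:
        # Let the mouth move at most one position per frame
        if mouth > frames[i - 1] + 1:
            mouth = frames[i - 1] + 1
        elif mouth < frames[i - 1] - 1:
            mouth = frames[i - 1] - 1
    else:
        mouth = 0                  # always start with the mouth closed
    frames[i] = mouth              # store the position for the next iteration
    os.system("ln -s %s frame%09d.png" % (pics[mouth], i))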
NB: Change the names of the three images (close, semi, open) to match your own files. If you are using JPEG images, change "png" to "jpg" both in the list of images and in the os.system line that names the frame links.
Sound to Image
python speak.py
Running speak.py maps the raw audio to video frames: for each frame it creates a numbered symbolic link to one of the three original images, chosen according to the loudness of the audio at that moment.
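To see the mapping without touching any files, the same logic can be sketched as a pure function that returns one mouth position per frame (0 = closed, 1 = semi, 2 = open). The function name `mouth_positions`, its parameters, and the synthetic test signal are assumptions made for illustration; the thresholds and the one-step-per-frame smoothing follow speak.py.

```python
def mouth_positions(raw, sample_rate=4000, frame_rate=10, levels=3):
    """Map unsigned 8-bit mono samples to one mouth index per video frame.

    Assumes levels >= 2 (at least a closed and an open image).
    """
    data = bytearray(raw)
    chunk = sample_rate // frame_rate
    # Peak level per frame; 128 is silence in unsigned 8-bit audio.
    peaks = [max(b - 128 for b in data[i:i + chunk])
             for i in range(0, len(data), chunk)]
    top = levels - 1
    # Loudness per mouth step; never 0, so silent input cannot divide by zero.
    step = max(1, max(peaks) // (top * 2)) if peaks else 1
    mouths = []
    for i, peak in enumerate(peaks):
        if i == 0:
            mouth = 0  # always start closed
        else:
            mouth = max(0, min(peak // step, top))
            prev = mouths[-1]
            # Move at most one position per frame.
            mouth = max(prev - 1, min(mouth, prev + 1))
        mouths.append(mouth)
    return mouths


# Synthetic signal: 0.1 s of silence, 0.3 s at full level, 0.2 s of silence
demo = bytes(bytearray([128] * 400 + [255] * 1200 + [128] * 800))
print(mouth_positions(demo))  # → [0, 1, 2, 2, 1, 0]
```

The mouth opens and closes through the "semi" position even though the signal jumps straight from silence to full level, which is the effect of the smoothing step.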
frames + sound = video
ffmpeg -r 10 -i frame%09d.png -i padded.wav -y voice.mp4
OR, if your frames are JPEG images:
ffmpeg -r 10 -i frame%09d.jpg -i padded.wav -y voice.mp4
"animate" Script version 1
espeak "and now for something completely different" -v en -p 99 -s 50 -w hello.wav
sox hello.wav padded.wav pad 3 3
sox padded.wav -b 8 -e unsigned-integer -c 1 -r 4000 -t raw rawfile
rm frame*.jpg
python speak.py
ffmpeg -r 10 -i frame%09d.jpg -i padded.wav -y output.mp4
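The same sequence of commands could also be driven from Python, which makes it easier to change the spoken text. This is only a sketch under assumptions: the function names `build_commands` and `run` are hypothetical, the file names are the ones used above, and speak.py is assumed to be in the current directory and set up for JPEG images.

```python
import glob
import os
import subprocess


def build_commands(text, voice="en"):
    """Return the pipeline steps as argument lists, in the order they run."""
    return [
        ["espeak", text, "-v", voice, "-p", "99", "-s", "50", "-w", "hello.wav"],
        ["sox", "hello.wav", "padded.wav", "pad", "3", "3"],
        ["sox", "padded.wav", "-b", "8", "-e", "unsigned-integer",
         "-c", "1", "-r", "4000", "-t", "raw", "rawfile"],
        ["python", "speak.py"],
        ["ffmpeg", "-r", "10", "-i", "frame%09d.jpg",
         "-i", "padded.wav", "-y", "output.mp4"],
    ]


def run(text, voice="en"):
    """Remove old frame links, then run every step, stopping on the first error."""
    for old in glob.glob("frame*.jpg"):
        os.remove(old)
    for cmd in build_commands(text, voice):
        subprocess.check_call(cmd)


# Show the commands without running them
for cmd in build_commands("and now for something completely different"):
    print(" ".join(cmd))
```

Calling `run("and now for something completely different")` would execute the steps; passing argument lists rather than a shell string avoids quoting problems when the text contains apostrophes.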