I'm trying to build a simple word recognition system in Python. As a first step, I needed a way to get audio sample data from my microphone and store it in a numpy array. After a lot of searching and experimenting, I finally found a library that works well for this task: pyalsaaudio.
This small piece of code records roughly two seconds of audio from the default microphone and plots the spectrogram. It was a bit tricky to figure out, so I thought I'd share the code for anyone else trying to do the same thing.
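#!/usr/bin/python
import struct
import time

import alsaaudio as aa
import numpy as np
import matplotlib.pyplot as plt

SAMPLERATE = 8000   # samples per second
PERIODSIZE = 160    # frames per read (20 ms at 8 kHz)
CHANNELS = 1        # mono
CARD = 'default'

# Open the capture device in non-blocking mode: 16-bit little-endian, mono.
inp = aa.PCM(aa.PCM_CAPTURE, aa.PCM_NONBLOCK, CARD)
inp.setchannels(CHANNELS)
inp.setrate(SAMPLERATE)
inp.setformat(aa.PCM_FORMAT_S16_LE)
inp.setperiodsize(PERIODSIZE)

# Start empty so no dummy sample ends up at the front of the recording.
sound = np.array([], dtype=np.int16)

if __name__ == '__main__':
    ctr = 20000
    while ctr > 0:
        ctr -= 1
        # Non-blocking read: l is the number of frames returned (0 if none are ready).
        l, data = inp.read()
        if l > 0:
            # With one channel, each frame is a single signed 16-bit sample.
            samples = struct.unpack('<%dh' % l, data)
            sound = np.append(sound, np.array(samples, dtype=np.int16))
        time.sleep(0.0001)
    print(sound.size)

    # Uncomment to plot the raw waveform:
    #plt.plot(sound)
    #plt.ylabel('amplitude')
    #plt.xlabel('time')
    #plt.show()

    Pxx, freqs, bins, im = plt.specgram(sound, NFFT=1024, Fs=SAMPLERATE, noverlap=900, cmap=plt.cm.gist_heat)
    plt.show()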
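As an aside, the struct.unpack step can probably be swapped for numpy's frombuffer, which reads the raw bytes straight into an int16 array. The sketch below is only an illustration and not part of the script above: the helper name pcm_bytes_to_array and the synthetic 440 Hz tone are made up so that it runs without a microphone, and it assumes the same format as the recording (signed 16-bit little-endian, one channel).

```python
import struct

import numpy as np

SAMPLERATE = 8000  # same rate as the recording script


def pcm_bytes_to_array(data):
    """Decode signed 16-bit little-endian mono PCM bytes into an int16 array.

    Hypothetical helper for illustration; assumes one sample per frame.
    """
    return np.frombuffer(data, dtype='<i2')


# Build one 160-frame period of a 440 Hz tone to stand in for the bytes
# that inp.read() would return (an assumption, no audio device is used here).
t = np.arange(160) / SAMPLERATE
tone = (10000 * np.sin(2 * np.pi * 440 * t)).astype(np.int16)
fake_period = struct.pack('<160h', *tone.tolist())

decoded = pcm_bytes_to_array(fake_period)
print(decoded.size, decoded.dtype)
```

Note that frombuffer returns a read-only view of the bytes, so call .copy() on the result if the samples need to be modified in place.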