I'm trying to build a simple word recognition system in Python. As a first step, I needed a way to get audio sample data from my microphone and store it in a NumPy array. After a lot of searching and experimenting, I finally found a library that works well for this task: pyalsaaudio.
The short script below records roughly two seconds of audio from the default microphone and plots its spectrogram. This was a bit tricky to figure out, so I thought I'd share the code for anyone who might be trying to do the same.
    #!/usr/bin/python
    import struct
    import time

    import alsaaudio as aa
    import numpy as np
    import matplotlib.pyplot as plt

    SAMPLERATE = 8000   # samples per second
    PERIODSIZE = 160    # frames returned per read
    CHANNELS = 1        # mono
    CARD = 'default'

    # Open the default capture device in non-blocking mode
    inp = aa.PCM(aa.PCM_CAPTURE, aa.PCM_NONBLOCK, CARD)
    inp.setchannels(CHANNELS)
    inp.setrate(SAMPLERATE)
    inp.setformat(aa.PCM_FORMAT_S16_LE)   # signed 16-bit little-endian
    inp.setperiodsize(PERIODSIZE)

    sound = np.array([0])

    if __name__ == '__main__':
        ctr = 20000
        while ctr > 0:
            ctr -= 1
            l, data = inp.read()
            # l is 0 when no data is ready and negative on an overrun
            if l > 0:
                samples = struct.unpack('<' + 'h' * l, data)
                sound = np.append(sound, np.array(samples))
            time.sleep(0.0001)
        print(sound.size)

        #plt.plot(sound)
        #plt.ylabel('amplitude')
        #plt.xlabel('time')
        #plt.show()

        Pxx, freqs, bins, im = plt.specgram(sound, NFFT=1024, Fs=SAMPLERATE,
                                            noverlap=900, cmap=plt.cm.gist_heat)
        plt.show()
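One possible refinement, in case it helps anyone: since the device is set to PCM_FORMAT_S16_LE, the raw bytes from each read can probably be decoded directly with np.frombuffer rather than struct.unpack, and the chunks can be kept in a list and joined once at the end instead of calling np.append on every read. This is only a sketch, assuming mono 16-bit little-endian data as in the script above. The helper names decode_s16le and join_chunks are made up for illustration, and the byte buffers are faked with struct.pack so that it runs without a microphone.

```python
import struct

import numpy as np

SAMPLERATE = 8000  # Hz, same value as in the recording script


def decode_s16le(data):
    """Hypothetical helper: raw signed 16-bit little-endian PCM bytes -> array."""
    return np.frombuffer(data, dtype='<i2')


def join_chunks(chunks):
    """Hypothetical helper: concatenate all decoded chunks in one step."""
    if not chunks:
        return np.zeros(0, dtype='<i2')
    return np.concatenate(chunks)


# Stand-ins for the data returned by two inp.read() calls
fake_reads = [
    struct.pack('<4h', 0, 1000, -1000, 32767),
    struct.pack('<2h', -32768, 5),
]

chunks = [decode_s16le(buf) for buf in fake_reads]
sound = join_chunks(chunks)

# Number of samples divided by the sample rate gives the recorded duration
duration = sound.size / SAMPLERATE
print(sound, duration)
```

In the recording loop, the idea would be to append decode_s16le(data) to a list whenever l > 0 and call join_chunks once after the loop finishes.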