Friday, 19 June 2015

Analysing sound in Python

I'm trying to build a simple word recognition system in Python. As a first step, I needed a way to get audio sample data from my microphone into a NumPy array. After a lot of searching and experimenting, I finally found a library that works well for this task: pyalsaaudio.

This small piece of code records roughly two seconds of audio from the default microphone and plots the spectrogram. This was actually a bit tricky to figure out, so I thought I'd share the code for anyone out there who might be trying to do the same.
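#!/usr/bin/python

import time

import alsaaudio as aa
import matplotlib.pyplot as plt
import numpy as np

SAMPLERATE = 8000       # samples per second
PERIODSIZE = 160        # frames per read (20 ms at 8 kHz)
CHANNELS   = 1          # mono, so one frame is one sample
CARD = 'default'

# Open the capture device in non-blocking mode.
# (Recent pyalsaaudio releases deprecate these setters in favour of
# keyword arguments to PCM(); this is the older API.)
inp = aa.PCM(aa.PCM_CAPTURE, aa.PCM_NONBLOCK, CARD)
inp.setchannels(CHANNELS)
inp.setrate(SAMPLERATE)
inp.setformat(aa.PCM_FORMAT_S16_LE)
inp.setperiodsize(PERIODSIZE)

if __name__ == '__main__':
    chunks = []
    ctr = 20000
    while ctr > 0:
        ctr -= 1
        l, data = inp.read()
        # l is 0 when no data is ready and negative after an overrun,
        # so only decode when a full period was actually returned.
        if l > 0:
            # Signed 16-bit little-endian samples, matching PCM_FORMAT_S16_LE.
            chunks.append(np.frombuffer(data, dtype='<i2'))
        time.sleep(0.0001)

    # Join the periods once at the end instead of growing an array per read.
    sound = np.concatenate(chunks) if chunks else np.array([], dtype=np.int16)
    print(sound.size)

    #plt.plot(sound)
    #plt.ylabel('amplitude')
    #plt.xlabel('time')
    #plt.show()

    Pxx, freqs, bins, im = plt.specgram(sound, NFFT=1024, Fs=SAMPLERATE,
                                        noverlap=900, cmap=plt.cm.gist_heat)
    plt.show()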
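
If you don't have a microphone handy, one way to sanity-check the decoding and spectrogram steps is to feed them a synthetic tone. The following is only a rough sketch and not part of the script above: `fake_capture` and `spectrogram` are hypothetical helper names I made up for illustration, and the spectrogram is a plain NumPy short-time FFT with a Hann window, so its scaling will not match matplotlib's `specgram` output exactly. It reuses the same sample rate, `NFFT` and `noverlap` values as the recording code.

```python
import numpy as np

SAMPLERATE = 8000   # same rate as the recording script
NFFT = 1024         # FFT window length in samples
NOVERLAP = 900      # samples shared by consecutive windows


def fake_capture(freq_hz, seconds):
    """Return raw S16_LE bytes for a sine tone (stand-in for microphone data)."""
    t = np.arange(int(seconds * SAMPLERATE)) / float(SAMPLERATE)
    tone = 0.5 * 32767 * np.sin(2 * np.pi * freq_hz * t)
    return tone.astype('<i2').tobytes()


def spectrogram(samples, nfft=NFFT, noverlap=NOVERLAP, fs=SAMPLERATE):
    """Squared-magnitude short-time FFT (assumed Hann window, no scaling)."""
    step = nfft - noverlap
    window = np.hanning(nfft)
    starts = range(0, len(samples) - nfft + 1, step)
    frames = np.array([samples[s:s + nfft] * window for s in starts])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    # Rows are frequency bins, columns are time slices.
    return power.T, freqs


raw = fake_capture(1000, 2.0)                 # two seconds of a 1 kHz tone
sound = np.frombuffer(raw, dtype='<i2')       # decode bytes to int16 samples
power, freqs = spectrogram(sound)
peak_hz = freqs[power.mean(axis=1).argmax()]  # strongest frequency on average
print(sound.size, power.shape, peak_hz)
```

If the decoding is right, the strongest frequency should land on the tone you generated. With `NFFT=1024` at 8000 Hz each bin is about 7.8 Hz wide, so that is the resolution to expect from the real spectrogram too.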
