Download v0.5 of the toolkit for Mac OS 10.5 and higher.
last updated 10/22/2010
What is LPC?
"The basic idea behind linear predictive analysis is that a speech sample can be approximated as a linear combination of past speech samples. By minimizing the sum of the squared differences (over a finite interval) between the actual speech samples and the linearly predicted ones, a unique set of predictor coefficients can be determined. (The predictor coefficients are the weighting coefficients used in the linear combination)"
LPC is essentially a way to do source/filter separation. In speech, it works on the premise that the vocal tract can be considered a slowly time-varying filter into which an excitation (glottal pulse and noise) is fed to produce speech. Using linear prediction, we can estimate the vocal tract filter coefficients. The error signal between the estimated and the actual speech can be considered the excitation signal (the source). The voice can be resynthesized by running the error signal through the inverse of the prediction error filter (an all-pole filter), or by running a synthesized excitation signal through that same filter.
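The source/filter round trip described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the toolkit's code: the function names, the test signal, and the sign convention A(z) = 1 + a1 z^-1 + ... + ap z^-p are all assumptions made for the example. It estimates the coefficients with the autocorrelation method (Levinson-Durbin recursion), computes the error signal with an FIR filter, and rebuilds the frame by feeding that error through the all-pole filter 1/A(z).

```python
import random

def autocorrelation(frame, max_lag):
    """Return r[0..max_lag], the (unnormalized) autocorrelation of the frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations.

    Returns (a, parcor, err): a = [1, a1, ..., ap] for the prediction error
    filter A(z), the reflection (PARCOR) coefficients, and the final
    prediction error energy. PARCOR sign conventions vary between texts.
    """
    a = [1.0] + [0.0] * order
    parcor = []
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:          # silent or degenerate frame: stop early
            break
        acc = r[m] + sum(a[j] * r[m - j] for j in range(1, m))
        k = -acc / err
        parcor.append(k)
        prev = a[:]
        for j in range(1, m):
            a[j] = prev[j] + k * prev[m - j]
        a[m] = k
        err *= 1.0 - k * k
    return a, parcor, err

def error_filter(x, a):
    """FIR prediction error filter: e[n] = sum_j a[j] * x[n - j]."""
    return [sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
            for n in range(len(x))]

def allpole_filter(e, a):
    """IIR synthesis filter 1/A(z): y[n] = e[n] - sum_{j>=1} a[j] * y[n - j]."""
    y = []
    for n in range(len(e)):
        acc = e[n]
        for j in range(1, len(a)):
            if n - j >= 0:
                acc -= a[j] * y[n - j]
        y.append(acc)
    return y

# Hypothetical test frame: an impulse train plus a little noise, shaped by a
# two-pole resonator standing in for the "vocal tract".
rng = random.Random(0)
frame = []
for n in range(240):
    excitation = (1.0 if n % 40 == 0 else 0.0) + 0.01 * rng.uniform(-1.0, 1.0)
    y1 = frame[n - 1] if n >= 1 else 0.0
    y2 = frame[n - 2] if n >= 2 else 0.0
    frame.append(excitation + 1.3 * y1 - 0.8 * y2)

order = 4
a, parcor, err = levinson_durbin(autocorrelation(frame, order), order)
residual = error_filter(frame, a)       # the "source"
rebuilt = allpole_filter(residual, a)   # source fed back through the "filter"
```

Because the synthesis filter is the exact inverse of the error filter, `rebuilt` matches `frame` up to rounding error. Replacing `residual` with a synthetic excitation is what turns this into parametric resynthesis.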
In music, you can use this technique for cross-synthesis by performing linear prediction analysis on two different source signals, and running the resulting excitation of one source into the resonance filter of the other.
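Written as transfer functions, cross-synthesis might look like the following. The symbols are assumptions introduced for illustration: S_1 and S_2 are the two source signals, A_1 and A_2 their prediction error filters of order p (for one analysis frame), E_1 the excitation extracted from the first source, and Y the cross-synthesized output. Gain matching between the two sources is left out.

```latex
\begin{align}
E_1(z) &= A_1(z)\,S_1(z), & A_i(z) &= 1 + \sum_{k=1}^{p} a_{i,k}\,z^{-k}, \\
Y(z)   &= \frac{E_1(z)}{A_2(z)} = \frac{A_1(z)}{A_2(z)}\,S_1(z).
\end{align}
```

In words: the first source is whitened by its own error filter, and the result is coloured by the resonances of the second source. Since both filters are re-estimated every frame, the relation holds frame by frame rather than for the whole signal.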
Basic LPC block diagram (from Digital Processing of Speech Signals):
The LPC toolkit
The idea behind the LPC toolkit is to separate out many of the pieces of LPC into a number of usable Max/MSP objects which can be ordered in different ways to yield a variety of effects for speech processing, synthesis and cross-synthesis. The goal is to give users enough flexibility to easily incorporate the techniques of speech processing in creative, musical ways in real time.
NOTE: These tools are intended for making music, not compressing speech.
Current object list
- This object performs the basic linear prediction analysis. It uses the autocorrelation method, so it is guaranteed to produce stable coefficients (within numerical limits). It outputs both PARCOR and filter coefficients, as well as a coefficient index signal, a filter gain signal and a time-aligned throughput. If you are going to quantize the coefficients, you should use the PARCOR coefficients, which are less prone to instability due to quantization.
In this implementation of LPC, the user has the option to turn pre-emphasis on or off. When analyzing a speech signal, pre-emphasis reduces the effects of the glottal pulse and radiation, so the linear prediction analysis more accurately represents just the vocal tract. However, if pre-emphasis is used, de-emphasis should be applied during resynthesis to recover the low-pass properties of the glottal pulse, etc.
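To illustrate why PARCOR coefficients tolerate quantization, here is a small sketch in plain Python. It is not the object's implementation: the function names, the 1/16 quantization step, and the example values are hypothetical. The idea is that any set of PARCOR coefficients with magnitude below 1 maps, via the step-up recursion, to a stable all-pole filter, so clamping the quantized values inside (-1, 1) is enough to keep the filter stable. The step-down recursion is used as the stability check.

```python
def step_up(parcor):
    """PARCOR (reflection) coefficients -> direct form [1, a1, ..., ap]."""
    a = [1.0]
    for k in parcor:
        prev = a + [0.0]
        m = len(prev) - 1
        a = [prev[j] + k * prev[m - j] for j in range(m + 1)]
    return a

def step_down(a):
    """Direct form [1, a1, ..., ap] -> PARCOR coefficients.

    Stops early if a coefficient with magnitude >= 1 is found, because the
    recursion is undefined there (and the filter is not stable).
    """
    a = list(a)
    parcor = []
    while len(a) > 1:
        m = len(a) - 1
        k = a[m]
        parcor.append(k)
        if abs(k) >= 1.0:
            break
        denom = 1.0 - k * k
        a = [(a[j] - k * a[m - j]) / denom for j in range(m)]
    parcor.reverse()
    return parcor

def is_stable(a):
    """1/A(z) is stable iff every PARCOR coefficient has magnitude < 1."""
    return all(abs(k) < 1.0 for k in step_down(a))

def quantize_parcor(parcor, step=1.0 / 16.0):
    """Round to a uniform grid, clamped strictly inside (-1, 1)."""
    limit = 1.0 - step
    return [max(-limit, min(limit, round(k / step) * step)) for k in parcor]

# Hypothetical analysis result with one coefficient very close to -1.
parcor = [-0.97, 0.85, -0.4, 0.2]
quantized = quantize_parcor(parcor)
a_quantized = step_up(quantized)
```

Here `is_stable(a_quantized)` holds by construction, however coarse the grid. Rounding the direct-form filter coefficients offers no such guarantee, since a small change in one coefficient can push a pole outside the unit circle.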
- A high-order FIR filter that computes the error signal of the linear prediction analysis. This error can also be thought of as the residual, the excitation or the "source" (as in source/filter separation).
In addition to the input for the signal to be filtered, it also has a coefficient input and a coefficient index input.
- A high-order IIR filter that acts as the resynthesis filter in LPC. This object is used to impose the analyzed/synthesized spectral envelope of a sound onto an excitation signal (or any signal... be creative).
This object has the option of enabling de-emphasis to restore the approximate glottal pulse response that may have been removed before analysis.
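A minimal sketch of a matching pre-emphasis/de-emphasis pair, in plain Python, is shown below. It assumes the common first-order form with a coefficient of 0.95; the toolkit's actual filter and coefficient are not stated here, so treat both as assumptions. Pre-emphasis is the FIR filter y[n] = x[n] - alpha * x[n-1], and de-emphasis is its exact inverse, y[n] = x[n] + alpha * y[n-1].

```python
def pre_emphasis(x, alpha=0.95):
    """First-order high-pass tilt: y[n] = x[n] - alpha * x[n-1]."""
    y = []
    prev_in = 0.0
    for sample in x:
        y.append(sample - alpha * prev_in)
        prev_in = sample
    return y

def de_emphasis(x, alpha=0.95):
    """Inverse of pre_emphasis: y[n] = x[n] + alpha * y[n-1]."""
    y = []
    prev_out = 0.0
    for sample in x:
        prev_out = sample + alpha * prev_out
        y.append(prev_out)
    return y

# Round trip on a short hypothetical signal.
signal = [0.0, 0.5, 1.0, 0.5, 0.0, -0.5, -1.0, -0.5]
restored = de_emphasis(pre_emphasis(signal))
```

`restored` equals `signal` up to rounding error, provided both stages use the same `alpha`. In an analysis/resynthesis chain the analysis sits between the two stages, so de-emphasis restores the low-pass tilt to whatever the resynthesis filter produces.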
- In order to synthesize a flexible, parametric excitation signal (for speech) from band-limited impulse train and noise generators, we must be able to extract the fundamental frequency of the source. The algorithm used in this object is based on IRCAM's Yin algorithm and the Tartini project (for the clarity measure).
- This object outputs a band-limited impulse train. The algorithm it implements is the sum of windowed sincs (SWS) method described in Tim Stilson's PhD thesis.
- This object takes a coefficient signal and a coefficient index signal as inputs and outputs a list formatted for Max/MSP's filtergraph~ object, which displays the frequency response of the filter described by the coefficients. Max/MSP's filtergraph~ object can only display filters up to 49th order, so the frequency response will be a bit inaccurate at orders > 49 (though the correct general shape will be there).
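For reference, the frequency response that such a display represents can be computed directly from the filter coefficients. The sketch below, in plain Python, evaluates the all-pole filter gain / A(z) on the upper half of the unit circle and returns magnitudes in dB. The function name, the number of points, and the example resonator are assumptions for illustration; it does not reproduce the list format that filtergraph~ expects.

```python
import cmath
import math

def allpole_response_db(a, gain=1.0, n_points=256):
    """Magnitude response of gain / A(z) in dB at n_points frequencies.

    a = [1, a1, ..., ap]; point i corresponds to pi * i / n_points radians
    per sample, so the range runs from 0 up to just below Nyquist.
    """
    response = []
    for i in range(n_points):
        w = math.pi * i / n_points
        A = sum(c * cmath.exp(-1j * w * j) for j, c in enumerate(a))
        magnitude = gain / max(abs(A), 1e-12)   # guard against a zero of A
        response.append(20.0 * math.log10(magnitude))
    return response

# Hypothetical two-pole resonator with poles at radius 0.95, angle pi/4.
radius, angle = 0.95, math.pi / 4.0
a = [1.0, -2.0 * radius * math.cos(angle), radius * radius]
response = allpole_response_db(a)
peak_index = max(range(len(response)), key=lambda i: response[i])
```

The peak lands close to index 64 of 256, that is, near pi/4, which is where the resonator's poles sit.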