April 1, 2003

The Modified Group Delay Function and Its Application to Phoneme Recognition

Citation

Murthy, H. A., & Gadde, V. (2003, April). The modified group delay function and its application to phoneme recognition. In 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings (ICASSP '03) (Vol. 1, pp. I-68). IEEE.

Abstract

We explore a new spectral representation of speech signals through group delay functions. By themselves, group delay functions are noisy and difficult to interpret because zeroes close to the unit circle in the z-domain clutter the spectra. A new modified group delay function that reduces the effects of zeroes close to the unit circle is used. Assuming that this new function is minimum phase, the modified group delay spectrum is converted to a sequence of cepstral coefficients. A preliminary phoneme recognizer is built using features derived from these cepstra. Results are compared with those obtained from features derived from the traditional mel frequency cepstral coefficients (MFCC). The baseline MFCC performance is 34.7%, while that of the best modified group delay cepstrum is 39.2%. The performance of the composite MFCC feature, which includes the derivatives and double derivatives, is 60.7%, while that of the composite modified group delay feature is 57.3%. When these two composite features are combined, an improvement of ≈ 2% is achieved (62.8%). When this new system is combined with linear frequency cepstra (LFC), performance improves by another ≈ 0.8%.
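The abstract does not spell out the formulas, so the following is only a minimal sketch of the pipeline it describes, based on the commonly cited form of the modified group delay function. The standard group delay is computed from the transforms of x[n] and n·x[n]; the modified version replaces the |X|² denominator with a cepstrally smoothed spectrum, which is one way to tame zeroes near the unit circle. The function names, the parameters `alpha`, `gamma`, and `lifter`, their default values, and the DCT-II used to obtain cepstra are all assumptions made for illustration and are not taken from the abstract.

```python
import numpy as np

def _spectra(x, n_fft):
    """Return X = FFT(x[n]) and Y = FFT(n * x[n]) on the half spectrum."""
    n = np.arange(len(x))
    return np.fft.rfft(x, n_fft), np.fft.rfft(n * x, n_fft)

def group_delay(x, n_fft=512):
    """Standard group delay: (X_R*Y_R + X_I*Y_I) / |X|^2."""
    X, Y = _spectra(x, n_fft)
    num = X.real * Y.real + X.imag * Y.imag
    # The floor avoids division by zero where |X| vanishes.
    return num / np.maximum(np.abs(X) ** 2, 1e-12)

def modified_group_delay(x, n_fft=512, lifter=8, alpha=0.4, gamma=0.9):
    """Modified group delay (illustrative; alpha, gamma, lifter are assumed)."""
    X, Y = _spectra(x, n_fft)
    # Cepstrally smoothed magnitude spectrum S: keep low-quefrency terms only.
    log_mag = np.log(np.maximum(np.abs(X), 1e-12))
    ceps = np.fft.irfft(log_mag, n_fft)
    ceps[lifter:n_fft - lifter + 1] = 0.0  # preserves the cepstrum's symmetry
    S = np.exp(np.fft.rfft(ceps, n_fft).real)
    tau = (X.real * Y.real + X.imag * Y.imag) / S ** (2 * gamma)
    # Compress the dynamic range while keeping the sign.
    return np.sign(tau) * np.abs(tau) ** alpha

def mgd_cepstra(tau_m, n_ceps=12):
    """Cepstral coefficients 1..n_ceps via an unnormalized DCT-II (assumed)."""
    K = len(tau_m)
    k = np.arange(K)
    basis = np.cos(np.pi * np.outer(np.arange(1, n_ceps + 1), k + 0.5) / K)
    return basis @ tau_m

# Sanity check: an impulse delayed by 5 samples has a group delay of 5
# at every frequency.
x = np.zeros(32)
x[5] = 1.0
gd = group_delay(x)
mgd = modified_group_delay(x)
features = mgd_cepstra(mgd)
```

For the delayed impulse, the standard group delay is flat, so the smoothing and compression steps leave a constant spectrum and the resulting cepstra (which exclude the zeroth term) are essentially zero. On real speech frames the interesting behavior comes from how `alpha`, `gamma`, and `lifter` are tuned, which this sketch does not attempt.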

