Date of Award

1995

Document Type

Thesis

Publisher

Edith Cowan University

Degree Name

Bachelor of Science Honours

Faculty

Faculty of Science, Technology and Engineering

First Supervisor

Dr Hon Nin Cheung

Abstract

Human speech signals are inherently multi-component non-stationary signals. Recognition schemes for classification of non-stationary signals generally require some kind of temporal alignment to be performed. Examples of techniques used for temporal alignment include hidden Markov models and dynamic time warping. Attempts to incorporate temporal alignment into artificial neural networks have resulted in the construction of time-delay neural networks. The nonstationary nature of speech requires a signal representation that is dependent on time. Time-frequency signal analysis is an extension of conventional time-domain and frequency-domain analysis methods. Researchers have reported on the effectiveness of time-frequency representations to reveal the time-varying nature of speech. In this thesis, a recognition scheme is developed for temporal-spectral alignment of nonstationary signals by performing preprocessing on the time-frequency distributions of the speech phonemes. The resulting representation is independent of any amount of time-frequency shift and is time-frequency shift-tolerant (TFST). The proposed scheme does not require time alignment of the signals and has the additional merit of providing spectral alignment, which may have importance in recognition of speech from different speakers. A modification to the counterpropagation network is proposed that is suitable for phoneme recognition. The modified network maintains the simplicity and competitive mechanism of the counterpropagation network and has additional benefits of fast learning and good modelling accuracy. The temporal-spectral alignment recognition scheme and modified counterpropagation network are applied to the recognition task of speech phonemes. Simulations show that the proposed scheme has potential in the classification of speech phonemes which have not been aligned in time. To facilitate the research, an environment to perform time-frequency signal analysis and recognition using artificial neural networks was developed. The environment provides tools for time-frequency signal analysis and simulations of of the counterpropagation network.

Share

 
COinS