Research outputs 2013

Audio and visual speech recognition: recent trends

Hao Wei Lee
Kah Phooi Seng, Edith Cowan University
Li-Mian K. Ang, Edith Cowan University

Document Type

Book Chapter

Publisher

IGI Global

Faculty

Faculty of Health, Engineering and Science

School

School of Engineering / Centre for Communications and Electronics Research

RAS ID

17172

Comments

Lee, H. W., Seng, K. P., & Ang, L. K. (2013). Audio and visual speech recognition: Recent trends. In J. Tian and L. Chen (Eds.). Intelligent image and video interpretation: Algorithms and applications (pp. 42-86). Location: IGI Global.

Abstract

This chapter gives a brief introduction to the origins of audio-visual speech recognition and to the techniques commonly used by researchers in the field. It discusses background theory for commonly used feature extraction and classification methods in both audio and visual processing, with emphasis on Mel-Frequency Cepstral Coefficients and on contour/geometric-based lip feature extraction with the corresponding tracking methods (Yingjie, Haiyan, Yingjie, & Jinyang, 2011; Liu & Cheung, 2011). The proposed solution concepts include time derivatives of mel-frequency cepstral coefficients for audio feature extraction, chroma-colour-based (YCbCr) face segmentation, feature point extraction, a localized active contour tracking algorithm, and Hidden Markov Models incorporating the Viterbi algorithm. The chapter is intended to be informative for newcomers to speech processing but is not sufficient on its own for mastery of the field; the additional suggested reading materials should help readers reach mastery more quickly.

DOI

10.4018/978-1-4666-3958-4.ch002
