-----------------------------------------------------------------------------
                  B O S T O N   U N I V E R S I T Y

                     Computer Science Department
                                 and
          BU American Sign Language Linguistic Research Group

                     joint  C O L L O Q U I U M

                     Wednesday December 3, 1997
                  3:00pm (Coffee served at 2:45pm)
                       Seminar Room / MCS 135
                        111 Cummington Street
---------------------------------------------------------------------------

        American Sign Language Recognition From Visual Input

                          Dimitris Metaxas
                     University of Pennsylvania

We present an approach to continuous American Sign Language (ASL)
recognition that uses three-dimensional arm-motion data as input. To
obtain accurate three-dimensional movement parameters of ASL sentences,
we use, interchangeably, computer vision methods for three-dimensional
object shape and motion parameter extraction and an Ascension
Technologies Flock of Birds. The sentences are drawn from a 53-sign
vocabulary and vary widely in structure. The movement parameters serve
as features for Hidden Markov Models (HMMs).

To improve recognition performance, we model context-dependent HMMs and
present a novel method of coupling three-dimensional computer vision
methods with HMMs: vision methods temporally segment the data stream,
and the geometric properties of the resulting segments constrain the
HMM framework for recognition.

We show in experiments with a 53-sign vocabulary that three-dimensional
features outperform two-dimensional features in recognition
performance. Furthermore, we demonstrate that context-dependent
modeling and the coupling of vision methods and HMMs have the potential
to improve the accuracy of continuous ASL recognition.

To address coarticulation effects and further improve our recognition
results, we experimented with two different approaches. The first
consists of training context-dependent HMMs and is inspired by speech
recognition systems. The second consists of modeling transient
movements between signs and is inspired by the characteristics of ASL
phonology. Our experiments verified that the second approach yields
better recognition results.

This is joint work with C. Vogler.

Host: Stan Sclaroff (sclaroff@cs.bu.edu)
----------------------------------------------------------------------------
For directions see http://cs-www.bu.edu/colloquium
----------------------------------------------------------------------------