Wednesday, January 23, 2008

American Sign Language Finger Spelling Recognition System

Allen, J., Pierre, K., and Foulds, R. American Sign Language Finger Spelling Recognition System. (2003) IEEE.

Summary:

Allen et al. created an ASL fingerspelling recognition system using a neural network and an 18-sensor CyberGlove. The authors propose that a wearable glove-based recognition system can help translate ASL into English and assist deaf (and even deaf-blind) people by allowing them to converse with the hearing.

The authors used a character set of 24 letters, omitting 'J' and 'Z' because those two require arm motion; the remaining 24 letters are formed with static hand positions alone. Data from the CyberGlove was collected and recognized in a Matlab program, and a second program, written in LabVIEW, output the corresponding audio for each recognized character.

The recognition system for ASLFSR is a perceptron network with an 18x24 input matrix (18 sensors, 24 characters) and a 24x24 desired output (an identity matrix over the recognized symbols). The network was trained with Matlab's "adapt" function.
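To make the setup concrete, here is a minimal sketch of that architecture in Python/NumPy: an 18-input, 24-output perceptron layer with hard-limit units and one-hot (identity-matrix) targets, trained with the classic perceptron rule (roughly what Matlab's "adapt" applies incrementally for a perceptron layer). The glove readings are random stand-ins, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(0)

n_sensors, n_letters = 18, 24
X = rng.random((n_letters, n_sensors))  # stand-in: one averaged glove reading per letter
T = np.eye(n_letters)                   # desired outputs: identity matrix

W = np.zeros((n_letters, n_sensors))
b = np.zeros(n_letters)

def predict(x):
    # Hard-limit activation: each of the 24 output units fires 0 or 1
    return (W @ x + b > 0).astype(float)

# Perceptron learning rule: nudge weights by (target - output) * input
for _ in range(100):
    for x, t in zip(X, T):
        err = t - predict(x)
        W += np.outer(err, x)
        b += err

correct = sum(np.array_equal(predict(x), t) for x, t in zip(X, T))
```

Because each output unit only has to separate one letter's sensor vector from the other 23, the perceptron rule converges whenever those classes are linearly separable in the 18-dimensional sensor space.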

The system worked well for a single user, with recognition accuracy of up to 90%.


Discussion:

The authors claim that they can achieve better accuracy by training the network on data from multiple subjects, but I completely disagree. That's like saying a hand-tailored suit fits alright, but the pin-stripe at the blue-light special is better since it was designed for the average Joe.

To improve their accuracy they should improve their model. Perceptrons are not that powerful, since their hard-limit activation clobbers values down to 0 or 1, and using different neurons (Adalines, perhaps?) might improve the results. Also, neural networks sometimes work better with more than two layers, and data from 18 non-distinct inputs would probably benefit from even a 3-layer NN. Multilayer NNs are notoriously tricky to design "well" (i.e., guess and check).
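As a rough illustration of that suggestion, here is a small multilayer network on the same 18-in/24-out shape, with smooth sigmoid units instead of hard limits and trained by backpropagation on squared error. The hidden-layer size (36), learning rate, and random stand-in data are all my assumptions, not anything from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

n_in, n_hidden, n_out = 18, 36, 24
X = rng.random((n_out, n_in))   # stand-in glove readings, one per letter
T = np.eye(n_out)               # one-hot targets, as in the paper

W1 = rng.normal(0, 0.5, (n_in, n_hidden))
W2 = rng.normal(0, 0.5, (n_hidden, n_out))

lr = 0.5
losses = []
for _ in range(2000):
    H = sigmoid(X @ W1)                  # hidden activations
    Y = sigmoid(H @ W2)                  # output activations
    losses.append(np.mean((Y - T) ** 2))
    # Backprop of squared error through the sigmoid layers
    dY = (Y - T) * Y * (1 - Y)
    dH = (dY @ W2.T) * H * (1 - H)
    W2 -= lr * H.T @ dY
    W1 -= lr * X.T @ dH

accuracy = np.mean(np.argmax(Y, axis=1) == np.arange(n_out))
```

Because the sigmoid is differentiable, gradients flow through the hidden layer, which is exactly what a hard-limit perceptron stack cannot do; the price is the guess-and-check tuning of layer sizes and learning rate noted above.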

2 comments:

D said...

I think what they were trying to say is that with more data from different users, you could get a better idea about the average positions of hands and have templates that were less esoteric and noisy.

But I do agree with your statement that having more users would not improve the recognition accuracy for that one user, since that one user supposedly uses the same esoteric and weird hand signs the same each time.

Brandon said...

Well, according to the paper, they "tested other networks" and the perceptron worked the best. Of course, it would have been nice if they had told us what those other networks were...