Summary:
Researchers from Georgia Tech have created a gesture recognition toolkit called GT2k. Its purpose is to let researchers focus on system development rather than on implementing recognition algorithms. The toolkit works in conjunction with the Hidden Markov Model Toolkit (HTK) to provide HMM tools to a developer. GT2k usage can be divided into four stages: preparation, training, validation, and recognition.
Preparation involves the developer setting up an initial gesture model, semantic gesture descriptions, and gesture examples. Each gesture model is a separate HMM; GT2k supports automatic model generation for novices and user-specified models for experts. Grammars are written in a rule-based fashion and allow complex gestures to be defined in terms of simpler ones. Data collection is done with whatever sensing devices the application requires.
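To make the rule-based grammar idea concrete, here is a small sketch in the style of HTK's grammar notation, which GT2k builds on. The gesture names (`up`, `square`, etc.) are hypothetical, not from the paper; the point is only that a compound gesture can be defined as a sequence of simpler ones:

```
$direction = up | down | left | right;
$square    = up right down left;
( < $direction | $square > )
```

Here `|` is alternation, `$square` is a composite gesture built from four primitive strokes, and the angle brackets denote one-or-more repetitions, so the grammar accepts any sequence of primitive directions and squares.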
Training the GT2k models can be done in two ways: cross-validation and leave-one-out. Cross-validation involves separating the data into 2/3 for training and 1/3 for testing. Leave-one-out involves training on the entire set minus one data element, and repeating this process for each element in the set. The results for cross-validation are computed in a single batch, whereas the overall statistics for leave-one-out are aggregated from each individual run's performance.
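The two splitting strategies can be sketched in a few lines of Python. This is not GT2k's code, just an illustration of how the data would be partitioned under each scheme:

```python
import random

def cross_validation_split(examples, train_fraction=2/3, seed=0):
    """Shuffle and split examples into a training set and a test set
    (roughly 2/3 train, 1/3 test, as in GT2k's cross-validation)."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def leave_one_out_splits(examples):
    """Yield (train, test) pairs, holding out each example exactly once."""
    for i in range(len(examples)):
        yield examples[:i] + examples[i + 1:], [examples[i]]

data = list(range(9))
train, test = cross_validation_split(data)
assert len(train) == 6 and len(test) == 3

folds = list(leave_one_out_splits(data))
assert len(folds) == 9                      # one training run per example
assert all(len(t) == 1 for _, t in folds)   # each test set holds one element
```

The sketch also makes the cost difference visible: cross-validation trains once, while leave-one-out trains `len(examples)` times.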
Validation checks that training produced a model that is "accurate enough" for recognition. The accuracy figure is computed from substitution, insertion, and deletion errors.
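HTK's standard accuracy metric combines those three error types against the number of reference labels N, as Acc = (N - S - D - I) / N. A minimal sketch (the function name is my own):

```python
def recognition_accuracy(n_ref, substitutions, deletions, insertions):
    """HTK-style accuracy: penalizes substitution (S), deletion (D),
    and insertion (I) errors against the number of reference labels N.

        Acc = (N - S - D - I) / N
    """
    return (n_ref - substitutions - deletions - insertions) / n_ref

# e.g. 100 reference gestures: 5 substituted, 2 deleted, 3 spuriously inserted
acc = recognition_accuracy(100, 5, 2, 3)
assert abs(acc - 0.90) < 1e-9
```

Note that insertions make this metric stricter than plain percent-correct: a recognizer that hallucinates extra gestures is penalized even if it also finds all the real ones.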
Recognition occurs once valid data is received by a trained model. GT2k abstracts this process away from the user of the system and calculates the likelihood of each model using the Viterbi algorithm.
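The Viterbi step can be sketched for a discrete-observation HMM: score the observation sequence against each gesture's HMM and pick the model with the highest best-path log-likelihood. This is a toy illustration, not GT2k's implementation (which works on continuous sensor features), and the gesture names are invented:

```python
import math

def viterbi_log_likelihood(obs, start_p, trans_p, emit_p):
    """Log-probability of the single best state path for `obs` under a
    discrete HMM given as start/transition/emission probability dicts."""
    states = list(start_p)
    v = {s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]]) for s in states}
    for o in obs[1:]:
        v = {s: max(v[p] + math.log(trans_p[p][s]) for p in states)
                + math.log(emit_p[s][o])
             for s in states}
    return max(v.values())

def recognize(obs, models):
    """Return the name of the model that best explains the observations."""
    return max(models, key=lambda name: viterbi_log_likelihood(obs, *models[name]))

# Two single-state toy models that differ only in their emission biases.
models = {
    "wave": ({"s": 1.0}, {"s": {"s": 1.0}}, {"s": {"up": 0.8, "down": 0.2}}),
    "nod":  ({"s": 1.0}, {"s": {"s": 1.0}}, {"s": {"up": 0.2, "down": 0.8}}),
}
assert recognize(["up", "up", "down"], models) == "wave"
```

Comparing log-likelihoods across per-gesture HMMs like this is the standard HMM classification recipe, which is what GT2k hides behind its interface.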
The remainder of the paper listed possible applications for GT2k, including a gesture panel for controlling a car stereo, a blink recognition system, a mobile sign language system, and a "smart" workshop that understands what actions a user is performing.
Discussion:
GT2k seems like a good system that can help beginning researchers add HMMs to their gesture systems without worrying about implementation issues. Yet the applications mentioned for GT2k are rather weak in both concept and results. HMMs are really only "needed" for one of the applications (sign language); the others could be handled more easily with simpler techniques, or by placing the sensors somewhere other than the hand.
This was a decent paper in writing style, presentation, and (possibly) contribution, but I'm curious to know which researchers have used GT2k and what systems they have created with it.
As a side note, I'm also unclear on why leave-one-out training is desirable, since with a large data set it requires one training run per example and could take a hell of a long time.
1 comment:
I liked this paper a lot because it has the right idea. But I didn't like this paper as much for the same reasons you had in terms of its execution. A well-written paper with a wonderful idea, just in need of another direction to hit its stride.