Monday, April 14, 2008

Feature selection for grasp recognition from optical markers

Summary:

Chang et al. reduced the number of markers needed on a vision-based hand grasp system from 30 to 5 while retaining around a 90% recognition rate.

Six grasps are used for classification: cylindrical, spherical, lumbrical, two-finger pinch, tripod, and lateral tripod. The posterior probability of a class yk given an observation x is modeled with a softmax function: the exponential of the class weights applied to the observation, exp(wk * x), divided by the sum of exp(wj * x) over all classes j.
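To make the softmax concrete, here is a minimal sketch in plain Python. The function name, the weight values, and the two-feature observation are made up for illustration and are not taken from the paper.

```python
import math

def softmax_posteriors(weights, x):
    """Return P(y_k | x) for every class k.

    weights: one weight vector per class (list of lists)
    x: one observation (list of floats)
    """
    # Linear score w_k * x for each class
    scores = [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]
    # Subtracting the largest score avoids overflow in exp()
    # and leaves the resulting probabilities unchanged.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical numbers: 3 classes, 2 features
probs = softmax_posteriors([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]], [2.0, 1.0])
print(probs)
```

The outputs are positive and sum to 1, so they can be read directly as class probabilities; the class with the largest score w_k * x gets the largest probability.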

The weights are determined by maximum conditional likelihood estimation from the training set of observations and classes (X, Y): the best weights are the ones that maximize the log likelihood of the training labels. Gradient descent on the negative log likelihood is used to find those weights. Input features are chosen with a "sequential wrapper algorithm" that examines one feature at a time with respect to a target class.
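As a rough sketch of the weight fitting, the following plain-Python code climbs the gradient of the log likelihood of a softmax model (equivalent to descending the negative log likelihood). The toy data, learning rate, and step count are assumptions for illustration; the paper's actual optimizer settings are not given here.

```python
import math

def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, x):
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in W]

def log_likelihood(W, X, Y):
    """Sum of log P(y | x) over the training set."""
    return sum(math.log(softmax(class_scores(W, x))[y]) for x, y in zip(X, Y))

def fit(X, Y, n_classes, lr=0.1, steps=200):
    n_features = len(X[0])
    W = [[0.0] * n_features for _ in range(n_classes)]
    for _ in range(steps):
        grad = [[0.0] * n_features for _ in range(n_classes)]
        for x, y in zip(X, Y):
            p = softmax(class_scores(W, x))
            for k in range(n_classes):
                # Gradient of the log likelihood w.r.t. w_k:
                # (indicator[y == k] - P(k | x)) * x
                err = (1.0 if k == y else 0.0) - p[k]
                for i in range(n_features):
                    grad[k][i] += err * x[i]
        for k in range(n_classes):
            for i in range(n_features):
                W[k][i] += lr * grad[k][i]  # step uphill on the log likelihood
    return W

# Toy data (made up): two features plus a constant bias term, two classes
X = [[1.0, 0.0, 1.0], [0.9, 0.1, 1.0], [0.0, 1.0, 1.0], [0.1, 0.9, 1.0]]
Y = [0, 0, 1, 1]

W0 = [[0.0] * 3 for _ in range(2)]
W = fit(X, Y, n_classes=2)
predictions = [max(range(2), key=lambda k: class_scores(W, x)[k]) for x in X]
print(log_likelihood(W0, X, Y), log_likelihood(W, X, Y), predictions)
```

The log likelihood of the fitted weights is higher than that of the all-zero starting point, which is all "maximum conditional likelihood" asks for.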

The grasp data covers 38 objects, each grasped while the hand carried the full set of 30 markers. An "optimal" small set of markers was then chosen by forward and backward selection.
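Forward selection can be sketched as a greedy loop: start with no markers and repeatedly add the one that improves a score the most. In the sketch below the score function is a made-up stand-in (it counts distinct pieces of "information" covered by the chosen markers); in a real wrapper the score would be the classifier's accuracy when trained on that marker subset.

```python
def forward_select(candidates, k, score):
    """Greedily pick k items, each time adding the one that maximizes score(subset)."""
    selected = []
    remaining = list(candidates)
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical stand-in for "accuracy with this marker subset":
# markers 0 and 1 are redundant with each other, marker 3 is the most informative.
info = {0: {"a", "b"}, 1: {"a", "b"}, 2: {"c"}, 3: {"d", "e", "f"}}

def toy_score(subset):
    return len(set().union(*(info[f] for f in subset)))

chosen = forward_select(range(4), 3, toy_score)
print(chosen)
```

Marker 1 is never picked here because it adds nothing once marker 0 is in the set, which is the kind of redundancy a wrapper method can exploit. Backward selection is the mirror image: start with all markers and repeatedly drop the one whose removal hurts the score the least.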

The results indicate that the reduced set of 5 markers has an "accuracy retention" rate of 92-97%.


Discussion:

Reducing the number of sensors with forward and backward selection is nice, but adding just a few more sensors brings the accuracy up to where it actually plateaus. From 10 sensors on there is almost no change in accuracy, but between 5 and 10 sensors the accuracy can jump by 5%, or 1 in every 20 grasps, which is a huge difference when taking user frustration into account.

1 comment:

- D said...

I think you could do even better if you did feature extraction, like PCA or something else, instead of plain feature selection. Or if you did a more clever algorithm, like floating bidirectional feature selection or +l-r stepwise selection.