Thursday, September 6, 2007

MARQS: Retrieving Sketches Using Domain- and Style-Independent Features Learned from a Single Example Using a Dual-Classifier

Summary:

Brandon's paper proposes an algorithm for searching a laboratory notebook using sketches as queries. The algorithm needs to be invariant to users' drawing styles and to the rotation and scale of a sketch. It also must be able to recognize a sketched symbol from only one example, since a user searching through their notes might have drawn a certain sketch only once. To test the algorithm, Brandon created the MARQS system, a media library that is queried with sketches.

To reduce rotation variance, the algorithm rotates each sketch so that its major axis is horizontal. The major axis is defined as the line between the two points in the sketch that are farthest apart. The four features used are bounding box aspect ratio, pixel density, average curvature, and the number of corners found. If only one example of a sketch has been drawn, a query runs a single classifier that simply checks the error between the query's features and those of the drawn example. A linear classifier is used when more than one example sketch is available at query time.
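The rotation step and the single-example check can be sketched roughly as follows. This is only an illustration, not the paper's implementation: the function names are made up, the farthest-pair search is brute force, and the sum of squared differences is an assumed error measure (the summary only says the classifier checks "the error" between feature vectors). Only the aspect ratio feature is computed here; the other three are left out.

```python
import math
from itertools import combinations

def major_axis(points):
    """Return the two points that are farthest apart (brute force, O(n^2))."""
    return max(combinations(points, 2), key=lambda pq: math.dist(pq[0], pq[1]))

def rotate_to_horizontal(points):
    """Rotate the sketch about one major-axis endpoint so that axis lies along x."""
    (x0, y0), (x1, y1) = major_axis(points)
    angle = math.atan2(y1 - y0, x1 - x0)
    cos_a, sin_a = math.cos(-angle), math.sin(-angle)
    return [((x - x0) * cos_a - (y - y0) * sin_a,
             (x - x0) * sin_a + (y - y0) * cos_a) for x, y in points]

def aspect_ratio(points):
    """Bounding box width divided by height."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width, height = max(xs) - min(xs), max(ys) - min(ys)
    return width / height if height > 0 else float("inf")

def feature_error(query, example):
    """Assumed error measure: sum of squared feature differences."""
    return sum((q - e) ** 2 for q, e in zip(query, example))

def rank_by_single_example(query, examples):
    """Order stored sketch names by feature error, best match first."""
    return sorted(examples, key=lambda name: feature_error(query, examples[name]))
```

In practice the features would probably need to be scaled to comparable ranges before comparing them, since a raw corner count would otherwise dominate a ratio like pixel density.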

The average ranking (or classifier rank) for each sketch query was 1.51. As the system was used more often, the ranking improved, since more example sketches accumulated and queries moved away from the single classifier toward the linear classifier.


Discussion:

The system seems to work pretty well for such a simple search algorithm, especially since it only uses four features. The downward trend in the average rank is also very promising, since searches will likely be performed often.

MARQS' main limitation is in the features it can use, since the system runs a linear classification on a freely drawn sketch. Each feature has to be invariant to the number of strokes, rotation, and scale. Yet the number of perceived corners could easily change between sketches if somebody drew a tree with one jagged line one time and with small dashes another time. Merging strokes whose endpoints are close together into one larger stroke might be beneficial to the system, and then other features such as average stroke length might be used.
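To make the endpoint-merging idea a bit more concrete, here is a rough sketch of how it might look. Nothing here comes from the paper: the function names, the greedy strategy, and the `tolerance` value are all assumptions, and it only joins strokes that follow each other in drawing order.

```python
import math

def merge_close_strokes(strokes, tolerance=5.0):
    """Greedily join consecutive strokes when one ends near where the next begins.

    Each stroke is a list of (x, y) points; tolerance is a hypothetical
    distance threshold in the same units as the points.
    """
    merged = []
    for stroke in strokes:
        if merged and math.dist(merged[-1][-1], stroke[0]) <= tolerance:
            merged[-1].extend(stroke)  # bridge the small gap
        else:
            merged.append(list(stroke))
    return merged

def average_stroke_length(strokes):
    """Mean path length per stroke (a merged stroke includes the bridged gaps)."""
    def length(stroke):
        return sum(math.dist(a, b) for a, b in zip(stroke, stroke[1:]))
    return sum(length(s) for s in strokes) / len(strokes)
```

With something like this, a tree outline drawn as a run of small dashes would collapse into a few long strokes, so stroke-based features would look more like those of the same tree drawn in one jagged line.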

Another thing to think about: if the journal system is ever implemented, how do you distinguish between a sketch and handwriting? I recently saw some great research from Auckland on this, but I can't find the paper online to link here. If you really want to know more, ask me for a copy.

2 comments:

rg said...

There is also the semantically troubling problem of there being many shapes and sizes of trees. Textually, I can query for 'tree' and refine from there, but a sketch limits that kind of general query without some kind of sketch ontology or thesaurus. Just look at the clustering and relevance of tags on Flickr for an example.

- D said...

Another comment about variations within a gesture class. We hit on this in class today, but what if I want trees in general to match a certain album (pictures of participation in Arbor Day, maybe), but then only certain types of trees (pine trees, for instance) to match my pictures of the Rocky Mountain National Forest in Colorado?