Monday, November 5, 2007

Three main concerns in sketch recognition and an approach to addressing them

Summary:

Mahoney and Fromherz discuss the problems involved in matching models to hand-drawn sketches of stick figures. The figures are drawn as simple polylines, but can appear in any configuration alongside other figures or distracting background objects.

The system input (a drawn figure) is highly variable, and the authors identify three sources of that variability. Failures of co-termination occur when strokes over- or under-shoot one another (that is, the endpoints of two strokes that are supposed to meet either cross or fail to touch). Articulation problems arise when strokes are over- or under-segmented in preprocessing. Interaction with background context arises when the figure to be matched or recognized is set against context strokes, other figures, or noise and other distracting data.

In this paper, matching involves building a graph of the model figure to be found and searching for a mapping between that model and a subgraph of the data. Ambiguity is handled by adding alternative, plausible substructures to the graph. This happens through proximity linking, virtual junction splitting, spurious segment jumping, and continuity tracing. Together, these four methods create new links in the structure by searching for subtle connections, splitting existing strokes, merging strokes together, and creating new stroke segments.
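To make the first of these methods concrete, here is a minimal sketch of proximity linking. It is my own illustration, not the authors' code: the function name `proximity_links`, the representation of a stroke as a pair of endpoints, and the `max_gap` threshold are all assumptions. The idea is simply to propose a link between any two strokes whose endpoints nearly touch, so that a co-termination failure does not break the graph.

```python
import math
from itertools import combinations

def proximity_links(strokes, max_gap):
    """Propose links between strokes whose endpoints are close together.

    strokes: list of ((x1, y1), (x2, y2)) endpoint pairs (assumed format).
    max_gap: largest endpoint distance still treated as "connected"
             (hypothetical threshold, not a value from the paper).
    Returns a list of (i, j, gap) tuples, one per proposed link.
    """
    links = []
    for (i, a), (j, b) in combinations(enumerate(strokes), 2):
        # Smallest distance between any endpoint of a and any endpoint of b.
        gap = min(math.dist(p, q) for p in a for q in b)
        if gap <= max_gap:
            links.append((i, j, gap))
    return links

# Two strokes that almost meet, plus one far-away distractor stroke.
example_strokes = [
    ((0, 0), (0, 10)),
    ((0.5, 10.2), (5, 15)),
    ((20, 20), (30, 30)),
]
example_links = proximity_links(example_strokes, max_gap=1.0)
```

With these example strokes only the first two are linked; the distractor stays isolated. A larger `max_gap` tolerates sloppier drawing but also adds more spurious links for the matcher to sort out.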

Subgraph matching is translated into a constraint satisfaction problem (CSP), where each stroke (node) in the graph is a variable, and the constraints are the edges between the nodes. The preferred match minimizes total link length while keeping the segment lengths in appropriate ratios to one another. Matching can also be influenced by a priori knowledge that defines certain components.
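The CSP formulation can be sketched with a small backtracking search. This is only an illustration under my own assumptions, not the solver from the paper: I treat each model part as a variable, each data stroke as a candidate value, and require every model edge to land on an existing data link. The names `match_subgraph`, `model_edges`, and `data_links` are hypothetical, and the segment-length ratio preference is left out for brevity, so only total link length is minimized.

```python
def match_subgraph(model_edges, data_links):
    """Find the lowest-cost mapping of model parts onto data strokes.

    model_edges: list of (part_a, part_b) pairs that must be connected.
    data_links:  dict {(stroke_a, stroke_b): link_length} of candidate links.
    Returns (cost, mapping), or (inf, None) if no consistent mapping exists.
    """
    parts = sorted({p for edge in model_edges for p in edge})
    strokes = sorted({s for link in data_links for s in link})

    # Links are undirected, so store both orientations for easy lookup.
    length = {}
    for (a, b), d in data_links.items():
        length[(a, b)] = d
        length[(b, a)] = d

    best = [float("inf"), None]

    def consistent(assign, part, stroke):
        # Each data stroke may be used by at most one model part.
        if stroke in assign.values():
            return False
        # Every model edge to an already-assigned part needs a data link.
        for a, b in model_edges:
            other = b if a == part else a if b == part else None
            if other is not None and other in assign:
                if (stroke, assign[other]) not in length:
                    return False
        return True

    def backtrack(i, assign):
        if i == len(parts):
            cost = sum(length[(assign[a], assign[b])] for a, b in model_edges)
            if cost < best[0]:
                best[0], best[1] = cost, dict(assign)
            return
        for stroke in strokes:
            if consistent(assign, parts[i], stroke):
                assign[parts[i]] = stroke
                backtrack(i + 1, assign)
                del assign[parts[i]]

    backtrack(0, {})
    return best[0], best[1]

# A tiny "figure": a torso joined to a head, an arm, and a leg.
example_model = [("head", "torso"), ("torso", "arm"), ("torso", "leg")]
# Data strokes s1..s4 form the figure; s5 and s6 are background clutter.
example_data = {("s1", "s2"): 1.0, ("s2", "s3"): 0.5,
                ("s2", "s4"): 0.2, ("s5", "s6"): 3.0}
example_cost, example_mapping = match_subgraph(example_model, example_data)
```

In this example the torso can only map to `s2`, since it is the only stroke with three links, while the head, arm, and leg are interchangeable. That interchangeability is exactly the kind of ambiguity the length-ratio preference and a priori knowledge mentioned above would help resolve.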

Discussion:

The use of a CSP seems to work very well in this case. The fact that the system can discern a stick figure in a sea of seemingly random lines is rather amazing, since I can barely see the figure myself.

One issue is that this system does not seem able to distinguish "fixed" figures, e.g. a square from a parallelogram. "Loose" figures must also be connected endpoint to endpoint. For a model with only a few strokes this is simple, but if the user were trying to draw, say, a centipede, the system could find many plausible matches within a collection of strokes.

1 comment:

- D said...

Concerning that sea of strokes where a stick figure was found. I agree it's pretty neat the figure can be found. But I wonder if that's just simply because of the severe restrictions of the type of figure it's looking for (certain number and size of lines, connected in a very specific way by their endpoints). I wonder if you asked it to find a person-like object, but only had general guidelines as to what that person looked like, what would happen.

Still impressive, though.