Summary:
Sezgin et al.'s paper describes a way to find corners (vertices) of freely drawn symbols in order to break the symbol into lines and arcs. These lines and arcs can then be used to define the symbol, which can be passed into a geometrical recognizer for classification.
The vertex-finding algorithm works by locating points of high curvature and low pen speed. Curvature is the change in direction with respect to arc length at a given point, and speed is the change in arc length over time. Although the paper does not mention this, the direction at a point p can be found by fitting a least squares line over a window of points from p - k to p + k and taking the slope of the fit (while watching for vertical lines, of course); the curvature then follows from how quickly that direction changes along the stroke. Once we have the curvature and speed values for each point in the stroke, we can find the local minima that drop below a speed threshold and the local maxima that rise above a curvature threshold. In this paper the speed threshold is 90% of the average speed, and the curvature threshold is the average curvature. These minima and maxima are the candidate vertices for speed and curvature, respectively.
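To make this step concrete, here is a minimal sketch of how I would compute the per-point values and pick the candidates. It is not the paper's code. The function names, the window size k=2, and the central differences are my own assumptions, and I swap in an orthogonal least squares fit (the principal axis of the window) for the direction so that vertical segments need no special case.

```python
import math

def arc_lengths(points):
    """Cumulative arc length at each (x, y) sample."""
    total = [0.0]
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        total.append(total[-1] + math.hypot(x1 - x0, y1 - y0))
    return total

def speeds(points, times):
    """Change in arc length over change in time (central difference)."""
    s, n = arc_lengths(points), len(points)
    out = []
    for i in range(n):
        a, b = max(i - 1, 0), min(i + 1, n - 1)
        dt = times[b] - times[a]
        out.append((s[b] - s[a]) / dt if dt > 0 else 0.0)
    return out

def direction(points, i, k=2):
    """Angle of the orthogonal least squares line through points i-k..i+k.

    Assumption: principal-axis fit instead of y-on-x regression, so the
    result is an undirected angle (defined modulo pi).
    """
    window = points[max(i - k, 0):min(i + k, len(points) - 1) + 1]
    mx = sum(x for x, _ in window) / len(window)
    my = sum(y for _, y in window) / len(window)
    sxx = sum((x - mx) ** 2 for x, _ in window)
    syy = sum((y - my) ** 2 for _, y in window)
    sxy = sum((x - mx) * (y - my) for x, y in window)
    return 0.5 * math.atan2(2 * sxy, sxx - syy)

def curvatures(points, k=2):
    """Magnitude of the change in direction per unit arc length."""
    s = arc_lengths(points)
    theta = [direction(points, i, k) for i in range(len(points))]
    out = [0.0] * len(points)
    for i in range(1, len(points) - 1):
        d = theta[i + 1] - theta[i - 1]
        d = (d + math.pi / 2) % math.pi - math.pi / 2  # wrap, angles are mod pi
        ds = s[i + 1] - s[i - 1]
        out[i] = abs(d) / ds if ds > 0 else 0.0
    return out

def speed_candidates(speed, factor=0.9):
    """Local minima below factor * average speed (0.9 per the summary)."""
    threshold = factor * sum(speed) / len(speed)
    return [i for i in range(1, len(speed) - 1)
            if speed[i] < threshold
            and speed[i] <= speed[i - 1] and speed[i] < speed[i + 1]]

def curvature_candidates(curv):
    """Local maxima above the average curvature."""
    threshold = sum(curv) / len(curv)
    return [i for i in range(1, len(curv) - 1)
            if curv[i] > threshold
            and curv[i] > curv[i - 1] and curv[i] >= curv[i + 1]]
```

On an L-shaped stroke where the pen slows down at the bend, both candidate lists should contain the index of the bend and nothing else.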
The intersection of the speed and curvature candidates gives a starting set of corners. Sezgin et al. then go through the remaining candidates and calculate a metric for each. The curvature metric is the magnitude of the difference between the direction values k points before and k points after the candidate, divided by the arc length between those two points. The speed metric is just 1 - (speed at point / max speed). The remaining candidates are then sorted within their respective lists from highest metric value to lowest.
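The two metrics are short enough to write down. This is a sketch under my own assumptions: the function names are made up, the inputs are plain per-point lists of speed, direction (in radians, already unwrapped), and cumulative arc length, and I read the curvature metric as the magnitude of the direction change between the points k before and k after the candidate, divided by the arc length between them.

```python
def speed_metric(speed, i):
    """1 - (speed at point / max speed); close to 1 when the pen nearly stops."""
    top = max(speed)
    return 1.0 - speed[i] / top if top > 0 else 0.0

def curvature_metric(direction, arc_length, i, k=2):
    """Direction change across a window of k points, per unit arc length.

    Assumption: direction values are unwrapped angles in radians.
    """
    lo, hi = max(i - k, 0), min(i + k, len(direction) - 1)
    span = arc_length[hi] - arc_length[lo]
    return abs(direction[lo] - direction[hi]) / span if span > 0 else 0.0

def rank(candidates, metric):
    """Sort candidate indices from highest metric value to lowest."""
    return sorted(candidates, key=metric, reverse=True)
```

For example, `rank(remaining_speed, lambda i: speed_metric(speed, i))` puts the candidate with the strongest slowdown first, where `remaining_speed` is whatever list of leftover speed candidates you have.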
The algorithm then takes the top remaining corner from each list (remaining speed and remaining curvature) and generates two new "hybrid fits": current corner set + speed candidate, and current corner set + curvature candidate. Each hybrid fit is tested with a least squares error computed between consecutive vertices, and the fit with the lower error becomes the next possible corner fit for our stroke. More fits are generated this way until all remaining candidate corners are used up. The final fit is chosen using an error threshold: of the fits whose error falls below the threshold, the one with the fewest corners wins.
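Here is roughly how I picture that loop, as a sketch rather than the paper's implementation. The names are mine, the stroke endpoints are always treated as vertices, and the error is the mean squared perpendicular distance from each stroke point to the chord between its surrounding vertices. That error definition is my assumption about what "least squares between each vertex" means.

```python
import math

def segment_error(points, a, b):
    """Sum of squared perpendicular distances from points[a..b] to chord a-b."""
    (x0, y0), (x1, y1) = points[a], points[b]
    length = math.hypot(x1 - x0, y1 - y0)
    if length == 0:
        return sum((x - x0) ** 2 + (y - y0) ** 2 for x, y in points[a:b + 1])
    return sum(((x1 - x0) * (y0 - y) - (x0 - x) * (y1 - y0)) ** 2
               for x, y in points[a:b + 1]) / length ** 2

def fit_error(points, corners):
    """Mean squared error of the polyline through the endpoints and corners."""
    vertices = sorted(set(corners) | {0, len(points) - 1})
    total = sum(segment_error(points, a, b)
                for a, b in zip(vertices, vertices[1:]))
    return total / len(points)

def hybrid_fits(points, initial, speed_ranked, curvature_ranked):
    """Grow the corner set one candidate at a time, keeping every fit."""
    current = set(initial)
    speed_left = [c for c in speed_ranked if c not in current]
    curv_left = [c for c in curvature_ranked if c not in current]
    fits = [(sorted(current), fit_error(points, current))]
    while speed_left or curv_left:
        # Best remaining candidate from each list competes for the next slot.
        options = speed_left[:1] + curv_left[:1]
        best = min(options, key=lambda c: fit_error(points, current | {c}))
        current.add(best)
        speed_left = [c for c in speed_left if c != best]
        curv_left = [c for c in curv_left if c != best]
        fits.append((sorted(current), fit_error(points, current)))
    return fits

def choose_fit(fits, max_error):
    """Fewest corners among fits under the threshold, else the lowest error."""
    ok = [f for f in fits if f[1] <= max_error]
    if not ok:
        return min(fits, key=lambda f: f[1])
    return min(ok, key=lambda f: len(f[0]))
```

Falling back to the lowest-error fit when nothing clears the threshold is my own choice; the summary above does not say what happens in that case.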
Since a least squares line fit does not work for curved regions, the algorithm approximates any arcs in a stroke with Bezier curves. If the error of an approximation is greater than a threshold, we split the arc at the middle and create two new curve approximations. These curves are then used to determine the fit errors for the curved regions.
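The split-in-the-middle idea can be sketched as a small recursive function. Everything specific here is an assumption on my part: the function names, placing the two inner control points a third of the arc length along tangents estimated from the first and last pairs of points, and measuring error as the mean squared distance from each stroke point to the nearest of 50 samples on the curve.

```python
import math

def bezier_point(ctrl, t):
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = ctrl
    u = 1 - t
    b0, b1, b2, b3 = u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3
    return (b0 * x0 + b1 * x1 + b2 * x2 + b3 * x3,
            b0 * y0 + b1 * y1 + b2 * y2 + b3 * y3)

def unit(dx, dy):
    """Unit vector, or (0, 0) for a zero-length input."""
    n = math.hypot(dx, dy)
    return (dx / n, dy / n) if n else (0.0, 0.0)

def fit_bezier(pts):
    """Control points: endpoints plus two points pushed out along end tangents."""
    length = sum(math.hypot(bx - ax, by - ay)
                 for (ax, ay), (bx, by) in zip(pts, pts[1:]))
    k = length / 3  # assumed scale for the inner control points
    t0 = unit(pts[1][0] - pts[0][0], pts[1][1] - pts[0][1])
    t1 = unit(pts[-2][0] - pts[-1][0], pts[-2][1] - pts[-1][1])
    return [pts[0],
            (pts[0][0] + k * t0[0], pts[0][1] + k * t0[1]),
            (pts[-1][0] + k * t1[0], pts[-1][1] + k * t1[1]),
            pts[-1]]

def curve_error(pts, ctrl, samples=50):
    """Mean squared distance from each point to the nearest curve sample."""
    curve = [bezier_point(ctrl, j / samples) for j in range(samples + 1)]
    return sum(min((x - cx) ** 2 + (y - cy) ** 2 for cx, cy in curve)
               for x, y in pts) / len(pts)

def approximate_arc(pts, max_error):
    """Fit one curve; if it is too far off, split at the middle and recurse."""
    ctrl = fit_bezier(pts)
    if len(pts) < 4 or curve_error(pts, ctrl) <= max_error:
        return [ctrl]
    mid = len(pts) // 2
    return (approximate_arc(pts[:mid + 1], max_error)
            + approximate_arc(pts[mid:], max_error))
```

Because both halves share the middle point, consecutive curves in the result always join end to end.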
The system can then beautify the sketch, which is a trivial process if we already have all of the vertices and the Bezier curve approximations. Overall the system had great results, and users "praised" the research because it allowed them to draw objects free-hand. The system accurately found the corners 96% of the time.
Discussion:
This paper is a cornerstone for vertex finding, pun intended. The research is well discussed, the algorithms are relatively easy to implement, and the results are fantastic.
The main thing that I would have liked to see included in the paper was a mention of where and how the algorithm can make a mistake. From my experience I know that corner finding is difficult on poorly drawn circles. Speed variations for circles can be slight, and if the circle looks more like an oval (as in the number 0) there can be a small protrusion or bump at the bottom of the circle that acts as a corner. I'd be interested to see how their system performed on these types of shapes.
Also, thresholds are very tricky when you take into account shapes that include both polylines and arcs. If a complex shape needs to be drawn in a single stroke, a user can sometimes pause in the middle of drawing to think about how to draw the next section. This pause will definitely register as a vertex, but it also drags down the speed average and hurts the finding of more subtle corners.
2 comments:
I think you'll find out how it performs on those shapes firsthand.
I think sketch recognition has so many tradeoffs that it eventually becomes about what domain you're interested in and you have to take that into consideration in building the system. (In thinking about your talk on thresholds)
Going back and reading this paper again, I really hated how it was written. I wish the content was explained less ambiguously. Besides that, I agree that the concept in the Sezgin paper makes one think why they didn't think of it themselves. Detecting corners by exploiting the fact that people slow down when they draw them? Like my first college CS teacher once said: "Knows answer when told."
On your other point about users pausing while sketching, I also thought about it, but I didn't consider its downsides. I figured that a user who was familiar with the algorithm's workings would exploit it. If I were that user, I would intentionally pause longer when I reach a corner so that the recognizer would have an easier time detecting that corner. But there's that sacrifice of naturalness, and on those more subtle corners you mentioned, the user would have to apply the exploit even more. Even though your point was focused on unintentional pausing, it also seems to lessen the advantages of intentional pausing.