Friday, August 31, 2007

Specifying Gestures by Example

Summary:

Rubine's gesture recognition system, GRANDMA, is a single-stroke gesture recognizer and toolkit that lets developers add gestures to their applications. Gestures can be useful when they provide a means for intuitive input. As an example, the paper shows how a gesture-based drawing program (GDP) could use gestures to create simple shapes and edit them. These gestures can be created with GRANDMA by first defining what types (or classes) of gestures will be used and then collecting examples for each class. Rubine found empirically that fifteen examples per class should suffice.

Drawn gestures are composed of an array of time-stamped points. Thirteen features are calculated for each gesture, such as the initial angle of the gesture, the angle and length of the bounding box diagonal, the total length and total rotation of the gesture, the smoothness of the gesture, and the time taken to draw it. These features are invariant to gesture placement (i.e. where the gesture was drawn), but they are sensitive to its scale and rotation.
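
For a concrete sense of the feature computation, here is a minimal sketch of how a few Rubine-style features might be computed from a list of (x, y, t) points. The function name and the reduced feature set are my own simplification, not code from the paper:

    import math

    def stroke_features(points):
        """Compute a few Rubine-style features from a list of (x, y, t) points.

        A simplified sketch -- not the full thirteen-feature set from the paper.
        """
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]

        # Length and angle of the bounding box diagonal.
        width = max(xs) - min(xs)
        height = max(ys) - min(ys)
        bbox_diag_len = math.hypot(width, height)
        bbox_diag_angle = math.atan2(height, width)

        # Total path length: sum of distances between consecutive points.
        total_len = sum(
            math.hypot(points[i + 1][0] - points[i][0],
                       points[i + 1][1] - points[i][1])
            for i in range(len(points) - 1))

        # Total rotation: sum of signed turning angles along the stroke.
        total_rotation = 0.0
        for i in range(1, len(points) - 1):
            dx1 = points[i][0] - points[i - 1][0]
            dy1 = points[i][1] - points[i - 1][1]
            dx2 = points[i + 1][0] - points[i][0]
            dy2 = points[i + 1][1] - points[i][1]
            total_rotation += math.atan2(dx1 * dy2 - dy1 * dx2,
                                         dx1 * dx2 + dy1 * dy2)

        # Duration: time from the first to the last sample.
        duration = points[-1][2] - points[0][2]

        return [bbox_diag_len, bbox_diag_angle, total_len, total_rotation, duration]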

To classify a gesture, its feature vector is dotted with a weight vector (plus a bias term) for each defined gesture class, and the class whose evaluation gives the maximum value is chosen. The weight vectors are computed during training: a feature vector is calculated for each example of a class, the class's mean feature vector is taken, and the weights are then derived from the class means and the examples' pooled covariance matrix (http://mathworld.wolfram.com/Covariance.html), essentially a linear discriminant classifier.
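
In code, the classifier reduces to evaluating w0 + w·f for every class and taking the argmax. Below is a minimal sketch of both training and classification using NumPy; the function names and the exact pooled-covariance estimate are my own simplification of the usual linear-discriminant recipe, not Rubine's exact formulation:

    import numpy as np

    def train_linear_classifier(examples_by_class):
        """Estimate per-class weights from example feature vectors.

        examples_by_class maps a class name to an array of shape
        (n_examples, n_features).  Returns a dict mapping each class to
        (w0, w): a bias term and a weight vector.
        """
        means = {c: ex.mean(axis=0) for c, ex in examples_by_class.items()}

        # Pooled covariance matrix over all classes' examples.
        n_features = next(iter(means.values())).shape[0]
        pooled = np.zeros((n_features, n_features))
        dof = 0
        for c, ex in examples_by_class.items():
            centered = ex - means[c]
            pooled += centered.T @ centered
            dof += len(ex) - 1
        pooled /= dof
        inv_cov = np.linalg.inv(pooled)

        weights = {}
        for c, mu in means.items():
            w = inv_cov @ mu                 # weight vector for the class
            w0 = -0.5 * mu @ inv_cov @ mu    # bias term
            weights[c] = (w0, w)
        return weights

    def classify(features, weights):
        """Return the class whose linear evaluation w0 + w . f is largest."""
        f = np.asarray(features)
        return max(weights, key=lambda c: weights[c][0] + weights[c][1] @ f)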

Overall the gesture system worked very well, though the recognition rate dropped as the number of gesture classes increased. Adding training examples improved the recognition rate up to around 50 examples per class; beyond that the rate appeared either to plateau or to suffer from overfitting.


Discussion:

Rubine's paper shows that sketch recognition can be simple, fast, and reliable if the user is constrained in certain ways. Gestures are easy to define with GRANDMA, and the calculations to classify gestures can happen in real time. The system also had outstanding recognition results with small numbers of gesture classes, between 5 and 10. Even as the number of classes was increased to 30, the recognition rate dropped but remained acceptable at around 96.5%.

The main problem with the system is that it requires a lot of constraints on the user's end. Like Palm Pilot Graffiti, over time the user becomes accustomed to drawing in a certain way. This isn't necessarily a bad thing. With any new appliance or application, people need to be trained to use it. My new toaster works much differently than my old one, and I'm still getting adjusted to the settings. Even with newer non-gesture software, such as Tablet PC handwriting recognition, I have grown accustomed to drawing my lowercase Ls in cursive since the printed version is confused with the number 1 too often. Yet when an application is billed as intuitive, there is much less wiggle room for how much training is acceptable. If Photoshop does not work as I intended, I'm likely to blame myself for a mistake, whereas if the computer does not recognize my circle gesture, I'm more likely to blame the software.

In the case of GRANDMA, the rotation and scale sensitivity is a bit too much in my opinion; I would try to normalize everything to a standardized bounding box to eliminate scale (a rough sketch of what I mean is below). Still, this sensitivity could be acceptable in some situations, such as full keyboard gestures where we try to distinguish '/' versus '|' versus '1'.
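
As an illustration of the kind of scale normalization I have in mind (my own hypothetical preprocessing step, not part of GRANDMA), each stroke could be rescaled so its bounding box fits a fixed-size square before the features are computed:

    def normalize_to_unit_box(points, size=1.0):
        """Rescale (x, y) points so the stroke's bounding box fits a size x size square."""
        xs = [x for x, _ in points]
        ys = [y for _, y in points]
        x0, y0 = min(xs), min(ys)
        scale = size / max(max(xs) - x0, max(ys) - y0, 1e-9)  # guard against degenerate strokes
        return [((x - x0) * scale, (y - y0) * scale) for x, y in points]

Note that this preserves the stroke's aspect ratio; rescaling each axis independently would also remove aspect-ratio differences, which may or may not be desirable depending on the gesture set.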


Rubine, D. 1991. Specifying gestures by example. In Proceedings of the 18th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '91). ACM Press, New York, NY, 329-337.

http://portal.acm.org/citation.cfm?id=122753

Wednesday, August 29, 2007

Introduction to Sketch Recognition

Summary:

The paper comprises a brief but comprehensive look at the sketch recognition field. Some of the topics covered include current pen-based hardware, pen-centric software, and various uses for Tablet PCs.

The main focus of the Tablet PC discussion is an overview of how they could be used in an educational environment. Pen-based technology has been shown to have a mixed effect on student performance in the classroom, but overall the reception by students has been positive. Teachers feel that some Tablets and software help with class lectures, such as using Windows Journal to create presentation templates. Yet Tablet hardware also limits teachers, since they are tethered to a projector by cables or forced to work within a small amount of space. Tablet software can also help students learn on their own: MathPad allows students to check their math notes and homework, and Physics Simulator lets students model and simulate mechanical engineering diagrams.

In two case studies, teachers in middle and high school evaluated Tablet PCs in their classes. The two teachers used the technology differently, but both enjoyed using the computers to enhance their teaching. One teacher used Tablets to record demos, either at home or in class, and archive them. The other used Tablets to create presentation templates that could be saved and written over during class; the templates could be reused annually to save the teacher class preparation time.


Discussion:

A good discussion topic for this paper would be "Where do you think this technology is heading?" and "How can we improve it further for classroom use?" Tablet technology is getting cheaper, and software is providing better support for pens. Although cost is a limitation right now (smart boards vs. regular whiteboards and projectors, notebooks vs. tablets), what people should think about is "What if we had no limitations?" It's a fun science fiction question to ask at this time.

Sketchpad

Summary:

Sketchpad was the first sketch recognition system and was developed in the early sixties. The system used a light pen and keyboard buttons to interact with the computer and create design-quality drawings. The light pen drew or selected objects on the computer screen, while the keyboard toggled various constraints to place on the current drawing or selection.

The constraint system was one of the key features that allowed Sketchpad to produce drawings that looked good. With constraints, a user could draw perfect lines and circles without having to worry about pen noise; they would simply hold a button indicating that what is drawn next is a line or circle. Furthermore, touching up a drawing was easy with constraints that could force selected groups of lines to be horizontal, parallel, or perpendicular. Sketchpad also allowed corner-snapping constraints, which locked a corner of one object onto an edge or corner of another. The constraint system was implemented symbolically: constraints applied to the drawing's variables, and the list of constraints could be evaluated to ensure they were satisfied. Relaxation and the one-pass method are the two constraint satisfaction techniques mentioned.
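
To give a flavor of the relaxation idea (this toy example is my own, not Sketchpad's actual implementation), satisfying a "make this line horizontal" constraint can be done by repeatedly nudging the endpoints until the constraint error is near zero:

    def relax_horizontal(p1, p2, iterations=50, step=0.5):
        """Toy relaxation: nudge two endpoints toward equal y-coordinates."""
        (x1, y1), (x2, y2) = p1, p2
        for _ in range(iterations):
            error = y2 - y1          # the constraint is satisfied when error == 0
            y1 += step * error / 2   # move each endpoint part of the way
            y2 -= step * error / 2
        return (x1, y1), (x2, y2)

As I understand it, relaxation in Sketchpad iterates like this over many interacting constraints at once until the total error is small enough, while the one-pass method tries to satisfy the constraints directly in a single ordering when that is possible.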

Sketchpad worked best when a sketch required a lot of repeated patterns and shapes. The system could create instances (shallow copies) or copies (deep copies) of drawn objects, which could then be resized, rotated, and moved around the drawing space. A design requiring a lot of repetition, such as the hexagonal example in the paper, could be created very quickly without having to repeatedly draw hundreds of hexagons.
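
In modern terms, the instance/copy distinction is roughly the shallow versus deep copy distinction. A quick Python illustration (the hexagon data here is made up):

    import copy

    master = {"shape": "hexagon", "vertices": [(0, 0), (1, 0), (1.5, 0.87)]}

    instance = copy.copy(master)       # shallow copy: shares the master's vertex list
    duplicate = copy.deepcopy(master)  # deep copy: gets its own vertex list

    master["vertices"].append((1, 1.73))
    print(len(instance["vertices"]))   # 4 -- edits to the master show up in the instance
    print(len(duplicate["vertices"]))  # 3 -- the deep copy is unaffected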

Discussion:

Sketchpad is a great system that was revolutionary for its time. As we discussed in class, it was the first sketch recognition system, and the paper highlighted some key areas where sketch recognition technology could be beneficial, such as architecture, art, and engineering. Sketchpad also anticipated some object-oriented programming techniques by providing shallow and deep copies of drawn objects.

The use of constraints was a good way to ensure that the drawing looked good, but it also forced the user to remember many keyboard commands. If each constraint is kept on its own key, the keyboard can only handle as many constraints as there are keys; key combinations could allow exponentially more (up to 2^n for n keys), but at a high cost in usability. To alleviate this burden it might have been better to have two separate screens: the drawing screen and a constraint menu screen. The constraint buttons on the side of the Sketchpad window could be shifted to the new constraint menu screen, with each button pointing to a menu option. The options would follow from the constraints described in the paper, with drawing and selection constraints in hierarchical menus.

Introduction

Name: Aaron Wolin

Year: First Year PhD student

Email: awolin at neo dot tamu dot edu

Academic Interests:

My current academic interests consist of Sketch Recognition, AI, and HCI. I’ve been involved with sketch recognition projects for over a year now and find them very exciting from an HCI and AI perspective. The use of a pen as input heavily constrains traditional mouse and keyboard input possibilities while simultaneously allowing for new applications, such as handwriting programs. Sketch recognition also requires a great deal of AI, since computers need to have knowledge of what is being drawn.

Relevant Experience:

  • Impro-Visor – A research application from Harvey Mudd College that teaches amateur jazz musicians to compose solos. Musicians enter notes for a solo within a composition window, and advice concerning the musical “correctness” is displayed to them. The advice manager also lets students pick from scales and tones that might fit their solo. http://www.cs.hmc.edu/~keller/jazz/improvisor.html
  • Circuit Diagram Recognizer – A continuing sketch recognition research project at Harvey Mudd College. The overall goal of the project is to give students feedback on their drawn circuit diagrams, such as those from class notes. An ideal scenario would be for an engineering student to draw a diagram freehand with an accompanying truth table, run it through our recognizer, and then see whether the circuit diagram is implemented correctly based on the truth table provided. The project is heavily AI-based since we do not constrain student drawing styles. http://www.cs.hmc.edu/~alvarado/research/sketch.html
  • Document Finding (before OCR) – In another project at Harvey Mudd College I worked with a document digitizing company called Laserfiche. Our project’s focus was to find and crop documents in pictures taken by digital cameras. This essentially uses the camera as a scanner and would allow for more portable document digitizing and OCR software.

Why I'm taking this class:

Although I’ve already had some sketch recognition experience my research has been focused on free-style sketch recognition. There are many other areas (gesture based systems, sketch beautification) that I have not explicitly worked with, and I want to expand my knowledge of the field.

What I hope to gain:

A general expansion of my sketch recognition knowledge, as well as having fun and learning techniques that can be applied to my research.

What I'll be doing in 5 years:

I'll (hopefully) be finishing up my research and time here at Texas A&M.

What I'll be doing in 10 years:

I don't know, going to Mars. My current thoughts are to go into industrial research, but anything can change in 10 years. Four years ago I wouldn't have said I'd be going to graduate school. In 10 years anything can happen.

Non-academic interests:

Reading, movies, playing poker, concerts, board games, mixology

Fun Story:

Every year at Harvey Mudd my friends and I would make our own sushi. One of my friends had some sushi-making supplies, and he taught everybody how to make rolls. During our third year we discovered a website where you can get really fresh fish, but the catch was that you had to order over $50 worth to qualify for shipping. The last two years we had so much fish that, after we had gorged ourselves, we were shouting at people outside our door to come in and eat sushi.