Usability Study of the Finger Counter, a Human-Computer Interface


Christen Ng and Jessie Burger
CRA DMP Project at Boston University, Summer 2003

Abstract

The Finger Counter is a computer-vision system that identifies the number of fingers held up in front of an inexpensive webcam in real time.  Usability studies were conducted to determine the limitations of the system.  Tests performed included: a voice-interactive test, a drawing test, and a questionnaire that gauged the ease of use of the system, as well as a limitations test that established conservative functional ranges for use of the Finger Counter.

Introduction

Finger Counter is a computer-vision system that counts the number of fingers held up in front of a video camera in real time.  The system is designed as a simple and universal human-computer interface: potential applications include educational tools for young children and supplemental input devices, particularly for persons with disabilities.  The interface is language independent and requires minimal education and computer literacy. Finger Counter uses background differencing and edge detection to locate the outline of the hand.  The system then processes the polar-coordinate representation of the pixels on the outline to identify and count fingers: fingers are recognized as protrusions that meet particular threshold requirements.  The system also logs the frequency of different inputs over a given time interval.  The Finger Counter interface was implemented under Linux using Video4Linux.  The system was tested extensively under various lighting and background conditions to find the most favorable environment for the Finger Counter to function.  Varying these conditions also helped establish limitation criteria for the system.  The following is a compilation of the various tests conducted, their descriptions, and the results obtained:

I. Tests Performed

EXPERIMENT 1 - Voice Interactive Test

Demonstration and Explanation

Test subjects were given a brief introduction to the Finger Counter: how it works, its intended purpose, the types of tests they would complete, and a demonstration of how to use it. They were then given the opportunity to familiarize themselves with the program by experimenting with it before the formal testing began. During this trial period, participants were advised on how to use the Finger Counter most effectively (e.g., keeping the hand parallel to the camera, maximizing the spacing between the fingers, and orienting and rotating the hand appropriately with respect to the camera). Once the subjects felt sufficiently comfortable interacting with the program, their performance on the following tests was recorded. The demonstration and explanation took between 2 and 5 minutes. When test subjects were part of a group, the demonstration and explanation were given only once.

What follows is a screenshot of the test in action.  The top left window is for messages to the user.  The window directly below it shows the frame rate and other status information.  The top right window shows raw video.  The bottom right window is the "user feedback window," which shows how many fingers the system recognizes and the location of their fingertips.



The test system was completely automated and did not require action from the administrator beyond initiating the program. Following a request, a user was given up to 5 seconds to make the hand pose or type the key on the keyboard. The system logged the request made and the user's response time in seconds; the log files were also used to determine the number of requests that were not complied with by the deadline (the failures). Misidentified poses or incorrect key presses were ignored by the system. After a recognition was made or the timer ran out, the system waited a random time interval between 4 and 10 seconds before making another request. Users were asked to remove their hands from the field of view between tests; often, however, they failed to comply with this request.

Test Administration

Each user was permitted some time to play with the drawing program before the test administration. There was no time limit; users took from 30 seconds to approximately 3 minutes to try out the interface before going on to the test.  In the "voice-interactive test," the computer played a recorded message asking the user either to hold up a certain number of fingers or to type a key from "1" through "5" on the keyboard. The audio message was supplemented by a screen message in large type. A typical message might say "please hold up one finger" or "please type 2 on the keyboard." To choose which message to play, the system generated a random number from 1 through 10.  Numbers from 1 through 5 caused the system to ask for that many fingers. Numbers from 6 through 10 caused the system to request keys "1" through "5" on the keyboard. Random numbers were chosen "with replacement," so a single test subject could be asked for some hand poses or keys several times and never be asked for others. There was no set number of requests; some users got as few as 10 requests, while one got 33 requests. The test system was automated and users were given up to 5 seconds to respond.
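The request selection described above can be sketched in a few lines of Python. This is only an illustration, not the test program itself: the function names are hypothetical, and drawing the pause uniformly between 4 and 10 seconds is an assumption, since the text says only that the interval was random.

```python
import random

RESPONSE_DEADLINE_S = 5.0  # users had up to 5 seconds to respond


def choose_request(rng):
    """Draw 1..10 with replacement; 1-5 ask for fingers, 6-10 ask for a key."""
    n = rng.randint(1, 10)  # randint is inclusive on both ends
    if n <= 5:
        return ("fingers", n)   # "please hold up n fingers"
    return ("key", n - 5)       # "please type n-5 on the keyboard"


def pause_before_next_request(rng):
    """Pause between requests; a uniform draw is an assumption."""
    return rng.uniform(4.0, 10.0)


rng = random.Random(2003)  # fixed seed only to make the sketch repeatable
requests = [choose_request(rng) for _ in range(10)]
```

Because each draw is independent, a short session of 10 requests can easily omit some poses or keys entirely, which matches the behavior noted in the text.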

The Test Subjects

There were a total of 19 test subjects: 6 high school students, 4 college students, 7 graduate students, and 2 professionals. The majority of college students, graduate students, and professionals were acquaintances of the administrators. The high school students were participants in a math and science summer camp. Many participants had strong technical backgrounds, as their computer experience ranged from 6 to 30 years and daily computer use ranged from 1 to 15 hours. Each test subject completed exactly one test.

Test Conditions

Multiple locations were used to conduct testing. The graduate students and professionals were tested under fluorescent lighting, with the camera on a tripod looking up at a ceiling about two meters above. The high school students were tested in the graduate lounge, under fluorescent and incandescent lighting, with the camera mounted on a tripod facing down half a meter from the table. Subjects made hand gestures in the air above the camera when it was facing upwards and on the table surface below the camera when it was facing downwards. The program ran on a Toshiba Satellite Pro 6000 with an Intel Pentium III Mobile 1.2 GHz processor and 512 MB of RAM.  The Pentium III has a bus speed of 133 MHz.
 

EXPERIMENT 2 - Drawing Test

Demonstration and Explanation

The same test subjects participating in Experiment 1 performed this test, with the addition of five more college students. They received the instructions described above. In addition, they were told about the "drawing test," in which they would be asked to trace a circular template on the screen, first using the Finger Counter interface and then using a computer pointing device.

Test Administration

Each Experiment-2-test subject took the drawing test immediately after taking the voice-interactive test.  Thus, each user already had played with the drawing program and taken the voice-interactive test before taking the drawing test.

The drawing program permits a user to "draw" on the computer screen by moving his or her fingers. The program works as follows:  The screen is initially black.  Once the system determines how many fingers are held up, it tracks the fingertip positions. If the user is holding up one finger, the system draws a small, randomly colored box (essentially a large pixel) on the screen corresponding to the position of the fingertip in the field of view.  As the user moves his or her finger, the boxes may appear as a line or other object.  If two fingers are held up, the system draws a box centered on the midpoint between the two fingers; in this case, the size of the box varies with the distance between the fingertips. If the user holds up three or four fingers, the system draws three or four small boxes corresponding to the fingertip positions. Finally, if five fingers are held up, the system erases the screen. New boxes are drawn as each frame from the camera is processed. Also at each frame, the system fades each colored pixel a little toward black. This keeps the screen from becoming overly cluttered with colors.
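The per-frame rules above can be summarized in a short Python sketch. It is an illustration under stated assumptions rather than the program's actual code: the function names, the small box size, the fade step, and the choice to make the two-finger box size equal to the fingertip distance are all hypothetical.

```python
import math

SMALL_BOX = 4   # assumed side length of a "small" box, in pixels
FADE_STEP = 8   # assumed amount each color channel moves toward black per frame


def boxes_for_fingertips(tips):
    """Return (center_x, center_y, size) boxes to draw, or None to erase the screen."""
    count = len(tips)
    if count in (1, 3, 4):
        # One small box per fingertip.
        return [(x, y, SMALL_BOX) for x, y in tips]
    if count == 2:
        (x1, y1), (x2, y2) = tips
        spread = math.hypot(x2 - x1, y2 - y1)
        # One box at the midpoint; its size grows with the fingertip distance.
        return [((x1 + x2) / 2, (y1 + y2) / 2, spread)]
    if count == 5:
        return None  # five fingers erase the screen
    return []        # no fingers recognized: draw nothing


def fade(canvas):
    """Move every RGB pixel a little toward black, clamping at zero."""
    return [[tuple(max(0, c - FADE_STEP) for c in pixel) for pixel in row]
            for row in canvas]
```

Calling `fade` once per camera frame makes older boxes dim gradually, so a moving fingertip leaves a trail that disappears instead of accumulating.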

For the "drawing test," instead of being completely black, the initial screen contained the outline of a white circle. Each test subject was asked to draw a circle with his or her index finger to match the circular template.  A test subject first moved his or her fingertip until the resulting colored boxes lined up with some point on the circle outline. Then, one of the test administrators pressed a key on the keyboard to initiate the test. The screen cleared, except for the circular template, and the test subject attempted to trace the circle. When the test subject made it all the way around, the test administrator pressed another key, halting the test. At that point, the system reported the average difference between each pixel in the user's drawing and the nearest point on the circular template.
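For a circular template, the nearest point on the outline lies along the line from the circle's center through the drawn pixel, so the distance from a pixel to the template is the absolute difference between its distance from the center and the radius. The following Python sketch computes the reported average on that basis; the function name and argument layout are hypothetical, and the original program may have computed the nearest point differently.

```python
import math


def mean_distance_to_circle(points, center, radius):
    """Average distance from each drawn pixel to the nearest point on the circle."""
    if not points:
        raise ValueError("no drawn pixels to score")
    cx, cy = center
    total = 0.0
    for x, y in points:
        # Distance to the outline = |distance to center - radius|.
        total += abs(math.hypot(x - cx, y - cy) - radius)
    return total / len(points)


# Three drawn pixels scored against a radius-10 circle centered at the origin:
# one on the outline, one 2 pixels outside, one 2 pixels inside.
score = mean_distance_to_circle([(10, 0), (0, 12), (-8, 0)], (0, 0), 10)
```

A lower score means the drawing stayed closer to the template; pixels inside and outside the outline are penalized equally.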

The following is a screen shot of the drawing test.  The left window shows the user's drawing and the circular template.  The right window shows the "user feedback" window.



After the "drawing test" with the Finger Counter, another drawing test was conducted using a computer pointing device. Available devices included a laptop touchpad, a laptop stick pointer, and a trackball pointing device. Each user was asked to choose the pointing device that was most unfamiliar to him or her. As with the Finger Counter drawing test, the screen displayed a circular template to which a test subject maneuvered the screen pointer (a commonly used arrow). Once the arrow was pointing at part of the template, one of the test administrators began the test, at which point a series of small colored boxes trailed the screen pointer as the user traced the circular template.  When the test subject had circumnavigated the template, the administrator stopped the test. The system then computed the average distance using the same method as described above.

The Test Subjects

The test subjects for Experiment 2 included those described above in Experiment 1, as well as an additional five college students. As with Experiment 1, each test subject completed exactly one test.  Three college students opted to use the AccuPoint input device as their unfamiliar pointing device, one subject did not take this test, and the remaining subjects elected to use the trackball.

Test Conditions

The test conditions were the same for the previously listed subjects. The three additional college students were tested in their apartment under incandescent lighting, with the camera mounted on a mini-tripod facing down, approximately half a meter above the table. One of the test subjects completed the test under natural lighting conditions, aided by a portable lamp, with the camera mounted on a tripod facing upwards.
 

EXPERIMENT 3 - Questionnaire

The test subjects who participated in Experiments 1 and 2 were asked to complete the following questionnaire.  Test subjects took from 1 to 5 minutes to do so. 

Name ___________________________________________________

Telephone Number or Email
(in case we have follow-up questions) __________________________

Occupation _______________________________________________
 

How many years of computer experience do you have? ______ years
How many hours per day do you use a computer (on average)? ______ hours
How easy did you find a computer mouse to use (1 = very hard 10 = super easy)? 1  2  3  4  5  6  7  8  9  10
How natural to use is a mouse (1 = completely unnatural 10 = completely intuitive)? 1  2  3  4  5  6  7  8  9  10
We asked you in the test to use a pointing device you were unfamiliar with.  How easy did you find the other pointing device to use (1 = very hard 10 = super easy)? 1  2  3  4  5  6  7  8  9  10
How natural to use is the other pointing device (1 = completely unnatural 10 = completely intuitive)? 1  2  3  4  5  6  7  8  9  10
How easy did you find the Finger Counter to use (1 = very hard 10 = super easy)? 1  2  3  4  5  6  7  8  9  10
How natural to use is the Finger Counter (1 = completely unnatural 10 = completely intuitive)? 1  2  3  4  5  6  7  8  9  10

We would appreciate any comments you might have on the Finger Counter.  In particular, what types of applications do you think would be useful with the Finger Counter?  Do you have any comments, criticisms, or suggestions regarding the Finger Counter or this testing procedure?  Thanks for your time.

 

EXPERIMENT 4 - Limits of Pose Recognition 

For this test, we placed a webcam in settings similar to the previous experiments, i.e., in the graduate computer lab, mounted on a tripod which was placed on the floor with the camera facing upwards. We ran the interface and had it capture frames as well as report poses recognized.  Hands were moved in the following ways:

  1. Toward the camera
  2. Away from the camera
  3. Rotation in the direction of each Euler angle on axes going through the center of the palm
    1. "Pitch":  The hand is rotated so the fingertips are closer to the camera than the palm and vice versa.
    2. "Roll":  The hand is rotated so that the side of the hand including the base of the little finger is closer than the side including the base of the thumb and vice versa.
    3. "Yaw":  The hand is rotated in the plane parallel to the image plane, so that, from the camera's perspective, the fingertips move from side to side while the palm remains more or less fixed.

A digital video camera was set up on a tripod next to the webcam and was used to capture still images of hand positions for "out of plane" rotations, i.e., pitch and roll. A hand was placed over the camera so that the hand was oriented parallel to the camera lens, perpendicular to the bottom of the image frame, and the system properly recognized the hand pose. All tests (distance, pitch, roll, and yaw) were performed on the 3 test administrators using all 5 hand positions.

To determine the limits on the range of the distance of the hand position, the hand was moved toward the camera until recognition failed. An image was taken, using the Finger Counter, of the hand position just before failure of the system, i.e. the closest possible "good" position, and the hand position just after failure of the system, i.e. the farthest "bad" position. The distance at these positions was also measured using a tape measure. The hand was then moved away from the camera until recognition failed, and two images, one of the farthest "good" hand position, one of the closest "bad" hand position, were recorded, as well as the distances.

To determine the range of the yaw of hand positions, the hand was rotated left and right, while remaining parallel to the camera lens, until recognition failed. An image was taken of both of these extremes at the point just before recognition failure and the point just after. The images of the last "good" positions were then analyzed using a software angle measuring tool. Angles were measured from the vertical axis of the image, which was the same as the default orientation of the hand, to a line segment drawn from the center of the palm to the tip of the index finger for hand positions using 1 finger, and to the tip of the middle finger for hand positions using 2, 3, 4, or 5 fingers. The yaw angle in the right direction was regarded as positive, and the left as negative.
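The yaw measurement described above was made with a software angle measuring tool; the same quantity can be computed from two image points, as in the following Python sketch. The function name is hypothetical, and the sketch assumes standard image coordinates (x increasing to the right, y increasing downward) with "right" meaning right as seen in the image.

```python
import math


def yaw_degrees(palm_center, fingertip):
    """Signed angle from the image's vertical axis to the palm-to-fingertip line.

    Positive values mean the fingertip leans to the right, negative to the left.
    """
    dx = fingertip[0] - palm_center[0]
    dy = palm_center[1] - fingertip[1]  # flip sign: image y grows downward
    # atan2(dx, dy) measures the angle from "straight up" rather than from the x-axis.
    return math.degrees(math.atan2(dx, dy))


upright = yaw_degrees((100, 100), (100, 40))      # fingertip directly above the palm
leaning_right = yaw_degrees((100, 100), (150, 50))
leaning_left = yaw_degrees((100, 100), (50, 50))
```

For hand positions using one finger, the fingertip passed in would be the index fingertip; for two or more fingers, it would be the middle fingertip, following the convention in the text.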


Yaw Example


To determine the range of the pitch of positions, the hand was tilted so that the fingertips were either closer to or farther away from the camera lens than the palm, while the hand orientation was kept perpendicular to the bottom of the image, until recognition failed. The digital camera was set up so that the camera lens was parallel to the side of the hand and the angle of pitch could be clearly determined. Images of the "good" and "bad" positions were captured using both the Finger Counter and the digital camera. The images from the digital camera of the last "good" positions were then analyzed using a software angle measuring tool. Angles were measured from the horizontal axis, which was along the default hand position, to a line segment which ran from the point where the thumb joins the hand to the fingertips. If the fingertips were not even, the average fingertip position was used. Downwards pitch was regarded as negative, upwards pitch as positive.

Pitch Example

To determine the range of the roll of hand positions, the hand was rotated left and right in a circular fashion (so that the palm of the hand was turned up) while keeping the hand parallel to the camera lens and perpendicular to the bottom of the camera image. The digital camera was set up so that the lens was viewing the fingertips and the angle of roll could be clearly determined. Images of the "good" and "bad" positions were captured using both the Finger Counter and the digital camera. The images from the digital camera of the last "good" positions were then analyzed using a software angle measuring tool. Angles were measured from the horizontal axis, which was along the default "level" hand position, to a line segment which ran through the plane of the hand from the little finger to the thumb if 5 fingers were in use, or from the little finger to the index finger if fewer than 5 fingers were held up. If the hand position used fewer than 4 fingers, the knuckles or the middle of the fingers, whichever best represented the plane of the hand, were used as reference points. Left roll was regarded as negative, while roll to the right was regarded as positive.

      
Roll Example

II. Test Results

EXPERIMENT 1

The following table shows the results of the voice-interactive test described in Experiment 1:


                  Mean     95% Confidence Interval Range
Finger Counter    3.20s    2.98s to 3.42s
Keyboard          2.21s    1.99s to 2.49s

Prior to conducting Experiment 1, Stephen Crampton administered the same test to a different testing pool.  The only difference between the testing protocols was the testing conditions, specifically the make of the machine running the program.  In the earlier testing, the program ran on a Dell Inspiron 8200 with an Intel Pentium IV 1.4 GHz processor and 256 MB of RAM.  The Pentium IV has a bus speed of 400 MHz.  The difference in results (see Figure 1) may be attributable to the bus speed of the testing machines.  The bus speed would affect how fast large chunks of data, for instance, images, could be sent to and from the CPU.  The Pentium IV bus, used for the earlier experiment, is about three times as fast as the Pentium III bus, used for Experiment 1.  The frame rate of the Finger Counter interface was approximately 10 frames per second for the earlier experiment and approximately 6 frames per second for Experiment 1.  This difference would very likely affect the mean response time.  One would expect the width of the confidence interval to be similar, and, in fact, it was:  0.42s for the earlier experiment and 0.44s for Experiment 1.
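The arithmetic behind these comparisons can be checked with a few lines of Python, using only the figures reported in this document. The helper name is hypothetical, and rounding to two decimals simply matches the precision of the reported values.

```python
def interval_width(lower, upper):
    """Width of a confidence interval, rounded to the reported precision."""
    return round(upper - lower, 2)


# Experiment 1 confidence interval bounds from the results table.
finger_counter_width = interval_width(2.98, 3.42)
keyboard_width = interval_width(1.99, 2.49)

# Bus speeds in MHz: Pentium IV (earlier experiment) vs. Pentium III (Experiment 1).
bus_ratio = 400 / 133
```

The Finger Counter width of 0.44s is the figure compared against 0.42s from the earlier experiment, and the bus ratio comes out just above 3.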

The following chart shows the average response time by finger for the Finger Counter interface in the prior experiment and Experiment 1.  In both experiments, one finger had the shortest response time while five fingers had the longest.

Figure 1. Response Times for the Finger Counter

EXPERIMENT 2
For the drawing test, the average distance between the drawing and the circular template was 12.6 pixels for the Finger Counter and 6.3 pixels for the other pointing device.  The respective standard deviations were 9.8 pixels and 3.8 pixels.  Figure 3 shows a box plot of the results.  The polygons show the middle quartiles surrounding the median, signifying a 95% confidence interval; the dotted lines connect the nearest samples within 1.5 times the inter-quartile ranges; and the circles and plus signs show outliers.




EXPERIMENT 3
From the questionnaire responses, the years of computer experience ranged from 6 to 30.  Hours per day ranged from 1 to 15, with a spike (6 respondents) at 8 hours per day.  The following table shows the mean and standard deviation of the 1-to-10 rankings for "ease of use" and "naturalness."  The mouse ranked very high for both ease of use and "naturalness"; the Finger Counter was the next "easiest to use" and second most "natural."


                         Ease of Use       "Naturalness"
                         Mean    SD        Mean    SD
Mouse                    9.0     1.3       9.2     0.9
Finger Counter           7.0     2.2       7.0     2.5
Other Pointing Device    6.5     2.1       6.1     2.3
 

EXPERIMENT 4
The following graphs display the most conservative range of hand positions in which the Finger Counter functioned in our limitations test.  A range was found individually for each finger count; each red bar displays that range as the smallest maximum value and the largest minimum value measured.  These ranges then determined the overall range in which the Finger Counter functioned for all hand postures, which is shown by the shaded gray area.
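Taking the largest minimum and the smallest maximum amounts to intersecting the individual ranges. The following Python sketch shows that computation; the function name is hypothetical and the sample yaw values are made up purely for illustration, not taken from the measurements.

```python
def conservative_range(ranges):
    """Intersect per-pose (minimum, maximum) ranges into one overall range.

    Returns (largest minimum, smallest maximum), or None if the ranges do not overlap.
    """
    lowest_allowed = max(low for low, _ in ranges.values())
    highest_allowed = min(high for _, high in ranges.values())
    if lowest_allowed > highest_allowed:
        return None
    return (lowest_allowed, highest_allowed)


# Hypothetical yaw limits in degrees, keyed by the number of fingers held up.
example_yaw_limits = {
    1: (-40, 35),
    2: (-30, 50),
    5: (-25, 20),
}
overall = conservative_range(example_yaw_limits)
```

The same function applies equally to distance, pitch, or roll limits, and to combining several subjects' measurements for a single finger count.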