Usability Study of the Finger Counter, a Human-Computer Interface
Christen Ng and Jessie Burger
CRA DMP Project at Boston University, Summer 2003
Abstract
The Finger Counter is a computer-vision system that identifies the number of fingers held up in front of an inexpensive webcam in real time.
Usability studies were conducted to determine the limitations of the
system. Tests performed included: a voice-interactive test, a drawing
test, and a questionnaire that gauged the ease of use of the system, as well as
a limitations test that established conservative functional ranges for use of
the Finger Counter.
Introduction
Finger Counter is a computer-vision system that counts the number of fingers
held up in front of a video camera in real time. The system is designed as a
simple and universal human-computer interface: potential applications include
educational tools for young children and supplemental input devices,
particularly for persons with disabilities. The interface is language
independent and requires minimal education and computer literacy.
Finger Counter uses background differencing and edge detection to locate the
outline of the hand. The system then processes the polar-coordinate
representation of the pixels on the outline to identify and count fingers:
fingers are recognized as protrusions that meet particular threshold
requirements. The system also logs the frequency of different inputs over a
given time interval. The Finger Counter interface was implemented under Linux
using Video4Linux. The system was tested extensively under various lighting and
background conditions to find the most favorable environment for the Finger
Counter to function. Such varying conditions were also conducive to
establishing limitation criteria for the system.
The following is a compilation of the various tests conducted, their descriptions, and the results obtained:
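As an illustration of the outline-processing step described in the introduction, the following is a minimal sketch, not the Finger Counter's actual code. It assumes the hand outline is already available as a list of (x, y) pixels and that a palm center is known. The function name, the radius threshold, and the minimum run length are hypothetical stand-ins for the "particular threshold requirements" mentioned above.

```python
import math

def count_protrusions(outline, center, radius_threshold, min_run=3):
    """Count protrusions in an outline using a polar representation.

    A protrusion is taken here to be a run of at least min_run consecutive
    outline points (ordered by angle) lying farther than radius_threshold
    from the center. Both thresholds are illustrative assumptions.
    """
    cx, cy = center
    # Polar-coordinate representation: (angle, radius) for every outline pixel.
    polar = sorted(
        (math.atan2(y - cy, x - cx), math.hypot(x - cx, y - cy))
        for x, y in outline
    )
    flags = [r > radius_threshold for _, r in polar]
    if all(flags) or not any(flags):
        return 0
    # Rotate the sequence so it starts on a non-protruding point. This keeps
    # a run that straddles the -pi/+pi seam from being split in two.
    start = flags.index(False)
    flags = flags[start:] + flags[:start]
    count, run = 0, 0
    for is_far in flags:
        if is_far:
            run += 1
        else:
            if run >= min_run:
                count += 1
            run = 0
    if run >= min_run:
        count += 1
    return count
```

A real system would also need to reject protrusions that are too wide or too short to be fingers; this sketch only shows the thresholded run-counting idea.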
I. Tests Performed
EXPERIMENT 1 - Voice Interactive Test
Demonstration and Explanation
Test subjects were given a brief introduction to the Finger Counter:
how it works, its intended purpose, the types of tests they would
complete, and a demonstration of how to use it. They were then given
the opportunity to familiarize themselves with the program by
experimenting with it before the formal testing began. During this trial period, advice was
given to the participants as to how to most effectively use the Finger
Counter (e.g., keeping the hand parallel to the camera, maximizing the
spacing between the fingers, and orienting the hand appropriately relative
to the camera). Once the subjects felt sufficiently comfortable interacting
with the program, their performances on the following tests were
recorded.
The demonstration and explanation took between 2 and 5 minutes. When
test subjects were part of a group, the demonstration and explanation
were given only once.
What follows is a screenshot of the test in action. The top
left window is for messages to the user. The window directly
below it shows the frame rate and other status information. The
top right window shows raw video. The bottom right window is the "user feedback window," which shows how many fingers the system
recognizes and the location of their fingertips.
The test system was completely automated and did not require action
from the administrator beyond initiating the program. Following a
request, a user was given up to 5 seconds to make the hand pose or type
the key on the keyboard; the system logged the request made and the
user's response time in seconds; the log files also recorded the
number of requests that were not complied with by the deadline (the
failures). Misidentified poses or incorrect key presses were ignored by
the system. After a recognition was made or the timer ran out, the
system waited a random time interval between 4 and 10 seconds before
making another request. Users were asked to remove their hands from
the field of view between tests; oftentimes, however, they failed to
comply with this request.
Test Administration
Each user was permitted some time to play with
the drawing program before the test administration. There was no time
limit; users took from 30 seconds to approximately 3 minutes to
try out the interface before going on to the test.
In the "voice-interactive test," the computer plays a recorded message asking
the user to hold up a certain number of fingers or else type a key from "1"
through "5" on the keyboard. The audio message is supplemented by a screen
message in large type. A typical message might say "please hold up one finger"
or "please type 2 on the keyboard." To choose which message to play, the system
generated a random number from 1 through 10. Numbers from 1 through 5 would cause the system to ask for
as many fingers. Numbers from 6 through 10 would cause the system to request
keys "1" through "5" on the keyboard. Random numbers were chosen "with
replacement," so a single test subject could be asked to hold up some hand
poses, or type some keys, a number of times and never be requested to hold up
other hand poses, or type other keys. There was no set number of requests; some
users got as few as 10 requests, while one got 33 requests. The test system was
automated, and users were given up to 5 seconds to respond.
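The request-selection rule described above can be sketched as follows. This is a minimal illustration rather than the test program itself; the function name, the tuple format, and the fixed seed are assumptions made only for the example.

```python
import random

def choose_request(rng):
    """Pick one request: 1-5 ask for that many fingers, 6-10 ask for keys "1"-"5"."""
    n = rng.randint(1, 10)  # inclusive on both ends
    if n <= 5:
        return ("fingers", n)
    return ("key", str(n - 5))

# Each draw is independent ("with replacement"), so some requests may repeat
# while others never appear in a short session.
rng = random.Random(2003)  # fixed seed only so the sketch is repeatable
requests = [choose_request(rng) for _ in range(10)]
```

Because every draw is independent, a session of 10 requests has no guarantee of covering all five hand poses or all five keys, which matches the behavior described above.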
The Test Subjects
There were a total of 19 test subjects: 6 high school students, 4
college students, 7 graduate students, and 2 professionals. The
majority of college students, graduate students, and professionals were
acquaintances of the administrators. The high school students were
participants in a math and science summer camp. Many participants had strong technical
backgrounds, as their computer experience ranged from 6 to 30 years and
daily computer use ranged from 1 to 15 hours. Each test subject
completed exactly one test.
Test Conditions
Multiple locations were used to conduct testing. The
graduate students and professionals were tested under
fluorescent lighting, with the camera on a tripod looking up at a
ceiling about two meters above. The high school students were tested in the
graduate lounge, under fluorescent and incandescent lighting, with the
camera mounted on a tripod facing down half a meter from the table.
Subjects made hand gestures in the air above the camera when it was
facing upwards and on the table surface below the camera when it was
facing downwards. The program ran on a Toshiba Satellite Pro 6000 with
an Intel Pentium III Mobile 1.2 GHz processor and 512 MB of RAM.
The Pentium III has a bus speed of 133 MHz.
EXPERIMENT 2 - Drawing Test
Demonstration and Explanation
The same test subjects participating in Experiment 1 performed this
test, with the addition of five more college students. They received the instructions described above. In
addition, they were told about the "drawing test," in which they would
be asked to trace a circular template on the screen, first using the
Finger Counter interface and then using a computer pointing device.
Test Administration
Each Experiment 2 test subject took the drawing test immediately after
taking the voice-interactive test. Thus, each user already had
played with the drawing program and taken the voice-interactive test
before taking the drawing test.
The drawing program permits a user to "draw" on the computer screen
by moving his or her fingers. The program works as follows: The
screen is initially black. Once the system determines how many
fingers are held up, it tracks the fingertip positions. If the user is
holding up one finger, the system draws a small, randomly colored box
(essentially a large pixel) on the screen corresponding to the position
of the fingertip in the field of view. As the user moves his or
her finger, the boxes may appear as a line or other object. If
two fingers are held up, the system draws a box centered on the
midpoint between the two fingers; in this case, the size of the box
varies with the distance between the fingertips. If the user holds up
three or four fingers, the system draws three or four small boxes
corresponding to the fingertip positions. Finally, if five fingers are
held up, the system erases the screen. New boxes are drawn as each
frame from the camera is processed. Also at each frame, the system
fades each colored pixel a little toward black. This keeps the screen
from becoming overly cluttered with colors.
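The per-frame drawing rules above can be summarized in a short sketch. It is an assumption-laden illustration, not the drawing program's code: the function names, the small box size of 4 pixels, the fade step of 8, and the choice to make the two-finger box exactly as large as the fingertip distance are all hypothetical.

```python
import math

def boxes_for_frame(fingertips, small=4):
    """Return ("erase", []) or ("draw", [(cx, cy, size), ...]) for one frame.

    fingertips is a list of (x, y) fingertip positions found in this frame.
    """
    n = len(fingertips)
    if n == 5:
        return ("erase", [])
    if n == 2:
        (x1, y1), (x2, y2) = fingertips
        # One box at the midpoint; its size grows with the fingertip distance.
        size = max(small, round(math.hypot(x2 - x1, y2 - y1)))
        return ("draw", [((x1 + x2) / 2, (y1 + y2) / 2, size)])
    # One, three, or four fingers: one small box per fingertip.
    return ("draw", [(x, y, small) for x, y in fingertips])

def fade(pixel, step=8):
    """Move one (r, g, b) pixel a little toward black, as done on every frame."""
    return tuple(max(0, c - step) for c in pixel)
```

Applying `fade` to every pixel once per frame makes old boxes dim gradually, which is what keeps the screen from filling up with color.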
For the "drawing test," instead of being completely black, the
initial screen contained the outline of a white circle. Each test
subject was asked to draw a circle with his or her index finger to
match the circular template. A test subject first moved his or
her fingertip until the resulting colored boxes lined up with some
point on the circle outline. Then, one of the test administrators
pressed a key on the keyboard to initiate the test. The screen cleared,
except for the circular template, and the test subject attempted to
trace the circle. When the test subject made it all the way around, the
test administrator pressed another key, halting the test. At that
point, the system reported the average difference between each pixel in
the user's drawing and the nearest point on the circular template.
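The reported score can be sketched as below, assuming the template is an ideal circle with a known center and radius and the drawing is a list of pixel coordinates. For a circle, the distance from a pixel to the nearest point on the outline is the absolute difference between the pixel's distance from the center and the radius. The function name and argument layout are illustrative assumptions.

```python
import math

def mean_distance_to_circle(points, center, radius):
    """Average distance from each drawn pixel to the nearest point on a circle."""
    cx, cy = center
    distances = [abs(math.hypot(x - cx, y - cy) - radius) for x, y in points]
    return sum(distances) / len(distances)
```

For example, with a template of radius 100 centered at the origin, pixels at (110, 0), (0, 95), and (-100, 0) lie 10, 5, and 0 pixels from the outline, for an average of 5 pixels.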
The following is a screenshot of the drawing test. The left
window shows the user's drawing and the circular template. The
right window shows the "user feedback" window.
After the "drawing test" with the Finger Counter, another drawing
test was conducted using a computer pointing device. Available devices
included a laptop touchpad, a laptop stick pointer, and a trackball
pointing device. Each user was asked to choose the pointing device with
which he or she was least familiar. As with the Finger Counter drawing test, the screen
displayed a circular template to which a test subject maneuvered the
screen pointer (a commonly used arrow). Once the arrow was pointing at
part of the template, one of the test administrators began the test, at
which point a series of small colored boxes trailed the screen pointer
as the user traced the circular template. When the test subject
had circumnavigated the template, the administrator stopped the test.
The system then computed the average distance using the same method as
described above.
The Test Subjects
The test subjects for Experiment 2 included those described above in
Experiment 1, as well as an additional five college students. As with
Experiment 1, each test subject completed exactly one test. Three
college students opted to use the AccuPoint input device as their
unfamiliar pointing device, one subject did not take this test, and the
remaining subjects elected to use the trackball.
Test Conditions
The test conditions were the same for the previously listed subjects.
The three additional college students were tested in their apartment under
incandescent lighting, with the camera mounted on a mini-tripod facing
down, approximately half a meter above the table. One of the test subjects
completed the test under natural lighting conditions, aided
by a portable lamp, with the camera mounted on a tripod facing upwards.
EXPERIMENT 3 - Questionnaire
The test subjects who participated in Experiments 1 and 2 were asked
to complete the following questionnaire. Test
subjects took from 1 to 5 minutes to do so.
Name ___________________________________________________
Telephone Number or Email (in case we have follow-up questions) __________________________
Occupation _______________________________________________
How many years of computer experience do you have? ____ years
How many hours per day do you use a computer (on average)? ____ hours
How easy did you find a computer mouse to use (1 = very hard, 10 = super easy)?
1 2 3 4 5 6 7 8 9 10
How natural to use is a mouse (1 = completely unnatural, 10 = completely intuitive)?
1 2 3 4 5 6 7 8 9 10
We asked you in the test to use a pointing device you were unfamiliar with. How easy did you find the other pointing device to use (1 = very hard, 10 = super easy)?
1 2 3 4 5 6 7 8 9 10
How natural to use is the other pointing device (1 = completely unnatural, 10 = completely intuitive)?
1 2 3 4 5 6 7 8 9 10
How easy did you find the Finger Counter to use (1 = very hard, 10 = super easy)?
1 2 3 4 5 6 7 8 9 10
How natural to use is the Finger Counter (1 = completely unnatural, 10 = completely intuitive)?
1 2 3 4 5 6 7 8 9 10
We would appreciate any comments you might have on the Finger Counter. In particular, what types of applications do you think would be useful with the Finger Counter? Do you have any comments, criticisms, or suggestions regarding the Finger Counter or this testing procedure? Thanks for your time.
EXPERIMENT 4 - Limits of Pose Recognition
For this test, we placed a webcam in settings similar to the previous experiments, i.e., in the graduate computer lab, mounted on a tripod which
was placed on the floor with the camera facing upwards. We ran the interface and had it capture frames as well as report poses recognized.
Hands were moved in the following ways:
- Toward the camera
- Away from the camera
- Rotation in the direction of each Euler angle on axes going
through the center of the palm
- "Pitch": The hand is rotated so the fingertips are closer
to the camera than the palm and vice versa.
- "Roll": The hand is rotated so that the side of the hand
including the base of the little finger is closer than the side
including the base of the thumb and vice versa.
- "Yaw": The hand is rotated in the plane parallel to the
image plane, so that, from the camera's perspective, the fingertips
move from side to side while the palm remains more or less fixed.
A digital video camera was set up on a tripod next to the webcam and was used to capture still images of hand positions for "out of plane"
rotations, i.e., pitch and roll. A hand was placed over the camera so that the hand was oriented parallel to the camera lens, perpendicular
to the bottom of the image frame, and the system properly recognized the hand pose. All tests (distance, pitch, roll, and yaw) were performed
on the 3 test administrators using all 5 hand positions.
To determine the limits on the range of the distance of the hand position, the hand was moved toward the camera until recognition failed.
An image was taken, using the Finger Counter, of the hand position just before failure of the system, i.e. the closest possible "good" position,
and the hand position just after failure of the system, i.e. the farthest "bad" position. The distance at these positions was also measured using
a tape measure. The hand was then moved away from the camera until recognition failed, and two images, one of the farthest "good" hand position,
one of the closest "bad" hand position, were recorded, as well as the distances.
To determine the range of the yaw of hand positions, the hand was rotated left and right, while remaining parallel to the camera lens, until
recognition failed. An image was taken of both of these extremes at the point just before recognition failure and the point just after. The
images of the last "good" positions were then analyzed using a software angle measuring tool. Angles were measured from the vertical axis of
the image, which was the same as the default orientation of the hand, to a line
segment drawn from the center of the palm to the tip of
the index finger for hand positions using 1 finger, and to the tip of the middle finger for hand positions using 2, 3, 4, or 5 fingers.
The yaw angle in the right direction was regarded as positive, and the left as
negative.
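The yaw measurement can be sketched as a signed angle computation. This is an illustrative stand-in for the software angle-measuring tool, whose internals are not described here. It assumes image coordinates with x growing to the right and y growing downward, and it assumes "right" means toward the right edge of the image; the function name is hypothetical.

```python
import math

def yaw_degrees(palm, fingertip):
    """Signed angle from the image's vertical axis to the palm-to-fingertip segment.

    Positive values mean the fingertip has rotated toward the right of the
    image, negative values toward the left.
    """
    dx = fingertip[0] - palm[0]
    dy = palm[1] - fingertip[1]  # flip so that "up" in the image is positive
    return math.degrees(math.atan2(dx, dy))
```

With the palm at (100, 200), a fingertip directly above it at (100, 100) gives a yaw of zero, and fingertips displaced to the right or left give positive or negative angles respectively.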
Yaw Example
To determine the range of the pitch of positions, the hand was tilted so that the fingertips were either closer to or farther away from the
camera lens than the palm, while the hand orientation was kept perpendicular to the bottom of the image, until recognition failed. The digital
camera was set up so that the camera lens was parallel to the side of the hand and the angle of pitch could be clearly determined. Images of
the "good" and "bad" positions were captured using both the Finger Counter and the digital camera. The images from the digital camera of the
last "good" positions were then analyzed using a software angle measuring tool. Angles were measured from the horizontal axis, which was along
the default hand position, to a line segment which ran from the point where the thumb joins the hand to the fingertips. If the fingertips were
not even, the average fingertip position was used. Downwards pitch was regarded as negative, upwards pitch as positive.
Pitch Example
To determine the range of the roll of hand positions, the hand was rotated left and right in a circular fashion (so that
the palm of the hand was turned up) while keeping the hand parallel to the camera lens and perpendicular to the bottom of
the camera image. The digital camera was set up so that the lens was viewing the fingertips and the angle of roll could be
clearly determined. Images of the "good" and "bad" positions were captured using both the Finger Counter and the digital camera.
The images from the digital camera of the last "good" positions were then analyzed using
a software angle measuring tool. Angles were
measured from the horizontal axis, which was along the default "level" hand position, to a line segment which ran through the plane of
the hand from the little finger to the thumb if 5 fingers were in use, or the little finger to the index finger if fewer than 5 fingers
were held up. If the hand position used fewer than 4 fingers, the knuckles or the middle of the fingers, whichever best represented the plane
of the hand, were used as reference points. Left roll was regarded as negative, while roll to the right was regarded as positive.
Roll Example
II. Test Results
EXPERIMENT 1
The following table shows the results of the voice-interactive test described in
Experiment 1:
| Interface      | Mean  | 95% Confidence Interval Range |
| Finger Counter | 3.20s | 2.98s to 3.42s                |
| Keyboard       | 2.21s | 1.99s to 2.49s                |
Prior to conducting Experiment 1, Stephen Crampton administered the same test to
a different testing pool. The only difference between the testing protocols
was the testing conditions, specifically the make of the machine running the
program. In the earlier testing, the test program was run on a Dell
Inspiron 8200 with an Intel Pentium IV 1.4 GHz processor and 256 MB of RAM.
The Pentium IV has a bus speed of 400 MHz. The difference in results (see
Figure 1) may be attributable to the bus speeds of the testing
machines. The bus speed would affect how fast large chunks of
data, such as images, could be sent to and from the CPU.
The Pentium IV bus, used for the earlier experiment, is about three times as fast as
the Pentium III bus, used for Experiment 1. The frame rate of the
Finger Counter interface was approximately 10 frames per second for the earlier
experiment and approximately 6 frames per second for Experiment 1. This difference would very likely affect the mean response
time. One would expect the widths of the confidence intervals to be
similar, and, in fact, they were: 0.42s for the earlier experiment and 0.44s
for Experiment 1.
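For readers who want to reproduce interval figures of this kind from raw response times, the following is a minimal sketch. The method used to compute the intervals in this study is not stated, so the normal approximation (mean plus or minus 1.96 standard errors) is an assumption, as is the function name. The last line only re-derives the interval width from the bounds reported in the table.

```python
import math
import statistics

def mean_and_ci95(samples):
    """Return (mean, lower, upper) using a normal-approximation 95% interval."""
    mean = statistics.mean(samples)
    half_width = 1.96 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, mean - half_width, mean + half_width

# Width of the Experiment 1 Finger Counter interval, from the reported bounds.
width = round(3.42 - 2.98, 2)
```

With small samples, a t-distribution multiplier would be more appropriate than 1.96; the constant is kept here only to keep the sketch short.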
The following chart shows the average response time by finger for the
Finger Counter interface in the prior experiment and Experiment 1. In both
experiments, one finger had
the shortest response time while five fingers had the longest.
Figure 1. Response Times for the Finger Counter
EXPERIMENT 2
For the drawing test, the average distance between the drawing and the
circular template was 12.6 pixels for the Finger Counter and 6.3 pixels
for the other pointing device. The respective standard deviations
were 9.8 pixels and 3.8 pixels. Figure 3 shows a box plot of the
results. The polygons show the middle quartiles surrounding the
median, signifying a 95% confidence interval; the dotted lines extend to the most extreme samples within 1.5 times
the inter-quartile range; and the circles and plus signs mark outliers.
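The box-plot quantities mentioned above (quartiles, whiskers limited to 1.5 times the inter-quartile range, and outliers) can be computed as in the sketch below. The plotting tool used for the figure is not named, so the quartile interpolation method chosen here ("inclusive") and the function name are assumptions; other tools may give slightly different quartiles.

```python
import statistics

def box_plot_summary(values):
    """Quartiles, whisker ends, and outliers under the 1.5 x IQR convention."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whiskers": (min(inside), max(inside)),
        "outliers": outliers,
    }
```

The whiskers end at the most extreme samples that still fall inside the fences, and anything beyond the fences is reported separately as an outlier.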
EXPERIMENT 3
From the questionnaire responses, the years of computer experience
ranged from 6 to 30. Hours per day ranged from 1 to 15, with a
spike (6 respondents) at 8 hours per day. The following table
shows the mean
and standard deviation of the 1-to-10 rankings for "ease of use" and
"naturalness." The mouse ranked very high for both ease of use
and "naturalness"; the Finger Counter was the next "easiest to use"
and second most "natural."
| Device                | Ease of Use (Mean) | Ease of Use (SD) | "Naturalness" (Mean) | "Naturalness" (SD) |
| Mouse                 | 9.0                | 1.3              | 9.2                  | 0.9                |
| Finger Counter        | 7.0                | 2.2              | 7.0                  | 2.5                |
| Other Pointing Device | 6.5                | 2.1              | 6.1                  | 2.3                |
EXPERIMENT 4
The following graphs display the most conservative range of hand positions in
which the Finger Counter functioned, according to our limitations test. A
range was found individually for each hand pose; each red bar runs from the
largest minimum value to the smallest maximum value measured for that pose. These ranges then
determined the overall range in which the Finger Counter functioned for all hand
postures, which is shown by the shaded gray area.
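The "most conservative range" is the intersection of the individual ranges: the largest of the minimum values paired with the smallest of the maximum values. A minimal sketch follows; the function name and the dictionary layout are assumptions.

```python
def conservative_range(ranges_by_pose):
    """Intersect per-pose (min, max) ranges; return None if they do not overlap."""
    lows = [low for low, _ in ranges_by_pose.values()]
    highs = [high for _, high in ranges_by_pose.values()]
    low, high = max(lows), min(highs)
    return (low, high) if low <= high else None
```

For instance, with made-up (not measured) yaw ranges of (-40, 35), (-30, 45), and (-25, 30) degrees for three poses, the conservative range would be (-25, 30) degrees, since every pose is recognized only inside that interval.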