Amazon.com vs. INNOPAC:
Which Interface is Easier to Search?
Elisabeth Riba
Simmons College
Graduate School of Library and Information Science
LIS 403
Professor James Baughman
Fall 2000
Abstract:
This study
will compare the usability of simple searches between the web interfaces for
Amazon.com and the INNOPAC web-based OPAC produced by Innovative Interfaces,
Inc.
Focusing on able-bodied adult
users, the interfaces will be evaluated using two methods.
Novice users will participate in direct
testing of the software.
Expert performance
will be estimated using GOMS models to predict user behavior.
Introduction:
Amazon.com
was founded in 1995 as an online bookseller.
Over the last five years it has grown enormously.
The company now claims to have over 20
million customer accounts in over 160 countries
(http://www.amazon.com/exec/obidos/subst/misc/company-info.html),
and offers "an aggregate of over 13 million titles in books, music and
DVD/video"
(http://yahoo.marketguide.com/mgi/busidesc.asp?nss=yahoo&rt=busidesc&rn=A13EF).
Amazon claims to be "the leading
online shopping site" and has become a business to be studied and emulated.
Most users
access Amazon.com through a web browser, where they search or browse for books,
music or videos in order to obtain more information and evaluate them for
possible purchase.
Library patrons use OPACs (online public access catalogs), which are
also adopting web interfaces, for much the same reason.
Although
Amazon.com and OPACs serve the same basic function, there are some very
fundamental differences in their business and technological models, which may
result in different user experiences.
Amazon.com
makes much of its money through direct sales.
Its users are its customers, which provides Amazon with direct monetary
feedback on the success or failure of its interface.
Because Amazon.com has centralized control
over its web site, it can track user interactions and change its interface more
dynamically.
(For an amusing example of this, see
http://www.amazon.com/exec/obidos/subst/home/all-stores-ballot.html)
Although
OPACs are primarily used by library patrons, it is the library staff who make
purchasing decisions.
Library staffs
have different computing needs from patrons, and may have other priorities in
selecting an OPAC beyond ease-of-use, such as support for existing legacy
systems.
Because OPACs are sold to
client libraries, who then customize and maintain them locally, it is more
difficult for the manufacturers to detect problems in use or to roll out user
interface improvements in a timely manner.
One
potential advantage OPAC interfaces may have over Amazon.com would be
accessibility.
The American Library
Association's Code of Ethics has long included a call for "equitable access"
and libraries are public accommodations required to obey the Americans with
Disabilities Act.
Thus, OPACs must
be usable by a diverse population, including the elderly and disabled.
It is within Amazon.com's economic interests
to reach those groups as well, but the company has no legal requirement to do
so.
Matson & Sullivan (2000) found
"a weak suggestion that there may be a fundamental relationship between content
accessibility and overall usability" but theirs was a preliminary study with
further research still needed.
Statement of the Problem:
The purpose
of this study is to determine whether behind-the-scenes differences in business
and technical models are reflected in the user experience.
Is Amazon.com easier to use than a web-based
OPAC?
Because ease-of-use covers such a
large area, this study will focus on searching behavior, which is one of the
primary uses of both interfaces.
Innovative
Interfaces' INNOPAC was chosen as the OPAC to compare with Amazon.com.
The selection criteria were market share and
currency.
Testing with an OPAC that was
popular but outdated would not be a fair comparison, and testing with the most
recent system would be meaningless if few libraries use it.
According to
the listing of web-based OPACs on Webcats (http://www.lights.com/webcats), more
libraries, including Simmons, are using INNOPAC than any other web-based
OPAC.
INNOPAC shows nearly twice the
number of libraries as the runner-up.
Also, the Innovative Interfaces web site (http://www.iii.com)
focuses on new products and updates for web OPAC use.
Review of Related Research:
Because
Amazon.com has been such a successful business, there have been countless
studies of its interface, geared towards interface designers and other
companies looking to follow the Amazon model.
There are
fewer direct reviews of the INNOPAC interface.
Some librarians have written about their own experiences adopting or
evaluating the program, but there are no generalized reviews of the system.
A few people
in the library science field have noted Amazon.com's success and written
articles on lessons libraries can learn from Amazon.
These include, but are not limited to, the
interface design.
Jacsó (1998)
suggested that Amazon should sell "customized versions of the Amazon software"
to libraries with conversion packages.
Amazon could offer a "very competitive price" in exchange for a
hotlinked Amazon logo that will repeat a patron's library search against the
full sale site.
Coffman
(1999) recommended enhancing catalog records with more information, such as
"cover art, jacket blurbs, selections from the text, links to reviews, customer
comments, author interviews and articles, and any other content that would help
a person decide whether to request a particular book."
He also thought librarians "would want to
arrange the records into all kinds of different browsing categories" beyond the
current subject index.
But Coffman was
far more interested in expanding interlibrary loans, offering self-service
checkout, revamping collection development, and other ways libraries might
learn from and emulate Amazon's success.
It was an ambitious idea, and responses to this article can be found in
many library journals.
However, as
interesting as these ideas are, neither article touches on the actual search
process, which is what this study will examine.
Research Design:
Hypothesis:
The primary
hypothesis for this study is that Amazon.com is easier to use than an
OPAC.
Several sub-hypotheses will test
this conjecture.
- Novice
users can complete basic searches faster using Amazon.com than INNOPAC.
- Expert
users can complete basic searches faster using Amazon.com than INNOPAC.
- Novice
users will receive correct search results more often using Amazon.com than
INNOPAC.
- Novice
users will be satisfied with the search results more often using Amazon.com
than INNOPAC.
- Novice
users will express a subjective preference for the Amazon.com interface over
the INNOPAC interface.
Methodology:
Testing will
focus on able-bodied adult users and will not deal with the additional problems
faced by populations with special needs, such as children, people with
disabilities, people with limited literacy, non-native speakers, and similar
groups.
This user base will be divided into two
groups -- novice and expert users.
If
similar results can be garnered across the extremes of user experience, this
can hopefully be extended to users in the middle range.
Different methods will be used to test each
group.
Novice users:
Novice users will participate in usability
testing, which is a standard method used in the computer industry to evaluate
user interfaces.
Users will
be given a series of tasks they must perform using the two interfaces.
Observers will take notes on the user's
experience and time how long it takes to complete each task.
Users will be encouraged to "think aloud" as
they proceed.
After each task and at the
conclusion of the test, the user will be asked a series of questions to elicit
their subjective reactions.
Usability
testing is a professional field, and numerous books exist explaining how to
conduct tests properly.
Participants
will be screened via a questionnaire to determine their level of computer and
library experience and ensure demographic diversity.
Selected users will be people who do not use
web browsers at home or as part of their jobs, and who make only
occasional use of libraries (monthly or less).
Chisman, Diller, & Walbridge (1999) provide a sample screening
questionnaire in their article, which will be used as a model.
From this, approximately a dozen subjects
will be selected to take part in the full tests.
Although one
dozen users may seem a small sample from a survey perspective, research has
shown that it should be sufficient for usability testing.
Nielsen & Landauer (1993) studied the
number of problems found in usability tests as a function of the number of
users testing.
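Their model is:

$$\mathrm{Found}(n) = N\left(1 - (1 - L)^{n}\right)$$

Found(n) = number of distinct usability problems found after testing n users
N = total number of usability problems
L = proportion of usability problems discovered while testing a single user
n = number of users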
Testing with
a large number of users yields diminishing returns: even if every user finds the
same number of problems, later sessions increasingly rediscover problems that
earlier users already found.
Nielsen (2000) noted that if the average
tester finds 31% of the usability problems, then testing with only five users
will uncover 85% of the problems and ten users will find 97.5% of
problems.
Likewise, Chisman, Diller,
& Walbridge wrote that "both literature and [OCLC Usability Lab Director
Mike] Prasse indicated that eight participants would identify 80 percent of the
problems users might have with the system."
Although these papers examine usability testing to find problems in a
piece of software, rather than to compare the usability of two different
programs, they do suggest that this study does not need as large a sample
population as suggested by Powell (1997).
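
The arithmetic behind these sample-size figures can be checked with a short sketch. It assumes the Nielsen & Landauer model in its usual form, where the expected share of problems found by n users is 1 - (1 - L)^n, and it takes L = 0.31 from the Nielsen figure quoted above. The function name and the list of group sizes are illustrative choices, not part of the cited papers.

```python
def proportion_found(L: float, n: int) -> float:
    """Expected share of usability problems found by n test users.

    Uses the Nielsen & Landauer form 1 - (1 - L)^n, where L is the
    proportion of problems a single user uncovers.
    """
    if not 0.0 <= L <= 1.0:
        raise ValueError("L must be a proportion between 0 and 1")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - L) ** n


# Assumption: 31% per-user detection rate, as quoted from Nielsen.
L_SINGLE_USER = 0.31

# Illustrative group sizes, including the dozen users proposed here.
for n in (1, 5, 8, 10, 12):
    share = proportion_found(L_SINGLE_USER, n)
    print(f"{n:2d} users: {share:.1%} of problems expected to be found")
```

Under these assumptions the curve flattens quickly, which is the reason a dozen participants is treated as adequate for this study.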
Selected
users will be scheduled for private, one hour tests in the usability lab (see
"Institutional Resources" below).
A
"facilitator" will greet all users with a standard opening statement explaining
the users' role, and emphasizing that these tests are to evaluate the system,
and not them.
Users will be presented
with a Tester's Bill of Rights based on Arlov (1997) and asked to sign a
consent form to allow the test to be recorded.
The facilitator will then instruct the users in how to think out loud,
and will provide a brief introduction to the computer system and setup.
Before each
test, the lab computer will already have a web browser running maximized and
open to the main screen of the first interface to be tested.
To minimize the risk of order effects, half
the users will begin testing with the Amazon.com interface, and the other half
will see the INNOPAC interface first.
Every user will be given a folder with each task written on a
separate piece of paper.
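
One way to split the participants between starting interfaces is sketched below. The proposal says only that half the users begin with each interface; the random shuffle, the fixed seed, and the participant IDs are assumptions made for illustration.

```python
import random


def assign_starting_interface(participant_ids, seed=0):
    """Randomly split participants so that half start on each interface.

    The seed is fixed only so the assignment can be reproduced; with an
    odd number of participants the extra person starts on INNOPAC.
    """
    ids = list(participant_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    half = len(ids) // 2
    assignment = {pid: "Amazon.com" for pid in ids[:half]}
    assignment.update({pid: "INNOPAC" for pid in ids[half:]})
    return assignment


# Hypothetical IDs for a dozen participants.
participants = [f"P{i:02d}" for i in range(1, 13)]

for pid, first in sorted(assign_starting_interface(participants).items()):
    print(pid, "starts with", first)
```

Assigning the starting interface at random, rather than by sign-up order, keeps scheduling patterns from lining up with one interface.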
Upon
instruction from the facilitator, users will take out the first task, read it
aloud, and then do whatever actions they think are necessary to complete the
task, stopping when they believe they have succeeded.
Reading the task aloud ensures that users
understand what they are supposed to do and helps observers to time the
task.
A sample
task might say "Find information about the book titled
X."
[Specific authors, titles and subjects will
be provided in the actual tests, based upon their availability in the selected
library OPAC and Amazon databases.]
The
facilitator will allow the user to work uninterrupted.
If the user asks for help, the facilitator
will only provide neutral responses.
Observers will take notes on the user's progress, indicating areas of
the process where the user has problems or areas which work particularly
well.
Users will be instructed to
indicate when they think they have successfully completed the tasks.
The facilitator
will ask the user questions to measure their satisfaction with the process and
results.
Bopp & Smith (1995) note
that user satisfaction does not always depend upon the accuracy of the results,
so both must be taken into account.
The
observers will decide whether the received result was correct; however, this
will not be conveyed to the users.
[Some
possible outcomes include users mistakenly giving up too soon, finding the
wrong record, not recognizing the correct record as the answer and continuing
to search, and so on.]
Then, users
will go on to the subsequent tasks in turn.
These will include searches by author, subject, author and title, and
sorting by date.
Each task will be
conducted in the same manner.
After five
tasks have been completed, the users will switch to the other interface (to
Amazon.com if they were previously using INNOPAC, to INNOPAC if they were using
Amazon).
They will then conduct another
five tasks using this interface.
These
tasks will be identical in function to the previous tasks, but with different
authors, titles and subjects.
Users will
then return to the first interface for more tasks, reinforcing the existing
skills and adding a few more complex elements, such as date limitations.
Overall, users will be presented with a total
of thirty tasks, split equally between the two interfaces.
After
completing all the tasks, the facilitator will ask the user to comment on both
interfaces in general, as opposed to looking at each specific task.
Users will be asked whether they prefer one
interface over the other, and if a preference exists, which one and why.
Users will also be given the opportunity to
ask questions of their own.
Expert users:
Advanced users are more likely to be familiar
with one or the other interface, but may not have equal experience with
both.
Without an unbiased population,
user testing may not provide as reliable results.
For this group, therefore, quantitative modeling
will be used in place of qualitative testing.
As users
become more familiar with an interface, it can be assumed that they will
discover faster ways to use that interface.
Calculating the fastest and most efficient searches can simulate what a
highly experienced user will encounter.
Card, Moran
and Newell (1983) developed GOMS models (Goals, Operators, Methods for
achieving the goals, Selection rules for choosing methods) to describe user
behavior.
This method has also proven
reliable as a tool to predict the amount of time required for users to execute
those tasks.
Given a task
and method of performing that task, the Keystroke-Level Model can be used to
predict "the time an expert user will take to execute the task using the
system, providing he uses the method without error."
In brief, the time it takes to perform a task
is the sum of the times to perform the gestures that comprise that task.
By means of laboratory experiments, Card et
al. determined a set of timings and heuristic rules for different operations:
pressing a key (K), pointing with a mouse
(P), homing hand(s) on keyboard or mouse (H), mentally preparing for the next
step (M), and waiting for system response (R).
Keystroke-level
modeling will be applied to every task in the usability tests to determine how
to conduct the fastest search in each interface.
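
The summation described above can be sketched as follows. The operator times are the commonly quoted Keystroke-Level Model averages and are assumed here, not taken from measurements in this study. The system response time R depends on the system, so the caller supplies it. The operator sequence for the sample search is hypothetical and does not describe either real interface.

```python
# Commonly quoted Keystroke-Level Model operator times, in seconds.
# Assumed textbook averages, used here for illustration only.
OPERATOR_SECONDS = {
    "K": 0.2,   # press a key or mouse button (average skilled typist)
    "P": 1.1,   # point with the mouse
    "H": 0.4,   # home hand(s) on keyboard or mouse
    "M": 1.35,  # mentally prepare for the next step
}


def klm_estimate(operators: str, response_seconds: float = 0.0) -> float:
    """Sum operator times for a sequence such as 'M H P K R'.

    Whitespace is ignored. Each 'R' adds response_seconds, because
    system response time must be measured for the system in question.
    """
    total = 0.0
    for op in operators:
        if op.isspace():
            continue
        if op == "R":
            total += response_seconds
        elif op in OPERATOR_SECONDS:
            total += OPERATOR_SECONDS[op]
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return total


# Hypothetical title search: move to the mouse, click the search box,
# return to the keyboard, type a 12-character title, press Enter, wait.
sequence = "M H P K" + " H M " + "K" * 12 + " M K R"
print(f"Estimated time: {klm_estimate(sequence, response_seconds=2.0):.2f} s")
```

Running the same function over the operator sequence for each task in each interface gives the per-task times that the expert-user comparison needs.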
Treatment of the data:
Novice users:
Once all testing is complete, every task will
have been conducted in both interfaces by six users each.
For each task/interface combination,
researchers can calculate the average time needed for completion, the success
ratio, and user satisfaction rating.
These can then be compared with the numbers for that same task conducted
against the other interface.
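
A minimal sketch of this per-task summary follows. The record layout (task, interface, seconds, correct, satisfaction) and the 1-to-5 satisfaction scale are assumptions; the proposal does not specify how observations will be recorded. The sample rows are made-up placeholders to show the shape of the data, not study results.

```python
from collections import defaultdict
from statistics import mean


def summarize(observations):
    """Group observations by (task, interface) and compute the three
    measures named in the text: mean time, success ratio, satisfaction."""
    groups = defaultdict(list)
    for obs in observations:
        groups[(obs["task"], obs["interface"])].append(obs)

    summary = {}
    for key, rows in groups.items():
        summary[key] = {
            "mean_seconds": mean(r["seconds"] for r in rows),
            "success_ratio": sum(1 for r in rows if r["correct"]) / len(rows),
            "mean_satisfaction": mean(r["satisfaction"] for r in rows),
        }
    return summary


# Placeholder rows only; every value here is invented for illustration.
sample = [
    {"task": "author search", "interface": "Amazon.com",
     "seconds": 40.0, "correct": True, "satisfaction": 4},
    {"task": "author search", "interface": "Amazon.com",
     "seconds": 60.0, "correct": False, "satisfaction": 2},
    {"task": "author search", "interface": "INNOPAC",
     "seconds": 50.0, "correct": True, "satisfaction": 3},
]

for (task, interface), stats in sorted(summarize(sample).items()):
    print(task, interface, stats)
```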
Different
tasks on the same interface can be compared across time, to judge the learning
curve.
The earliest tasks are expected
to take longest, and there may be bias towards whichever interface the user
encounters first, but it should be possible to compare how users' times and
success rates improve.
At the end
of each test session, users will be asked to express an overall preference for
one system over the other.
This will be
checked against the users' starting interface to see if there is a bias towards
the interface encountered first.
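
This check can be sketched as a simple cross-tabulation. It assumes each session is recorded as a pair (starting interface, preferred interface), with None for users who state no preference; that layout is hypothetical.

```python
from collections import Counter


def preference_by_start(sessions):
    """Count sessions for each (started_with, preferred) combination."""
    return Counter(sessions)


def share_preferring_first(sessions):
    """Share of users with a stated preference who chose the interface
    they saw first. Returns None if nobody stated a preference."""
    stated = [(start, pref) for start, pref in sessions if pref is not None]
    if not stated:
        return None
    return sum(1 for start, pref in stated if start == pref) / len(stated)


# Placeholder sessions, invented only to show the input format.
example = [
    ("Amazon.com", "Amazon.com"),
    ("INNOPAC", "Amazon.com"),
    ("INNOPAC", "INNOPAC"),
    ("Amazon.com", None),
]

print(preference_by_start(example))
print(share_preferring_first(example))
```

Because half the users start on each interface, a share well above one half would suggest that order of exposure, and not the interface itself, is driving stated preferences. With only a dozen users this can be no more than a rough indication.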
If a clear
difference is shown between the two interfaces, observers can use their notes
and go back to the videotapes to determine specific areas within the interface
that may have helped or hindered users.
This qualitative analysis can then be used as a basis for future
interface improvements.
Expert users:
Once keystroke-level modeling has established
an optimal time for each task against each interface, these numbers can be
compared to see if either interface is consistently faster than the other.
Institutional Resources:
Lotus
Development Corporation, where I work, has a usability lab which would be used
to conduct the tests.
This room features
computers with Internet connections, cameras trained on the user and screen to
videotape the tests, and a separate observer's room behind a one-way mirror.
Limitations of the Study:
This study
compares Amazon.com with only one Web-based OPAC.
The same methodology could be used to
compare other OPACs with Amazon.com or with one another.
Also, other tests could be conducted, further
analyzing search through different means or focusing on other areas of the
interface (browsing rather than searching, for example).
There is
also room to study other populations besides able-bodied adults, such as
children, the elderly, people with disabilities, and other groups.
Works Cited:
Arlov, L. (1997). GUI design for dummies. Foster City, CA : IDG Books.
Bopp, R. & Smith, L. (1995). Reference and information services (2nd ed.). Englewood, CO : Libraries Unlimited.
Card, S., Moran, T., & Newell, A. (1983). The Psychology of human-computer interaction. Hillsdale, NJ : Lawrence Erlbaum Associates.
Chisman, J., Diller, K., & Walbridge, S. (1999, November). Usability testing: a case study. College & research libraries, 60 (6), 552-569.
Coffman, S. (1999, March). Building Earth's LARGEST library: driving into the future. Searcher, 7 (3), 34-47.
Coffman, S. (1999, July/August). The Response to "Building Earth's largest library." Searcher, 7 (7), 28-32.
Hancock-Beaulieu, M., Robertson, S., & Neilson, C. (1990). Evaluation of online catalogues: an assessment of methods. West Yorkshire, UK : British Library Board.
Jacsó, P. (1998, November). If I were Amazon's Jeff Bezos, I would... Information today, 15 (10), 28-29.
Matson, R. & Sullivan, T. (2000, November 16-17). Barriers to use: usability and content accessibility on the web's most popular sites. Proceedings of ACM Conference on Universal Usability. Washington : ACM. 139 - 144.
Nielsen, J., & Landauer, T. (1993, April 24-29). A Mathematical model of the finding of usability problems. Proceedings of ACM INTERCHI '93 conference. Amsterdam : ACM. 206-213.
Nielsen, J. (2000, March 19). Why you only need to test with 5 users. Retrieved from the World Wide Web: http://www.useit.com/alertbox/20000319.html
Powell, R. (1997). Basic research methods for librarians. London : Ablex Publishing.
Raskin, J. (2000). The Humane interface. Reading, MA : ACM Press.