Introducing E-OTI's Education Feature
Dear e-OTI Readers:
I am delighted to launch the first in a regular series of articles
on how the Internet continues, on a global scale, to affect the
way we teach and learn. The first education installment deals
with the launch of the first formal, written Guidelines for Computer-Based
Testing. In the past five years, certification and testing for
the information technology industry have grown an estimated 35
percent every six months, and the trend continues in the new millennium.
Many companies and educational institutions are trying to learn
how to incorporate assessment into the new online learning and
computer-based learning paradigms. James Olsen, who prepared the
article, is a pioneer in the field of computer adaptive testing
and a major contributor to the development of the guidelines.
Part of a year-long effort sponsored by the Association of Test
Publishers and leading information technology companies and educators,
the guidelines are designed to supplement NCME and APA Guidelines
and are available through the Association of Test Publishers.
Let us know what you think of this article, and please feel free
to submit pieces that will help our membership become aware of
online innovations and breakthroughs in teaching and learning
worldwide. Please contact me at aranag@earthlink.net; I will be happy to send you author guidelines and to exchange
ideas and links for future articles. Our goal is to have regular
contributing authors who will write pieces that highlight exciting
online educational projects, projects that further lifelong learning
and the empowerment of individuals and communities in every region
of the world. We also welcome you to send us links to sites and
people who are making a difference in education. In June we will
have an article about the African Information Technology Conference.
Thanks, and have a great month!
Sincerely,
Arana Greenberg
e-OTI Education Editor
aranag@earthlink.net
Guidelines for Computer-Based Testing
By James B. Olsen
This article introduces the recent Guidelines for Computer-Based Testing, published by the Association of Test Publishers (ATP). The guidelines
were released at a professional testing conference held on February
17, 2000, at Carmel Valley Ranch in Carmel, California. The conference
was dedicated to the life and memory of Frederic Mather Lord (1912-2000),
who contributed significantly to the theory and applications of
educational measurement, item response theory, and computer-based
and computerized adaptive testing.
Computers are now standard and pervasive tools that significantly
affect our daily lives. In testing and assessment applications,
they have changed the ways in which tests and assessments are
developed and administered. Computer-based tests are defined as
tests or assessments that are administered by computer in either
stand-alone or networked configuration or by other technology
devices linked to the Internet or the World Wide Web. In the face
of the rapid growth of computer-based testing, the ATP sponsored
the development of formal, written guidelines to help ensure high
measurement quality of computer- and Internet-based tests and
to provide direction for the principles and procedures used for
developing and administering those tests. Guidelines for Computer-Based Testing is intended to supplement, extend, and elaborate on the recently
published Standards for Educational and Psychological Testing (Joint Standards) as they apply to computer-based and Internet-based testing and
assessment.
Audiences for the Guidelines for Computer-Based Testing
The guidelines can appropriately be used by a wide variety of
audiences, including:
- Test development organizations: for specifying procedures for designing,
developing, field-testing, and validating computer-based tests
- Test publishers and administrators and test delivery organizations: for
establishing common industry guidelines for communication of test
items, exam scores, and item response information to and from
computer-based testing locations
- Test delivery organizations: for providing information about how
to achieve high-quality delivery of computer-based tests
- Test takers: for providing information about and descriptions of
the types of test items, tests, and test score interpretations
and test orientations they might encounter when they take a computer-based
test
- Research and evaluation specialists: for providing information
on current and expected future uses of computer-based tests
- Teachers at educational institutions that administer or use computer-based
tests: for providing information about interpreting test scores
from these exams and using these scores appropriately, as well
as about helping students prepare appropriately for computer-based
tests
- Advanced technology companies: as aids in determining how to create
products and services that might be beneficial in solving current
or future problems and issues in computer-based testing
- General readers: for providing interesting and useful information
Millennium Conference Held to Release the Guidelines for Computer-Based Testing
The February conference brought together pioneering leaders in computer-based
testing. The conference was called to address emerging applications
and practices that are aligned with the recently completed guidelines.
Dr. Ronald Hambleton, professor of education and psychology and
chairperson of the Research and Evaluation Methods Program at
the University of Massachusetts, opened the conference by commenting
that a strong research base, improved psychometric methods, and
expanded item banks are critical in reaching the potential of
computer-based testing.
Participant presentations included a combination of traditional
and classical test theories and leading-edge technologies. Presenters
discussed best practices associated with computer-based testing
delivery as attendees learned about the potential such delivery
provides for developing and delivering tests with improved validity,
fairness, and reliability due to the increased capability, adaptability,
and realism that computer delivery makes possible. Representatives
from the ATP guidelines committee gave the 150 attendees a historical
overview and current measurement theory context for the guidelines.
Concise working sessions covered the relevance of computer-administered
testing. Representatives from Alpine Media, Educational Testing
Service, the University of Nebraska, HumRRO, Microsoft, Lotus
Development, the Northwest Evaluation Association, Novell, and
Hewlett-Packard presented research in the following areas:
- New strategies for computer-based testing (CBT)
- Educational implications of the ATP guidelines
- Implementing the ATP guidelines
- Applications for high-stakes licensure testing
- Use of innovative item types for testing higher-level cognitive
abilities
- Issues and challenges in test planning and design for CBT
Conference attendees included representatives from industrial/organizational,
clinical, education, certification, and licensure groups. "The
unique size and structure of the conference allowed for great
interaction," said John Oswald, president of the ATP. G. William
Harris, executive director of the ATP, was especially pleased
with the outcome. "We have designed a forum that provides continuous
learning for the testing community at large," he said. "Our plan
is to offer a conference on an annual basis to address the issues
and realities of the computer-based testing arena."
Closing keynote speaker Craig Mills, executive director of examinations
at the American Institute of Certified Public Accountants, summed
it up: "There will be an explosion of new item types and testing
methodologies," he said. "We must be ready with good tools and
approaches to manage this explosion."
Relationships between the Standards for Educational and Psychological Testing and the Guidelines for Computer-Based Testing
Published near the end of 1999, the Standards for Educational and Psychological Testing were adopted by the leadership organizations of the American
Educational Research Association (AERA), the American Psychological
Association (APA), and the National Council on Measurement in
Education (NCME). The document states: "The purpose of publishing
the Standards is to provide criteria for the evaluation of tests, testing practices,
and the effects of test use. Although the evaluation of the appropriateness
of a test or testing application should depend heavily on professional
judgment, the Standards provide a frame of reference to assure that relevant issues are
addressed. It is hoped that all professional test publishers will
adopt the Standards and encourage others to do so." (AERA, APA, NCME, 1999, p. 2)
The committee assisting with development of the guidelines decided
that the guidelines would be written to supplement, extend, and
elaborate on the standards. The committee said that all computer-based
tests should be designed by using the fundamental standards identified
in the six technical areas of test construction, evaluation,
and documentation: 1) validity; 2) reliability and errors of measurement; 3)
test development and revision; 4) scales, norms, and score comparability;
5) test administration, scoring, and reporting; and 6) supporting
documentation for tests. The committee also recommended that all
computer-based tests be used in accordance with fundamental measurement
standards for test fairness, including fairness in testing and
test use, in the rights and responsibilities of test takers, in
testing individuals with diverse linguistic backgrounds, and in
testing individuals with disabilities.
The Standards make only three specific references to computer-based testing:
Standard 3.12: "The rationale and supporting evidence for computerized
adaptive tests should be documented. The documentation should
include procedures used in selecting subsets of items for administration,
in determining the starting point and termination conditions for
the test, in scoring the test and for controlling item exposure."
(AERA, APA, NCME, 1999, p. 45)
Standard 5.5: "Instructions to test takers should clearly indicate
how to make responses. Instructions should also be given in the
use of any equipment likely to be unfamiliar to test takers. Opportunities
to practice responding should be given when equipment is involved,
unless use of the equipment is being assessed." (AERA, APA, NCME,
1999, p. 63)
Standard 6.11: "If a test is designed so that more than one method
can be used for administration or recording responses - such as
marking responses in a test booklet, on a separate answer sheet,
or on a computer keyboard - then the manual should clearly document
the extent to which scores arising from these methods are interchangeable."
(AERA, APA, NCME, 1999, p. 70)
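The four elements that Standard 3.12 asks test developers to document (starting point, item selection, termination conditions, and item exposure control) can be made concrete with a small sketch. The code below is only an illustration and is not drawn from the Standards or the ATP guidelines. It assumes a Rasch (one-parameter) item response model, a "pick at random among the few most informative items" exposure rule, and hypothetical names and default values (`run_cat`, `se_target`, `top_k`, and the simulated item pool) chosen for this example.

```python
import math
import random


def p_correct(theta, b):
    """Rasch (one-parameter logistic) probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def information(theta, b):
    """Fisher information of a Rasch item at ability theta."""
    p = p_correct(theta, b)
    return p * (1.0 - p)


def estimate(responses, theta):
    """Scoring: clamped Newton-Raphson ability estimate and its standard error.

    responses is a list of (item difficulty, score) pairs, score 0 or 1.
    The clamp to [-4, 4] keeps all-correct or all-wrong records finite.
    """
    for _ in range(25):
        grad = sum(u - p_correct(theta, b) for b, u in responses)
        info = sum(information(theta, b) for b, _ in responses)
        theta = max(-4.0, min(4.0, theta + grad / info))
    info = sum(information(theta, b) for b, _ in responses)
    return theta, 1.0 / math.sqrt(info)


def run_cat(pool, answer, start_theta=0.0, se_target=0.5,
            max_items=20, top_k=3, rng=None):
    """Administer an adaptive test; all defaults are illustrative assumptions."""
    rng = rng or random.Random()
    theta, se = start_theta, float("inf")      # starting point
    administered, responses = [], []
    while len(administered) < min(max_items, len(pool)):
        # Item selection: rank unused items by information at current theta.
        remaining = [i for i in range(len(pool)) if i not in administered]
        remaining.sort(key=lambda i: information(theta, pool[i]), reverse=True)
        # Exposure control: choose randomly among the top_k candidates.
        item = rng.choice(remaining[:top_k])
        administered.append(item)
        responses.append((pool[item], answer(item)))
        theta, se = estimate(responses, theta)
        # Termination: stop once the estimate is precise enough.
        if se <= se_target:
            break
    return theta, se, administered


# Usage: a simulated test taker (true ability 1.0) and a 40-item pool.
rng = random.Random(2000)
pool = [-3.0 + 6.0 * i / 39 for i in range(40)]   # item difficulties
true_theta = 1.0


def simulated_answer(item):
    return 1 if rng.random() < p_correct(true_theta, pool[item]) else 0


theta_hat, se, administered = run_cat(pool, simulated_answer, rng=rng)
print(f"estimate={theta_hat:.2f}  se={se:.2f}  items={len(administered)}")
```

Each commented step corresponds to one of the procedures the standard names, which is the kind of rationale a test's documentation would be expected to record alongside the evidence supporting each choice.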
The committee concluded that additional guidelines were needed
to supplement, extend, and elaborate on the Standards when they were applied to computer-based tests. The published
guidelines are separated into two parts. The first part provides
general background and explanations of the guidelines and includes
chapters called "Introduction," "Validity and Test Design," "Test
Development and Analysis," and "Test Administration." The second
part provides the specific computer-based testing guidelines.
Part 2 includes chapters entitled "Planning and Design" (11 guidelines),
"Test Development" (23 guidelines), "Test Administration" (18
guidelines), "Scoring and Score Reporting" (11 guidelines), "Psychometric
Analysis" (13 guidelines), and "Stakeholder Communications" (7
guidelines). To illustrate the breadth of these guidelines, one
guideline from each of the major categories follows.
"Planning and Design," guideline 1.4: "A wide variety of computer-based
[tests] can be designed and developed to meet different purposes.
The test specification for computer-based tests should include:
the test purpose, the content domain definitions, the content
structure for the test items, required response formats for the
test items, sample test items illustrating the response formats,
the number of items to be developed and administered, scoring
and reporting formats and procedures, and test administration
procedures. The test specification should be thoroughly documented."
"Test Development," guideline 2.1: "The test delivery environment
should be evaluated before item authoring begins. It is important
to make sure that items being created can be properly displayed
in the test delivery environment and that test-taker input and
results can be collected, aggregated, and reported. For example,
graphics constraints need to be identified so that item writers
do not create items that have too many colors or require a screen
resolution that is too high for the current test delivery environment."
"Test Administration," guideline 3.1: "The test sponsor should
provide test-takers with clear and concise information regarding
procedures to register for an examination, obtain an authorization-for-testing
document, and scheduling a test appointment."
"Scoring and Score Reporting," guideline 4.7: "The accuracy of
computer scoring algorithms should be established prior to implementation
of the computer-based test."
"Psychometric Analysis," guideline 5.1: "Determine appropriate
reliability indices if different test-takers are given different
items or exercises."
"Stakeholder Communications," guideline 6.2: "When appropriate,
developers of computer-based tests should provide sufficient information
concerning the test purpose, and test content specifications to
test users prior to when the test is available for widespread
administration. This test information should be kept accurate
and as up-to-date as possible."
These six examples show that the guidelines provide
supplemental and elaborative information on the Standards for individuals and organizations seeking to develop computer-based
or Internet-based tests and assessments.
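As one illustration of how guideline 4.7 might be put into practice, a scoring routine can be checked against a small set of hand-scored response records before a test goes live. The sketch below is an assumption-laden example rather than a procedure taken from the guidelines: the number-correct scoring rule, the function names `number_correct` and `verify_scoring`, and the sample answer key are all hypothetical.

```python
def number_correct(key, responses):
    """Score one response record against the answer key, one point per match."""
    if len(key) != len(responses):
        raise ValueError("response record length does not match the key")
    return sum(1 for k, r in zip(key, responses) if r == k)


def verify_scoring(score_fn, key, hand_scored_cases):
    """Return every case where score_fn disagrees with the hand-computed score."""
    mismatches = []
    for responses, expected in hand_scored_cases:
        actual = score_fn(key, responses)
        if actual != expected:
            mismatches.append((responses, expected, actual))
    return mismatches


# Hypothetical four-item key and records scored by hand (None = omitted item).
key = ["B", "D", "A", "C"]
hand_scored_cases = [
    (["B", "D", "A", "C"], 4),
    (["B", "A", "A", None], 2),
    ([None, None, None, None], 0),
]

mismatches = verify_scoring(number_correct, key, hand_scored_cases)
print("scoring verified" if not mismatches else f"mismatches: {mismatches}")
```

An empty mismatch list is evidence only for the cases tried, so the hand-scored set should include edge cases such as omitted items and perfect and zero scores.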
For further information on the guidelines, contact the Association
of Test Publishers, 1201 Pennsylvania Avenue, Suite 300, Washington,
DC 20004. The phone number in the U.S. is 202-857-8444.
Comparisons of the ATP Guidelines with Previous Professional Efforts
Early efforts to determine technical guidelines for assessing
computerized adaptive tests were summarized in a 1984 Journal of Educational Measurement article by Bert Green, Darrell Bock, Lloyd Humphreys, Robert
Linn, and Mark Reckase. The article addressed technical guidelines
related to dimensionality, measurement error, validity, estimation
of item parameters, item pool characteristics, and human factors.
An additional pioneering article in this effort, "Developing Standards
for Computerized Psychological Testing," was written by Paul Hofer
in Computers in Human Behavior in 1985. This article identified the need for standards in the
categories of equivalence, test takers, format and content, and
equipment and procedures. Then, in 1995, Barbara Plake, director
of the Buros Institute of Mental Measurements at the University
of Nebraska-Lincoln, chaired a task force for the American Council
on Education. That task force developed guidelines for computerized
adaptive test development and use in education. Plake's presentation
at the February 17, 2000, ATP conference noted that the ATP guidelines
are organized consistently with the Standards, provide greater specificity, are more state-of-the-art, and cover
more components of the testing and assessment process.
Summary
Following are some of the key indicators of the growth of computer-based
testing. The area of computer-based testing and assessment is
emerging as a significant professional field in educational measurement.
There have been three major reference editions for educational
measurement: in 1951, 1971, and 1989. The proportion of reference
pages devoted to computer-based or machine-based testing in each
of those three editions was 3 percent in 1951, 8 percent in 1971,
and 14 percent in 1989. By the turn of the millennium, the number
of computerized tests that had been administered in professional
testing and assessment centers reached at least 4.5 million. It
is estimated that Internet-based testing programs have administered
at least several hundred thousand additional computer-based tests.
Some school districts have also installed district-wide computerized
testing systems. New and innovative Internet-based testing systems
are being developed and are available for worldwide assessment
applications.
To show how far we have come, here are some final quotes. Bert
Green, in the conclusion of his comments on the significance and
insights of Fred Mather Lord's paper on tailored testing, says:
"The computer has barely started to establish itself in the testing
business. As experience with computer-controlled testing accumulates,
we can expect important changes in the technology of testing.
Most of these changes lie in the future. Lord's results, clear-cut
and devastating as they are, will in the end seem a minor skirmish
in the inevitable computer conquest of testing." (Green, 1970,
p. 194)
Finally, in a 1988 article, Samuel Messick states: "Over the next
decade or two, computer and audiovisual technology will dramatically
change the way individuals learn as well as the way they work.
Technology will also have a profound impact on the ways in which
knowledge, aptitudes, competencies, and personal qualities are
assessed and even conceptualized. There will also come a heightened
emphasis on individuality in assessment with a premium on the
adaptive measurement, perhaps even dynamic measurement, of knowledge
structures, skill competencies, personal strategies and styles
as they develop with instruction and experience. But although
the modes and methods of measurement may change, the basic maxims
of measurement, and especially of validity, will retain their
essential character. The key validity issues are the interpretability,
relevance, and utility of scores, the import or value implications
of scores as a basis for action, and the functional worth of scores
in terms of social consequences of their use." (Messick, 1988,
p. 33)
References
American Council on Education. Guidelines for Computerized Adaptive Test Development and Use
in Education. Washington, D.C.: American Council on Education, 1995.
American Educational Research Association, American Psychological
Association, National Council on Measurement in Education. Standards for Educational and Psychological Testing. Washington, D.C.: American Educational Research Association, 1999.
Association of Test Publishers. Guidelines for Computer-Based Testing. Washington, D.C.: Association of Test Publishers, 2000.
Fitzgerald, Cyndy. "Computer-Based Guidelines for the New Millennium."
Paper presented at the Association of Test Publishers Conference
on Computer-Based Testing: Applications for the New Millennium,
February 17, 2000, Carmel Valley Ranch, Carmel, California.
Green, B.F. Jr. "Comments on Tailored Testing." In Computer-Assisted Instruction, Testing and Guidance, pp. 184-197, Wayne H. Holtzman, ed. New York: Harper and Row,
1970.
Green, B.F. Jr., R.D. Bock, L.G. Humphreys, R.L. Linn, and M.D.
Reckase. "Technical Guidelines for Assessing Computerized Adaptive
Tests." Journal of Educational Measurement 21, 347-360, 1984.
Hambleton, Ronald K. "Computer-Enhanced Assessments: Lots of Promise,
but Many Problems to Be Overcome." Paper presented at the Association
of Test Publishers Conference on Computer-Based Testing: Applications
for the New Millennium, February 17, 2000, Carmel Valley Ranch,
Carmel, California.
Hofer, Paul J. "Developing Standards for Computerized Psychological
Testing." Computers in Human Behavior 1, 301-315, 1985.
Messick, Samuel. "The Once and Future Issues of Validity." In
Test Validity, pp. 33-45, Howard Wainer and Henry Braun, eds. Hillsdale, N.J.:
Lawrence Erlbaum, 1988.
Mills, Craig N. "Unlocking the Promise of Computer Based Testing."
Paper and multimedia presentation at the Association of Test Publishers
Conference on Computer-Based Testing: Applications for the New
Millennium, February 17, 2000, Carmel Valley Ranch, Carmel, California.
Olsen, James B. "ATP Computerized Testing Guidelines: Current
Educational Measurement Theory and New Standards." Paper presented
at the Association of Test Publishers Conference on Computer-Based
Testing: Applications for the New Millennium, February 17, 2000,
Carmel Valley Ranch, Carmel, California.
Plake, Barbara S. "Evolution of Guidelines for Computer-Based
Testing." Paper presented at the Association of Test Publishers
Conference on Computer-Based Testing: Applications for the New
Millennium, February 17, 2000, Carmel Valley Ranch, Carmel, California.
About the Author
Dr. James B. Olsen is Chief Scientist at Alpine Media Corporation
in Orem, Utah.