Computerized classification test

A computerized classification test (CCT) is a test administered by computer for the purpose of classifying examinees. The most common CCT is a mastery test, which classifies examinees as "Pass" or "Fail," but the term also covers tests that classify examinees into more than two categories. While the term can refer to any computer-administered test used for classification, it usually refers to tests that are interactively administered or of variable length, similar to computerized adaptive testing (CAT). Like CAT, a variable-length CCT can accomplish the goal of the test (accurate classification) with a fraction of the number of items used in a conventional fixed-form test.

A CCT requires several components:

  1. An item bank calibrated with a psychometric model selected by the test designer
  2. A starting point
  3. An item selection algorithm
  4. A termination criterion and scoring procedure

The starting point is not a topic of contention; research on CCT primarily investigates the application of different methods for the other three components. The termination criterion and scoring procedure are separate in CAT but coincide in CCT, because the test is terminated as soon as a classification is made. A CAT design must therefore specify five components, whereas a CCT design specifies the four listed above.

An introduction to CCT is found in Thompson (2007){{cite journal|last=Thompson|first=N. A.|year=2007|title=A Practitioner's Guide for Variable-length Computerized Classification Testing|journal=Practical Assessment Research & Evaluation|volume=12|issue=1|url=http://pareonline.net/getvn.asp?v=12&n=1|archive-url=https://web.archive.org/web/20071021213959/http://pareonline.net/pdf/v12n1.pdf|archive-date=21 October 2007|url-status=dead}} and a book by Parshall, Spray, Kalohn and Davey (2006).{{cite book|last1=Parshall|first1=C. G.|last2=Spray|first2=J. A.|last3=Kalohn|first3=J. C.|last4=Davey|first4=T.|year=2006|title=Practical considerations in computer-based testing|location=New York|publisher=Springer}} A bibliography of published CCT research is found below.

How it works

A CCT is very similar to a CAT. Items are administered to an examinee one at a time. After the examinee responds to an item, the computer scores it and determines whether the examinee can be classified yet. If so, the test is terminated and the examinee is classified. If not, another item is administered. This process repeats until the examinee is classified or another stopping condition is met (all items in the bank have been administered, or a maximum test length is reached).
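The loop described above can be sketched in a few lines of Python. This is only an illustrative outline, not a reference implementation: the function name `run_cct`, its callback parameters, and the toy classification rule are all hypothetical, and returning `None` when the test ends without a decision is an assumption (operational programs typically force a classification at that point).

```python
from typing import Callable, List, Optional, Sequence, Tuple


def run_cct(
    bank: Sequence[str],
    select_item: Callable[[Sequence[str], List[str]], str],
    get_response: Callable[[str], int],
    classify: Callable[[List[str], List[int]], Optional[str]],
    max_items: int,
) -> Tuple[Optional[str], int]:
    """Administer items one at a time until a classification is made.

    Returns (decision, number of items administered). The decision is None
    if the bank is exhausted or max_items is reached first (an assumption).
    """
    administered: List[str] = []
    responses: List[int] = []
    limit = min(max_items, len(bank))
    while len(administered) < limit:
        item = select_item(bank, administered)    # item selection algorithm
        administered.append(item)
        responses.append(get_response(item))      # scored response, 1 or 0
        decision = classify(administered, responses)  # termination criterion
        if decision is not None:
            return decision, len(administered)
    return None, len(administered)


# Toy components, purely for illustration.
bank = [f"item{i}" for i in range(1, 11)]


def next_unused(bank, used):
    """Pick the first item that has not been administered yet."""
    return next(i for i in bank if i not in used)


def toy_rule(administered, responses):
    """Hypothetical rule: three correct answers pass, three incorrect fail."""
    correct = sum(responses)
    if correct >= 3:
        return "Pass"
    if len(responses) - correct >= 3:
        return "Fail"
    return None


result = run_cct(bank, next_unused, lambda item: 1, toy_rule, max_items=10)
```

Here an examinee who answers every item correctly is classified after three items; real termination criteria replace `toy_rule` with a statistical decision rule.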

Psychometric model

Two approaches are available for the psychometric model of a CCT: classical test theory (CTT) and item response theory (IRT). Classical test theory assumes a state model because it is applied by determining item parameters for a sample of examinees determined to be in each category. For instance, several hundred "masters" and several hundred "non-masters" might be sampled to determine item difficulty and discrimination for each group, but doing so requires that a distinct set of people in each group can be easily identified. IRT, on the other hand, assumes a trait model: the knowledge or ability measured by the test is a continuum. The classification groups must then be defined more or less arbitrarily along the continuum, for example by using a cutscore to demarcate masters and non-masters, but the specification of item parameters assumes a trait model.
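To make the trait model concrete, the following sketch uses the three-parameter logistic (3PL) model, one common IRT model; the choice of 3PL, the parameter names `a`, `b`, `c`, and the convention that an examinee exactly at the cutscore counts as a master are assumptions made for illustration.

```python
import math


def p_correct(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """3PL probability of a correct response.

    theta: examinee ability on the continuum
    a: discrimination, b: difficulty, c: pseudo-guessing lower asymptote
    """
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def true_category(theta: float, cutscore: float) -> str:
    """Classification groups are defined by a cutscore on the continuum."""
    return "master" if theta >= cutscore else "non-master"


# An examinee whose ability equals the item difficulty has a 50% chance
# of answering correctly when there is no guessing.
p_mid = p_correct(theta=0.0, a=1.0, b=0.0)
```

Under this model the probability of a correct response rises smoothly with ability, and "master" versus "non-master" is only a label applied to regions of the continuum.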

There are advantages and disadvantages to each. CTT offers greater conceptual simplicity. More importantly, CTT requires fewer examinees in the sample for calibration of item parameters to be used eventually in the design of the CCT, making it useful for smaller testing programs. See Frick (1992){{cite journal|last=Frick|first=T.|year=1992|title=Computerized Adaptive Mastery Tests as Expert Systems|journal=Journal of Educational Computing Research|volume=8|issue=2|pages=187-213|doi=10.2190/J87V-6VWP-52G7-L4XX}} for a description of a CTT-based CCT. Most CCTs, however, utilize IRT. IRT offers greater specificity, but the most important reason may be that the design of a CCT (and a CAT) is expensive, and is therefore more likely done by a large testing program with extensive resources. Such a program would likely use IRT.

Starting point

A CCT must have a specified starting point to enable certain algorithms. If the sequential probability ratio test is used as the termination criterion, it implicitly assumes a starting ratio of 1.0 (equal probability of the examinee being a master or non-master). If the termination criterion is a confidence interval approach, a starting point on theta must be specified. Usually this is 0.0, the center of the distribution, but it could also be drawn at random from a distribution if the parameters of the examinee distribution are known. Prior information about an individual examinee, such as the score from a previous attempt when re-taking the test, may also be used.
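The starting-point options can be summarized in a small helper. This is a sketch under stated assumptions: the function name `starting_theta` and its parameters are hypothetical, and the examinee distribution is assumed to be normal with known mean and standard deviation.

```python
import random
from typing import Optional

# The SPRT starts from a likelihood ratio of 1.0, i.e. no initial
# preference for "master" over "non-master".
SPRT_STARTING_RATIO = 1.0


def starting_theta(
    previous_theta: Optional[float] = None,
    mean: float = 0.0,
    sd: float = 1.0,
    randomize: bool = False,
    rng: Optional[random.Random] = None,
) -> float:
    """Choose the initial theta for a confidence-interval CCT."""
    if previous_theta is not None:
        # Prior information, e.g. the estimate from an earlier attempt.
        return previous_theta
    if randomize:
        # Random draw from the (assumed normal) examinee distribution.
        return (rng or random).gauss(mean, sd)
    # Default: the center of the distribution.
    return mean
```

For example, `starting_theta()` gives the usual default at the center of the distribution, while `starting_theta(previous_theta=0.8)` reuses an earlier estimate for a re-taker.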

Item selection

In a CCT, items are selected for administration throughout the test, unlike the traditional method of administering a fixed set of items to all examinees. While this is usually done by individual item, it can also be done in groups of items known as testlets (Luecht & Nungester, 1998;{{cite journal|last1=Luecht|first1=R. M.|last2=Nungester|first2=R. J.|year=1998|title=Some practical examples of computer-adaptive sequential testing|journal=Journal of Educational Measurement|volume=35|pages=229-249|doi=10.1111/j.1745-3984.1998.tb00537.x}} Vos & Glas, 2000{{cite book|last1=Vos|first1=H.J.|last2=Glas|first2=C.A.W.|year=2000|chapter=Testlet-based adaptive mastery testing|editor1-last=van der Linden|editor1-first=W.J.|editor2-last=Glas|editor2-first=C.A.W.|title=Computerized Adaptive Testing: Theory and Practice|doi=10.1007/0-306-47531-6_15}}).

Methods of item selection fall into two categories: cutscore-based and estimate-based. Cutscore-based methods (also known as sequential selection) maximize the information provided by the item at the cutscore, or at the cutscores if there is more than one, regardless of the ability of the examinee. Estimate-based methods (also known as adaptive selection) maximize information at the current estimate of examinee ability, regardless of the location of the cutscore. Both work efficiently, but the efficiency depends in part on the termination criterion employed. Because the sequential probability ratio test only evaluates probabilities near the cutscore, cutscore-based item selection is more appropriate for it. Because the confidence interval termination criterion is centered on the examinee's ability estimate, estimate-based item selection is more appropriate for it: the test makes a classification when the confidence interval is small enough to lie completely above or below the cutscore (see below). The confidence interval is smaller when the standard error of measurement is smaller, and the standard error of measurement is smaller when there is more information at the theta level of the examinee.
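The two selection strategies differ only in the point on theta at which information is maximized. The sketch below assumes a 3PL item bank and uses Fisher information as the information measure; the bank contents, the function names, and the example ability estimate are hypothetical.

```python
import math
from typing import Dict, Iterable, Tuple

Item = Tuple[float, float, float]  # (a, b, c) under an assumed 3PL model


def p_correct(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def information(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """Fisher information of a 3PL item at theta."""
    p = p_correct(theta, a, b, c)
    return a ** 2 * ((1.0 - p) / p) * ((p - c) / (1.0 - c)) ** 2


def select_item(bank: Dict[str, Item], used: Iterable[str], point: float) -> str:
    """Return the unused item with maximum information at `point`.

    Cutscore-based selection passes the cutscore as `point`;
    estimate-based selection passes the current theta estimate.
    """
    used = set(used)
    candidates = [name for name in bank if name not in used]
    return max(candidates, key=lambda name: information(point, *bank[name]))


bank = {
    "easy": (1.0, -1.5, 0.0),
    "medium": (1.0, 0.0, 0.0),
    "hard": (1.0, 1.5, 0.0),
}
cutscore, theta_hat = 0.0, 1.4

cutscore_based = select_item(bank, [], cutscore)    # picks "medium"
estimate_based = select_item(bank, [], theta_hat)   # picks "hard"
```

With the cutscore at 0.0, cutscore-based selection prefers the item whose difficulty sits at the cutscore, while estimate-based selection follows the able examinee toward the harder item.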

Termination criterion

There are three termination criteria commonly used for CCTs. Bayesian decision theory methods offer great flexibility by presenting an infinite choice of loss/utility structures and evaluation considerations, but also introduce greater arbitrariness. A confidence interval approach calculates a confidence interval around the examinee's current theta estimate at each point in the test, and classifies the examinee when the interval falls completely within a region of theta that defines a classification. This was originally known as adaptive mastery testing {{harv|Kingsbury|Weiss|1983}}, but does not necessarily require adaptive item selection, nor is it limited to the two-classification mastery testing situation. The sequential probability ratio test {{harv|Reckase|1983}} defines the classification problem as a hypothesis test that the examinee's theta is equal to a specified point above the cutscore or a specified point below the cutscore.
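The SPRT and confidence interval criteria can each be sketched as a short decision function. The code below is illustrative only: it assumes a 3PL model, uses Wald's standard decision bounds for the SPRT, and introduces hypothetical parameters that the text does not specify (`delta` for the half-width of the region around the cutscore, nominal error rates `alpha` and `beta`, and the normal quantile `z`).

```python
import math
from typing import Optional, Sequence, Tuple

Item = Tuple[float, float, float]  # (a, b, c) under an assumed 3PL model


def p_correct(theta: float, a: float, b: float, c: float = 0.0) -> float:
    """3PL probability of a correct response."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def sprt_decision(
    responses: Sequence[int],
    items: Sequence[Item],
    cutscore: float,
    delta: float = 0.3,
    alpha: float = 0.05,
    beta: float = 0.05,
) -> Optional[str]:
    """Test theta = cutscore + delta against theta = cutscore - delta."""
    hi, lo = cutscore + delta, cutscore - delta
    log_ratio = 0.0  # log of the starting ratio 1.0
    for u, (a, b, c) in zip(responses, items):
        p_hi, p_lo = p_correct(hi, a, b, c), p_correct(lo, a, b, c)
        if u == 1:
            log_ratio += math.log(p_hi / p_lo)
        else:
            log_ratio += math.log((1.0 - p_hi) / (1.0 - p_lo))
    if log_ratio >= math.log((1.0 - beta) / alpha):   # Wald upper bound
        return "Pass"
    if log_ratio <= math.log(beta / (1.0 - alpha)):   # Wald lower bound
        return "Fail"
    return None  # keep testing


def ci_decision(
    theta_hat: float, sem: float, cutscore: float, z: float = 1.96
) -> Optional[str]:
    """Classify once the interval theta_hat +/- z*sem excludes the cutscore."""
    if theta_hat - z * sem > cutscore:
        return "Pass"
    if theta_hat + z * sem < cutscore:
        return "Fail"
    return None  # interval still contains the cutscore


items = [(1.0, 0.0, 0.0)] * 10
after_five = sprt_decision([1] * 5, items[:5], cutscore=0.0)
after_ten = sprt_decision([1] * 10, items, cutscore=0.0)
```

In this example, five consecutive correct responses are not yet enough for the SPRT to decide, whereas ten are; either function could serve as the termination check inside a CCT administration loop.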

References

{{Reflist}}

Bibliography of CCT research

{{refbegin}}

  • {{Cite journal |last=Armitage |first=P. |year=1950 |title=Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis |journal=Journal of the Royal Statistical Society |volume=12 |pages=137–144}}
  • {{Cite book |last=Braun |first=H. |last2=Bejar |first2=I.I. |last3=Williamson |first3=D.M. |year=2006 |chapter=Rule-based methods for automated scoring: Application in a licensing context |editor-last=Williamson |editor-first=D.M. |editor2-last=Mislevy |editor2-first=R.J. |editor3-last=Bejar |editor3-first=I.I. |title=Automated scoring of complex tasks in computer-based testing |location=Mahwah, NJ |publisher=Erlbaum}}
  • {{Cite journal |last=Dodd |first=B.G. |last2=De Ayala |first2=R.J. |last3=Koch |first3=W.R. |year=1995 |title=Computerized adaptive testing with polytomous items |journal=Applied Psychological Measurement |volume=19 |pages=5–22}}
  • {{Cite journal |last=Eggen |first=T.J.H.M. |year=1999 |title=Item selection in adaptive testing with the sequential probability ratio test |journal=Applied Psychological Measurement |volume=23 |pages=249–261}}
  • {{Cite journal |last=Eggen |first=T.J.H.M. |last2=Straetmans |first2=G.J.J.M. |year=2000 |title=Computerized adaptive testing for classifying examinees into three categories |journal=Educational and Psychological Measurement |volume=60 |pages=713–734}}
  • {{Cite conference |last=Epstein |first=K.I. |last2=Knerr |first2=C.S. |year=1977 |title=Applications of sequential testing procedures to performance testing |conference=1977 Computerized Adaptive Testing Conference |location=Minneapolis, MN}}
  • {{Cite thesis |last=Ferguson |first=R.L. |year=1969 |title=The development, implementation, and evaluation of a computer-assisted branched test for a program of individually prescribed instruction |type=PhD, unpublished |publisher=University of Pittsburgh}}
  • {{Cite journal |last=Frick |first=T.W. |year=1989 |title=Bayesian adaptation during computer-based tests and computer-guided exercises |journal=Journal of Educational Computing Research |volume=5 |pages=89–114}}
  • {{Cite journal |last=Frick |first=T.W. |year=1990 |title=A comparison of three decisions models for adapting the length of computer-based mastery tests |journal=Journal of Educational Computing Research |volume=6 |pages=479–513}}
  • {{Cite journal |last=Frick |first=T.W. |year=1992 |title=Computerized adaptive mastery tests as expert systems |journal=Journal of Educational Computing Research |volume=8 |pages=187–213}}
  • {{Cite report |last=Huang |first=C.-Y. |last2=Kalohn |first2=J.C. |last3=Lin |first3=C.-J. |last4=Spray |first4=J. |year=2000 |title=Estimating Item Parameters from Classical Indices for Item Pool Development with a Computerized Classification Test |type=Research Report 2000–4 |location=Iowa City, IA |publisher=ACT, Inc.}}
  • {{Cite thesis |last=Jacobs-Cassuto |first=M.S. |year=2005 |title=A Comparison of Adaptive Mastery Testing Using Testlets With the 3-Parameter Logistic Model |type=PhD, unpublished |publisher=University of Minnesota, Minneapolis, MN}}
  • {{Cite conference |last=Jiao |first=H. |last2=Lau |first2=A.C. |title=The Effects of Model Misfit in Computerized Classification Test |conference=Annual meeting of the National Council on Measurement in Education |location=Chicago, IL |date=April 2003}}
  • {{Cite conference |last=Jiao |first=H. |last2=Wang |first2=S. |last3=Lau |first3=C.A. |title=An Investigation of Two Combination Procedures of SPRT for Three-category Classification Decisions in Computerized Classification Test |conference=Annual meeting of the American Educational Research Association |location=San Antonio |date=April 2004}}
  • {{Cite journal |last=Kalohn |first=J.C. |last2=Spray |first2=J.A. |year=1999 |title=The effect of model misspecification on classification decisions made using a computerized test |journal=Journal of Educational Measurement |volume=36 |pages=47–59}}
  • {{Cite report |last=Kingsbury |first=G.G. |last2=Weiss |first2=D.J. |year=1979 |title=An adaptive testing strategy for mastery decisions |type=Research report 79–05 |location=Minneapolis |publisher=University of Minnesota, Psychometric Methods Laboratory}}
  • {{Cite book |last=Kingsbury |first=G.G. |last2=Weiss |first2=D.J. |year=1983 |chapter=A comparison of IRT-based adaptive mastery testing and a sequential mastery testing procedure |editor-link=David J. Weiss |editor-first=D.J. |editor-last=Weiss |title=New horizons in testing: Latent trait theory and computerized adaptive testing |pages=237–254 |location=New York |publisher=Academic Press}}
  • {{Cite thesis |last=Lau |first=C.A. |year=1996 |title=Robustness of a unidimensional computerized testing mastery procedure with multidimensional testing data |type=PhD, unpublished |publisher=University of Iowa, Iowa City IA}}
  • {{Cite conference |last=Lau |first=C.A. |last2=Wang |first2=T. |year=1998 |title=Comparing and combining dichotomous and polytomous items with SPRT procedure in computerized classification testing |conference=Annual meeting of the American Educational Research Association |location=San Diego}}
  • {{Cite conference |last=Lau |first=C.A. |last2=Wang |first2=T. |year=1999 |title=Computerized classification testing under practical constraints with a polytomous model |conference=Annual meeting of the American Educational Research Association |location=Montreal, Canada}}
  • {{Cite conference |last=Lau |first=C.A. |last2=Wang |first2=T. |year=2000 |title=A new item selection procedure for mixed item type in computerized classification testing |conference=Annual meeting of the American Educational Research Association |location=New Orleans, Louisiana}}
  • {{Cite journal |last=Lewis |first=C. |last2=Sheehan |first2=K. |year=1990 |title=Using Bayesian decision theory to design a computerized mastery test |journal=Applied Psychological Measurement |volume=14 |pages=367–386}}
  • {{Cite report |last=Lin |first=C.-J. |last2=Spray |first2=J.A. |year=2000 |title=Effects of item-selection criteria on classification testing with the sequential probability ratio test |type=Research Report 2000–8 |location=Iowa City, IA |publisher=ACT, Inc.}}
  • {{Cite journal |last=Linn |first=R.L. |last2=Rock |first2=D.A. |last3=Cleary |first3=T.A. |year=1972 |title=Sequential testing for dichotomous decisions |journal=Educational & Psychological Measurement |volume=32 |pages=85–95}}
  • {{Cite journal |last=Luecht |first=R.M. |year=1996 |title=Multidimensional Computerized Adaptive Testing in a Certification or Licensure Context |journal=Applied Psychological Measurement |volume=20 |pages=389–404}}
  • {{Cite book |authorlink=Mark Reckase |last=Reckase |first=M.D. |year=1983 |chapter=A procedure for decision making using tailored testing |editor-first=D.J. |editor-last=Weiss |title=New horizons in testing: Latent trait theory and computerized adaptive testing |pages=237–254 |location=New York |publisher=Academic Press}}
  • {{Cite conference |last=Rudner |first=L.M. |title=An examination of decision-theory adaptive testing procedures |conference=Annual meeting of the American Educational Research Association |date=1–5 April 2002 |location=New Orleans, LA}}
  • {{Cite journal |last=Sheehan |first=K. |last2=Lewis |first2=C. |year=1992 |title=Computerized mastery testing with nonequivalent testlets |journal=Applied Psychological Measurement |volume=16 |pages=65–76}}
  • {{Cite report |last=Spray |first=J.A. |year=1993 |title=Multiple-category classification using a sequential probability ratio test |type=Research Report 93–7 |location=Iowa City, Iowa |publisher=ACT, Inc.}}
  • {{Cite report |last=Spray |first=J.A. |last2=Abdel-fattah |first2=A.A. |last3=Huang |first3=C. |last4=Lau |first4=C.A. |year=1997 |title=Unidimensional approximations for a computerized test when the item pool and latent space are multidimensional |type=Research Report 97–5 |location=Iowa City, Iowa |publisher=ACT, Inc.}}
  • {{Cite report |last=Spray |first=J.A. |last2=Reckase |first2=M.D. |year=1987 |title=The effect of item parameter estimation error on decisions made using the sequential probability ratio test |type=Research Report 87–17 |location=Iowa City, IA |publisher=ACT, Inc.}}
  • {{Cite conference |last=Spray |first=J.A. |last2=Reckase |first2=M.D. |title=The selection of test items for decision making with a computerized adaptive test |conference=Annual Meeting of the National Council on Measurement in Education |location=New Orleans, LA |date=5–7 April 1994}}
  • {{Cite journal |last=Spray |first=J.A. |last2=Reckase |first2=M.D. |year=1996 |title=Comparison of SPRT and sequential Bayes procedures for classifying examinees into two categories using a computerized test |journal=Journal of Educational & Behavioral Statistics |volume=21 |pages=405–414}}
  • {{Cite journal |last=Thompson |first=N.A. |year=2006 |title=Variable-length computerized classification testing with item response theory |journal=CLEAR Exam Review |volume=17 |issue=2}}
  • {{Cite journal |last=Vos |first=H.J. |year=1998 |title=Optimal sequential rules for computer-based instruction |journal=Journal of Educational Computing Research |volume=19 |pages=133–154}}
  • {{Cite journal |last=Vos |first=H.J. |year=1999 |title=Applications of Bayesian decision theory to sequential mastery testing |journal=Journal of Educational and Behavioral Statistics |volume=24 |pages=271–292}}
  • {{Cite book |last=Wald |first=A. |year=1947 |title=Sequential analysis |location=New York |publisher=Wiley}}
  • {{Cite journal |last=Weiss |first=D.J. |last2=Kingsbury |first2=G.G. |year=1984 |title=Application of computerized adaptive testing to educational problems |journal=Journal of Educational Measurement |volume=21 |pages=361–375}}
  • {{Cite conference |last=Weissman |first=A. |year=2004 |title=Mutual information item selection in multiple-category classification CAT |conference=Annual Meeting of the National Council on Measurement in Education |location=San Diego, CA}}
  • {{Cite journal |last=Weitzman |first=R.A. |year=1982a |title=Sequential testing for selection |journal=Applied Psychological Measurement |volume=6 |pages=337–351}}
  • {{Cite conference |last=Weitzman |first=R.A. |year=1982b |title=Use of sequential testing to prescreen prospective entrants into military service |editor-first=D. J. |editor-last=Weiss |conference=Proceedings of the 1982 Computerized Adaptive Testing Conference |location=Minneapolis, MN |publisher=University of Minnesota, Department of Psychology, Psychometric Methods Program}}

{{refend}}