> Since I did not know what ROC analysis was, I looked
> around in the Web and started reading about it. I work
> in the manufacturing world, and what I use is the usual
> t-tests, ANOVA, regression, etc. I plan to
> read more on ROC analysis, but from the little that I
> read, I believe this can also be used for industrial
> statistics. It has always been difficult to explain
> statistics to engineers but some of the things that
> I read on comparing populations seem straightforward.
> The question is, is the method adequate for
> applications other than medicine? Could you give me
> the benefit of using this over the usual tests of
> hypothesis?
Most fundamentally, receiver operating characteristic (ROC) analysis
quantifies accuracy in two-group classification tasks in terms of the
relationship, as a critical value is manipulated, between two
conditional probabilities, each of which is conditional upon actual
membership in one or the other of the two groups -- e.g., Prob(classify
as group 2 | actually a member of group 2) and Prob(classify as group 2
| actually a member of group 1). A graphical display of this
relationship constitutes an ROC curve. ROC analysis isn't a way of
testing hypotheses; however, hypothesis-testing methods have been
developed to assess the statistical significance of differences between
estimates of ROC curves or summary indices thereof.
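For a manufacturing reader, the definition above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from any of the readings: the function names, the made-up gauge readings, and the convention that a higher reading points toward group 2 are all assumptions. It sweeps the critical value over the observed readings, records the two conditional probabilities at each setting, and sums the area under the resulting empirical curve by the trapezoidal rule.

```python
def roc_points(group1_scores, group2_scores):
    """Empirical ROC operating points.

    A case is classified as group 2 when its score >= the critical value
    (assumption: higher scores indicate group 2). Each point is
    (P(classify as 2 | actually 1), P(classify as 2 | actually 2)).
    """
    # Sweep the critical value from the highest observed score to the lowest.
    thresholds = sorted(set(group1_scores) | set(group2_scores), reverse=True)
    points = [(0.0, 0.0)]  # critical value above every score: nothing called group 2
    for t in thresholds:
        fpf = sum(s >= t for s in group1_scores) / len(group1_scores)
        tpf = sum(s >= t for s in group2_scores) / len(group2_scores)
        points.append((fpf, tpf))
    return points


def area_under_curve(points):
    """Trapezoidal area under the empirical ROC points."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Hypothetical gauge readings: group 1 = conforming parts, group 2 = defective parts.
conforming = [1.0, 2.0, 3.0, 4.0]
defective = [3.5, 4.5, 5.0, 6.0]

points = roc_points(conforming, defective)
auc = area_under_curve(points)
for fpf, tpf in points:
    print(f"P(call defective | conforming) = {fpf:.2f}   "
          f"P(call defective | defective) = {tpf:.2f}")
print("area under the empirical ROC curve:", auc)
```

Each printed row is one setting of the critical value; plotting the second column against the first gives the ROC curve. The area is one common summary index of the kind whose differences the hypothesis-testing methods mentioned above are meant to assess.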
I hope that the reading list appended below may be helpful.
Charles Metz
----------------------------------------
Readings in ROC Analysis, with Emphasis on Medical Applications
Prepared by Charles E. Metz
Department of Radiology
The University of Chicago
BACKGROUND:
Egan JP. Signal detection theory and ROC analysis. New York: Academic
Press, 1975.
Fryback DG, Thornbury JR. The efficacy of diagnostic imaging. Med
Decis Making 1991; 11: 88.
Griner PF, Mayewski RJ, Mushlin AI, Greenland P. Selection and
interpretation of diagnostic tests and procedures: principles and
applications. Annals Int Med 1981; 94: 553.
International Commission on Radiation Units and Measurements. Medical
imaging: the assessment of image quality (ICRU Report 54). Bethesda, MD:
ICRU, 1996.
Lusted LB. Signal detectability and medical decision-making. Science
1971; 171: 1217.
McNeil BJ, Adelstein SJ. Determining the value of diagnostic and
screening tests. J Nucl Med 1976; 17: 439.
McNeil BJ, Keeler E, Adelstein SJ. Primer on certain elements of
medical decision making. New Engl J Med 1975; 293: 211.
Metz CE, Wagner RF, Doi K, Brown DG, Nishikawa RM, Myers KJ. Toward
consensus on quantitative assessment of medical imaging systems. Med
Phys 1995; 22: 1057.
National Council on Radiation Protection and Measurements. An
introduction to efficacy in diagnostic radiology and nuclear medicine
(NCRP Commentary 13). Bethesda, MD: NCRP, 1995.
Robertson EA, Zweig MH, Van Steirteghem AC. Evaluating the clinical
efficacy of laboratory tests. Am J Clin Path 1983; 79: 78.
Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a
fundamental evaluation tool in clinical medicine. Clinical Chemistry
1993; 39: 561. [Erratum published in Clinical Chemistry 1993; 39: 1589.]
GENERAL:
Hanley JA. Alternative approaches to receiver operating characteristic
analysis. Radiology 1988; 168: 568.
Hanley JA. Receiver operating characteristic (ROC) methodology: the
state of the art. Critical Reviews in Diagnostic Imaging 1989; 29: 307.
King JL, Britton CA, Gur D, Rockette HE, Davis PL. On the validity of
the continuous and discrete confidence rating scales in receiver
operating characteristic studies. Invest Radiol 1993; 28: 962.
Metz CE. Basic principles of ROC analysis. Seminars in Nucl Med 1978;
8: 283.
Metz CE. ROC methodology in radiologic imaging. Invest Radiol 1986;
21: 720.
Metz CE. Some practical issues of experimental design and data analysis
in radiological ROC studies. Invest Radiol 1989; 24: 234.
Metz CE. Evaluation of CAD methods. In Computer-Aided Diagnosis in
Medical Imaging (K Doi, H MacMahon, ML Giger and KR Hoffmann, eds.).
Amsterdam: Elsevier Science (Excerpta Medica International Congress
Series, Vol. 1182), pp. 543-554, 1999.
Metz CE. Fundamental ROC analysis. In: Handbook of Medical Imaging,
Vol. 1: Physics and Psychophysics (J Beutel, H Kundel and R Van Metter,
eds.). Bellingham, WA: SPIE Press, 2000, pp. 751-769.
Metz CE, Shen J-H. Gains in accuracy from replicated readings of
diagnostic images: prediction and assessment in terms of ROC analysis.
Med Decis Making 1992; 12: 60.
Rockette HE, Gur D, Metz CE. The use of continuous and discrete
confidence judgments in receiver operating characteristic studies of
diagnostic imaging techniques. Invest Radiol 1992; 27: 169.
Swets JA. ROC analysis applied to the evaluation of medical imaging
techniques. Invest Radiol 1979; 14: 109.
Swets JA. Indices of discrimination or diagnostic accuracy: their ROCs
and implied models. Psychol Bull 1986; 99: 100.
Swets JA. Measuring the accuracy of diagnostic systems. Science 1988;
240: 1285.
Swets JA. Signal detection theory and ROC analysis in psychology and
diagnostics: collected papers. Mahwah, NJ: Lawrence Erlbaum Associates, 1996.
Swets JA, Pickett RM. Evaluation of diagnostic systems: methods from
signal detection theory. New York: Academic Press, 1982.
Wagner RF, Beiden SV, Metz CE. Continuous vs. categorical data for ROC
analysis: Some quantitative considerations. Academic Radiol 2001; 8:
328.
BIAS:
Begg CB, Greenes RA. Assessment of diagnostic tests when disease
verification is subject to selection bias. Biometrics 1983; 39: 207.
Begg CB, McNeil BJ. Assessment of radiologic tests: control of bias and
other design considerations. Radiology 1988; 167: 565.
Gray R, Begg CB, Greenes RA. Construction of receiver operating
characteristic curves when disease verification is subject to selection
bias. Med Decis Making 1984; 4: 151.
Ransohoff DF, Feinstein AR. Problems of spectrum and bias in evaluating
the efficacy of diagnostic tests. New Engl J Med 1978; 299: 926.
CURVE FITTING:
Dorfman DD, Alf E. Maximum likelihood estimation of parameters of
signal detection theory and determination of confidence intervals --
rating method data. J Math Psych 1969; 6: 487.
Dorfman DD, Berbaum KS, Metz CE, Lenth RV, Hanley JA, Dagga HA. Proper
ROC analysis: the bigamma model. Academic Radiol 1997; 4: 138.
Grey DR, Morgan BJT. Some aspects of ROC curve-fitting: normal and
logistic models. J Math Psych 1972; 9: 128.
Hanley JA. The robustness of the "binormal" assumptions used in fitting
ROC curves. Med Decis Making 1988; 8: 197.
Metz CE, Herman BA, Shen J-H. Maximum-likelihood estimation of ROC
curves from continuously-distributed data. Stat Med 1998; 17: 1033.
Metz CE, Pan X. "Proper" binormal ROC curves: theory and
maximum-likelihood estimation. J Math Psych 1999; 43: 1.
Pan X, Metz CE. The "proper" binormal model: parametric ROC curve
estimation with degenerate data. Academic Radiol 1997; 4: 380.
Swensson RG. Unified measurement of observer performance in detecting
and localizing target objects on images. Med Phys 1996; 23: 1709.
Swets JA. Form of empirical ROCs in discrimination and diagnostic
tasks: implications for theory and measurement of performance. Psychol
Bull 1986; 99: 181.
STATISTICS:
Agresti A. A survey of models for repeated ordered categorical response
data. Statistics in Medicine 1989; 8: 1209.
Bamber D. The area above the ordinal dominance graph and the area below
the receiver operating characteristic graph. J Math Psych 1975; 12: 387.
Beiden SV, Wagner RF, Campbell G. Components-of-variance models and
multiple-bootstrap experiments: an alternative method for
random-effects, receiver operating characteristic analysis. Academic
Radiol 2000; 7: 341.
Beiden SV, Wagner RF, Campbell G, Metz CE, Jiang Y.
Components-of-variance models for random-effects ROC analysis: The case
of unequal variance structures across modalities. Academic Radiol
2001; 8: 605.
Beiden SV, Wagner RF, Campbell G, Chan H-P. Analysis of uncertainties
in estimates of components of variance in multivariate ROC analysis.
Academic Radiol 2001; 8: 616.
DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two
or more correlated receiver operating characteristic curves: a
nonparametric approach. Biometrics 1988; 44: 837.
Dorfman DD, Berbaum KS, Metz CE. ROC rating analysis: generalization to
the population of readers and cases with the jackknife method. Invest
Radiol 1992; 27: 723.
Dorfman DD, Berbaum KS, Lenth RV, Chen Y-F, Donaghy BA. Monte Carlo
validation of a multireader method for receiver operating characteristic
discrete rating data: factorial experimental design. Academic Radiol
1998; 5: 591.
Dorfman DD, Metz CE. Multi-reader multi-case ROC analysis: comments on
Begg's commentary. Academic Radiol 1995; 2 (Supplement 1): S76.
Hanley JA, McNeil BJ. The meaning and use of the area under a receiver
operating characteristic (ROC) curve. Radiology 1982; 143: 29.
Hanley JA, McNeil BJ. A method of comparing the areas under receiver
operating characteristic curves derived from the same cases. Radiology
1983; 148: 839.
Jiang Y, Metz CE, Nishikawa RM. A receiver operating characteristic
partial area index for highly sensitive diagnostic tests. Radiology
1996; 201: 745.
Ma G, Hall WJ. Confidence bands for receiver operating characteristic
curves. Med Decis Making 1993; 13: 191.
McClish DK. Analyzing a portion of the ROC curve. Med Decis Making
1989; 9: 190.
McClish DK. Determining a range of false-positive rates for which ROC
curves differ. Med Decis Making 1990; 10: 283.
McNeil BJ, Hanley JA. Statistical approaches to the analysis of
receiver operating characteristic (ROC) curves. Med Decis Making 1984;
4: 137.
Metz CE. Statistical analysis of ROC data in evaluating
...