Reviews
1 Departments of Biostatistics and Epidemiology and the Division of Radiology-Wb4, The Cleveland Clinic Foundation, Cleveland, OH. 2 Department of Pathology, University of Texas Southwestern Medical Center, Dallas, TX.
Address correspondence to this author at: Department of Biostatistics, The Cleveland Clinic Foundation, 9500 Euclid Ave., Cleveland, OH 44195-5196. Fax 216-444-3466; e-mail nobuchow{at}bio.ri.ccf.org.
Background: ROC curves have become the standard for describing and comparing the accuracy of diagnostic tests. Not surprisingly, ROC curves are used often by clinical chemists. Our aims were to observe how the accuracy of clinical laboratory diagnostic tests is assessed, compared, and reported in the literature; to identify common problems with the use of ROC curves; and to offer some possible solutions.
Methods: We reviewed every original work using ROC curves and published in Clinical Chemistry in 2001 or 2002. For each article we recorded phase of the research, prospective or retrospective design, sample size, presence/absence of confidence intervals (CIs), nature of the statistical analysis, and major analysis problems.
Results: Of 58 articles, 31% were phase I (exploratory), 50% were phase II (challenge), and 19% were phase III (advanced) studies. The studies increased in sample size from phase I to III and showed a progression in the use of prospective designs. Most phase I studies were powered to assess diagnostic tests with ROC areas ≥0.70. Thirty-eight percent of studies failed to include CIs for diagnostic test accuracy, or the CIs were constructed inappropriately. Thirty-three percent of studies provided insufficient analysis for comparing diagnostic tests. Other problems included dichotomization of the gold standard scale and inappropriate analysis of the equivalence of two diagnostic tests.
Conclusion: We identify available software and make some suggestions for sample size determination, testing for equivalence in diagnostic accuracy, and alternatives to a dichotomous classification of a continuous-scale gold standard. More methodologic research is needed in areas specific to clinical chemistry.
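As an illustration of the kind of reporting the abstract asks for (an ROC area accompanied by a CI), the following is a minimal sketch and not the authors' method. It assumes the empirical (Mann-Whitney) estimate of the ROC area and the Hanley and McNeil (1982) approximation for its standard error, which is only one of several available approaches. The function names and the test values are hypothetical and made up for the example.

```python
import math


def roc_area(diseased, healthy):
    """Empirical ROC area: the fraction of (diseased, healthy) pairs in which
    the diseased subject has the higher test value; ties count as 1/2."""
    wins = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(diseased) * len(healthy))


def hanley_mcneil_ci(auc, n_diseased, n_healthy, z=1.96):
    """Approximate CI for the ROC area using the Hanley-McNeil standard error.
    z=1.96 gives a nominal 95% interval; bounds are clipped to [0, 1]."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_diseased - 1) * (q1 - auc ** 2)
        + (n_healthy - 1) * (q2 - auc ** 2)
    ) / (n_diseased * n_healthy)
    se = math.sqrt(variance)
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


if __name__ == "__main__":
    # Hypothetical test results, invented for illustration only.
    diseased = [3.1, 4.2, 5.0, 2.8, 4.9]
    healthy = [1.9, 2.5, 3.0, 2.2, 3.3]

    auc = roc_area(diseased, healthy)
    lower, upper = hanley_mcneil_ci(auc, len(diseased), len(healthy))
    print(f"ROC area = {auc:.2f}, approximate 95% CI = ({lower:.2f}, {upper:.2f})")
```

With samples this small the interval is very wide, which is the point of reporting it: the ROC area alone says nothing about how precisely accuracy was estimated.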