Book, Software, and Web Site Reviews
Emeritus Professor of Biochemistry University of Western Ontario London, Ontario N6A 5C1, Canada
Statistical Computing, An Introduction to Data Analysis Using S-Plus. Michael J. Crawley. Chichester, UK: Wiley UK, 2002 (reprinted, with corrections, March 2003), 772 pp., $85.00, hardcover. ISBN 0-471-56040-5.
The Statistical Evaluation of Medical Tests for Classification and Prediction. Margaret S. Pepe. New York: Oxford University Press, 2003, 320 pp., $115.00, hardcover. ISBN 0-19-850984-7.
Introductory Biostatistics. Chap T. Le. Hoboken, NJ: Wiley-Interscience, A John Wiley & Sons Publication, 2003, 572 pp., $94.95, hardcover. ISBN 0-471-41816-1.
You always need to know ten times as much as you use
Quoted by E.A. Murphy in Biostatistics in Medicine (1982)
Teaching data analysis is not easy, and the time allowed is always far from sufficient
J.W. Tukey (1962)
As junior medical students in Glasgow in the 1950s, we learned statistics from a poor lecturer and an excellent book (M.J. Moroney, Facts from Figures, Penguin, 1951). Unfortunately this was before the availability of calculators, and therefore, no effort was made to use these techniques to examine the results we were currently producing in the practical classes of physiology and biochemistry. We therefore benefited little from that early exposure except to remember the t-distribution because of its association with beer.
I thought of Moroney's book when I opened Crawley's Statistical Computing. Both volumes are written in lively style; are lucid, comprehensive, and rigorous; and are illuminated throughout with flashes of dry humor. Crawley is a distinguished ecologist and a member of the Department of Biological Sciences at Imperial College in London, England, and has been involved in the teaching and research applications of statistical techniques for many years. One difference between Facts from Figures and Statistical Computing is now the ready availability of the computer, which allows a reader of the latter text to examine data and assess their statistical significance while using the book.
Crawley writes "This book is intended as both an introduction to and a reference manual for statistics and computing. It assumes nothing by way of background in either subject, and starts from absolute basics. All it takes for granted is an enthusiasm to learn. It covers everything from the simplest non-parametric techniques (e.g., the runs test) up to the most advanced modern methods (e.g., mixed effects modelling)". Later he insists "This is a statistics book for non-statisticians". He adds "The comments of successive generations of students on the annual statistical computing course have greatly improved the clarity of the presentation, and have helped me to understand which bits of statistical modelling and computing are particularly daunting for beginners".
The subtitle mentions that the data analyses used in the book require S-Plus, which is an interactive programming environment for data analysis and graphics. It originated from the S language (a system for programming with data) developed at AT&T's (now Lucent Technologies) Bell Laboratories and is commercially available (www.insightful.com). An independent implementation of the S language is also available as open source (and free of charge) software called R (www.r-project.org). If you do not have access to S-Plus, you can use R, although there are some differences between these languages, and Crawley addresses these on his web page (www.bio.ic.ac.uk/research/mjcraw/statcomp/).
While reading Crawley's text I wondered what statistical programs contributors to Clinical Chemistry most often used. A text word search of the journal's archives (from September 1965 onward) revealed that various SPSS statistical programs (www.spss.com) were used in 166 papers, whereas SAS (www.sas.com) was used in 133 contributions and Stata (www.stata.com) in 16 articles; fewer than 10 articles used S-Plus.
Crawley suggests that learning S-Plus will change the way one does statistics, but it will not be easy. Of relevance is a quote by Dr. Terry Therneau (Mayo Clinic): "Something that will take me 24 h in S-Plus, will take me 24 days in SAS" (1). My experience of Crawley's technique is that his clarity, his profound statistical insights, his facility with S-Plus, and his stepwise approach to even the most complex statistical procedure are illuminating in a fashion that other S-Plus books often fail to achieve. The result demonstrates how valuable student comments are, as noted earlier, when directed to a committed teacher.
The book has 36 chapters, nearly all of which are accompanied by a list for further reading, and the book concludes with an extensive bibliography and an extremely comprehensive, very useful, 27-page index containing all of the S-Plus commands used in the text. Essentially there are four parts to the book: the elements of statistics (central tendency, probability, variance, normal distribution, classical tests, regression, and ANOVA); an extension of these elements (ANCOVA and the main classes of generalized linear models); more advanced topics [further examination of the generalized linear models (GLM) and more advanced aspects of ANOVA]; and finally, a host of advanced techniques, such as bootstrap and jackknife, tree models, nonparametric smoothing, survival analysis, time series analysis, mixed-effects models, and spatial statistics. Several chapters amount to a minicourse on sound statistical techniques: statistical methods; experimental design; power calculations; statistical models in S-Plus; understanding data (graphical and tabular analysis); model criticism and simplification; and graphs, functions, and transformations. As concepts develop, page references are provided for more advanced treatment later in the text that in turn are usefully back-referenced, making it convenient for review.
Crawley, as noted previously, has a web page that includes all downloadable data files and programs (script files) used in his book, making it easy to perform the examples without having to type (S-Plus has a very useful function, the up-arrow key, that reproduces all previous commands, thus saving much time). He also intends to have corrections, exercises, and additional chapters available on his book page. This is a book with a real future.
What topics were not covered? S-Plus provides several functions for performing quality control. Disappointingly, Crawley did not address these routines. Moroney, incidentally, examined problems of quality control at length. Another omission, in my view, was ROC analysis, which is an important part of practical statistical activity. Crawley explains Bayes' theorem in terms of conditional probability but omits the easier likelihood ratio approach, which is the form most used in medical diagnostics. I have previously mentioned the comprehensive, and excellent, subject index, so I was disappointed to see that literature references were not indexed to the chapters in which they were cited. This type of indexing (or, alternatively, the provision of an author index) is extremely valuable when following up on a citation. An example of this useful practice can be viewed in Barnett & Lewis's Outliers in Statistical Data, 3rd edition. Although Crawley deals with outliers, he did not cite this seminal text. I experienced errors and omissions in some of the scripts that accordingly did not produce the anticipated results; these, however, forced effective problem solving. Finally, any user of S-Plus soon runs into cryptic error messages when one's enthusiasm outruns one's expertise. It would have been valuable to have Crawley's advice on how to cope with errors.
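The likelihood ratio form of Bayes' theorem mentioned above can be illustrated with a minimal Python sketch (Python is used here purely for illustration; it is not the language of the book). The function name and the numbers are hypothetical and are not taken from Crawley's text. The only relation assumed is the standard one: post-test odds equal pre-test odds multiplied by the likelihood ratio.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Odds form of Bayes' theorem: post-test odds = pre-test odds x LR."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must lie strictly between 0 and 1")
    # Convert the probability to odds, update, and convert back.
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Hypothetical example: 10% pre-test probability and a positive
# likelihood ratio of 8 (illustrative values only).
print(round(post_test_probability(0.10, 8.0), 3))
```

With these illustrative inputs, a positive result raises the probability of disease from 10% to roughly 47%, which is the kind of bedside arithmetic the conditional-probability presentation tends to obscure.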
I believe that this book is an outstanding and masterful contribution to statistical thought. It will be of immense value for trainees and established workers alike in the field of clinical chemistry.
Pepe's book is an addition to the literature on ROC analysis, until now represented in volumes by J.P. Egan (Signal Detection Theory and ROC Analysis, 1975), J.A. Swets and R.M. Pickett (Evaluation of Diagnostic Systems: Methods from Signal Detection Theory, 1982), and H.C. Kraemer (Evaluating Medical Tests: Objective and Quantitative Guidelines, 1992). Her book contains nine chapters, end-of-chapter exercises (but no solutions), a bibliography, and a subject index, but it sadly lacks an author index. She is donating book royalties to the charity Doctors Without Borders. Pepe is a Professor of Biostatistics at the University of Washington and the Fred Hutchinson Cancer Research Center. The data sets, and the Stata programs used in the book, can be accessed on-line (www.fhcrc.org/labs/pepe/book). Pepe sets out "to provide a systematic framework for the statistical theory and practice of research studies that seek to evaluate clinical tests used in the practice of medicine". She hopes that it will be found useful for "practising biostatisticians and more academic research biostatisticians". Clinical chemists? Laboratory physicians?
The opening chapter outlines the criteria for a useful diagnostic/screening test and the elements of study design. The seminal paper by Ransohoff and Feinstein (2), although the basis for much of this section, was not cited. The chapter closes with a description of the seven valuable data sets used to illustrate the book's methodologies. The following chapter deals with measures of accuracy for binary tests, i.e., diseased/nondiseased and tested positive/tested negative. The three measures of diagnostic accuracy (disease-specific classification probabilities: true and false positive fractions; predictive values: positive and negative predictive values; and likelihood ratios) are illustrated with one of the available data sets containing a cohort study of 1465 individuals. The relative merits of each measure are discussed and illustrated in an extremely useful tabulation. Likelihood ratios have received increasing attention because they quantify the increase of knowledge about the presence (or absence) of disease through the diagnostic testing process. Pepe warns, however, of a growing realization that for a variety of reasons, discussed more fully later, there is no basis for the assumption regarding the constancy of test sensitivity and specificity, which Kraemer (cited earlier) calls "the myth".
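The three families of accuracy measures for a binary test can be computed from a single 2 x 2 table, as in the following minimal Python sketch. The function name and the counts are hypothetical; they are not drawn from Pepe's cohort data set, and the book's own programs are written in Stata.

```python
def binary_accuracy(tp, fp, fn, tn):
    """Accuracy measures for a binary test from 2 x 2 counts.

    tp, fn: diseased subjects testing positive / negative
    fp, tn: nondiseased subjects testing positive / negative
    Assumes every cell is non-zero so that no ratio divides by zero.
    """
    tpf = tp / (tp + fn)  # true positive fraction (sensitivity)
    fpf = fp / (fp + tn)  # false positive fraction (1 - specificity)
    return {
        "TPF": tpf,
        "FPF": fpf,
        "PPV": tp / (tp + fp),  # positive predictive value
        "NPV": tn / (tn + fn),  # negative predictive value
        "LR+": tpf / fpf,  # positive likelihood ratio
        "LR-": (1.0 - tpf) / (1.0 - fpf),  # negative likelihood ratio
    }


# Hypothetical counts for 1000 subjects (illustrative only).
for name, value in binary_accuracy(tp=80, fp=90, fn=20, tn=810).items():
    print(f"{name}: {value:.3f}")
```

The classification probabilities and likelihood ratios are computed within disease groups, whereas the predictive values also depend on the prevalence in the sample, which is one reason the measures are not interchangeable.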
Chapters three and six address the regression modeling framework. Pepe identifies a range of factors that may, potentially, affect test performance, such as the age and gender of the tested individual, the conditions under which tests are administered and run, and of course, the disease manifestations and the nondisease state [Ref. (2) again, but not cited]. Regression analysis may be used to assess the importance of such factors as well as to compare different paired (preferred) or unpaired test results. The advantage of regression modeling is that the analysis can control for concomitant factors. To appreciate the power of such techniques it is necessary to become familiar with the concepts of GLM and generalized estimating equations (GEE). Unfortunately, these techniques are not described in the standard introductory medical statistics texts, and more advanced texts (B. Everitt, Modern Medical Statistics: A Practical Guide, 2003) have to be consulted (Crawley's Statistical Computing, reviewed above, also provides considerable advice on modeling techniques). These approaches can be applied to all three of the diagnostic accuracy measures described earlier.
The next two chapters deal with the ROC curve and will be familiar to most laboratory workers. This technique, described by Pepe as "the best-developed statistical tool for describing the performance of [continuous or ordinal scaled] tests", has been in use for many decades. Pepe insists that "ROC curves have nothing to do with the particular distributions of the tests results but rather quantify the relationships between distributions". The most common summary index for ROC curves is the area under the curve, but there are other frequently used indices: a specific ROC point, partial area under the curve, symmetry point (where sensitivity = specificity), and the Kolmogorov-Smirnov index (the maximum vertical distance between the ROC curve and the 45° line).
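Two of the summary indices listed above, the area under the curve and the Kolmogorov-Smirnov index, can be sketched for an empirical ROC curve in a few lines of Python. The function names and the test values are hypothetical, and the sketch assumes that larger test values are more indicative of disease; it is not Pepe's implementation.

```python
def empirical_roc(diseased, nondiseased):
    """Empirical ROC points (FPF, TPF), one per distinct threshold.

    A result counts as positive when it is >= the threshold.
    """
    thresholds = sorted(set(diseased) | set(nondiseased), reverse=True)
    points = [(0.0, 0.0)]
    for c in thresholds:
        fpf = sum(y >= c for y in nondiseased) / len(nondiseased)
        tpf = sum(y >= c for y in diseased) / len(diseased)
        points.append((fpf, tpf))
    return points  # the last point is always (1.0, 1.0)


def area_under_curve(points):
    """Trapezoidal area under the empirical ROC curve."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def ks_index(points):
    """Maximum vertical distance between the ROC curve and the 45-degree line."""
    return max(tpf - fpf for fpf, tpf in points)


# Hypothetical test results (illustrative only).
roc = empirical_roc(diseased=[3, 5, 6, 7, 8], nondiseased=[1, 2, 3, 4, 6])
print(round(area_under_curve(roc), 2), round(ks_index(roc), 2))
```

Because only the ordering of the results enters the calculation, any monotone increasing transformation of the test values leaves both indices unchanged, which is the point of Pepe's remark about distributions.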
Pepe devotes a whole chapter to the problem of incomplete data and imperfect reference tests, a recurring practical and theoretical irritation during test assessments. She considers three scenarios: verification-biased sampling, verification restricted to screen positives, and imperfect reference tests. When screened-positive tested cases are verified for disease (but not screened-negative cases), this verification bias always produces an increase in sensitivity and a decrease in specificity compared with their true values. This bias may be corrected by application of Bayes' theorem [the Begg and Greenes estimates (3)(4)]. Extreme verification bias results when the gold standard test is applied only to cases at high risk of disease. The approach to this problem is theoretically more complex and probably not entirely satisfactory. Imperfect reference tests are well known. Two examples are culturing the organism causing an infection and obtaining a cancerous biopsy specimen. In each case, improvements in molecular biological and imaging techniques are making these sources of error less common, but they do still occur. Two suggested approaches are use of Bayesian methodology and latent class analysis.
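The Bayes' theorem correction for verification bias can be sketched as follows in Python. This is a simplified illustration of the Begg and Greenes idea under the assumption that the decision to verify depends only on the test result; the function name and all counts are hypothetical and do not come from the book.

```python
def begg_greenes(n_pos, n_neg, ver_pos_d, ver_pos_nd, ver_neg_d, ver_neg_nd):
    """Verification-bias-corrected sensitivity and specificity.

    n_pos, n_neg: all subjects testing positive / negative
    ver_pos_d, ver_pos_nd: verified test-positives, diseased / nondiseased
    ver_neg_d, ver_neg_nd: verified test-negatives, diseased / nondiseased
    Assumes verification depends only on the test result.
    """
    # Disease probability given the test result, estimated from verified cases.
    p_d_given_pos = ver_pos_d / (ver_pos_d + ver_pos_nd)
    p_d_given_neg = ver_neg_d / (ver_neg_d + ver_neg_nd)
    # Expected numbers of diseased subjects among ALL tested subjects.
    d_pos = p_d_given_pos * n_pos
    d_neg = p_d_given_neg * n_neg
    sensitivity = d_pos / (d_pos + d_neg)
    specificity = (n_neg - d_neg) / ((n_neg - d_neg) + (n_pos - d_pos))
    return sensitivity, specificity


# Hypothetical study: 200 positives (160 verified), 800 negatives (80 verified).
sens, spec = begg_greenes(200, 800, ver_pos_d=120, ver_pos_nd=40,
                          ver_neg_d=8, ver_neg_nd=72)
naive_sens = 120 / (120 + 8)   # verified cases only
naive_spec = 72 / (72 + 40)    # verified cases only
print(round(naive_sens, 3), round(sens, 3))
print(round(naive_spec, 3), round(spec, 3))
```

In this made-up example the estimates from verified cases alone overstate sensitivity and understate specificity relative to the corrected values, which is the direction of bias described above.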
The penultimate chapter examines the five phases of the research required for the development of a test and the importance of suitable sample sizes. Pepe concludes the volume with consideration of meta-analysis [using the Moses algorithm: the summary ROC (sROC) curve (5)], incorporating the time dimension using the disease state as a time-dependent variable, and a discussion of combining information from multiple tests.
Although there were several mentions of the Zweig and Campbell review (6), no reference was made to Shultz's important multiROC approach (7). One annoying feature of this text was the extremely poor indexing. For example, although terms such as sensitivity and specificity (or their synonyms) occur throughout the text, they are not readily reached by use of the index, and although the term "bootstrap" is mentioned many times in the text, it is not indexed at all. The term "bronze standard" (a less reliable gold standard) was incorrectly indexed.
In summary, Pepe's book is a useful but demanding introduction to the present status of the field. Potential users of this volume might like to consider Statistical Methods in Diagnostic Medicine (2002) by Zhou, Obuchowski, and McClish as an alternative entry to the current literature.
Le's book, Introductory Biostatistics, incorporates much of the content of his earlier text (Health and Numbers: A Problem-Based Introduction to Biostatistics) with the aim of providing an introductory text for students of the human health disciplines. Le is Distinguished Professor of Biostatistics and Director of Biostatistics, Comprehensive Cancer Center, University of Minnesota. There are 12 chapters, each with a set of very extensive and useful exercises (many are provided with comprehensive answers at the end of the book). Le opens, unconventionally, with a chapter on methods of categorical data, introducing concepts of proportions and rates with many illustrative examples. The terms "test sensitivity" and "specificity" are defined, but the use of a diagram to illustrate their interrelationship would have been more useful as a teaching tool than the rather unilluminating text. The chapter also includes a fairly comprehensive outline of Microsoft's Excel program and a brief mention of the SAS program (is this appropriate in an introductory text?). The last exercise in this chapter contains a very large data file with the annotation that an electronic copy is available from the author, although no author contact address is provided. The publisher has a web page for the book and lists a download site for the book's data sets, but when I accessed that site (May 27, 2003), I obtained the message "page not found".
The second chapter returns to a conventional approach to data analysis: graphs and simple statistical descriptions. Le uses the standard, five-mark, tally system despite its known susceptibility to error. He could usefully have introduced his audience to the more reliable Tukey 10-item square tally (4 points, 4 sides, and 2 diagonals). The next chapter deals with probability models, including the use of diagnostic test results. Here the author needs student feedback. The explanations are less helpful than they could be (for example, the 2 x 2 table could be labeled with true and false positives, and so on, instead of defining these terms in the text). Bayes' theorem is described in terms of conditional probabilities, and the likelihood ratio approach is ignored, although he had previously introduced the concept of odds. The targeted audience is going to use this latter approach to diagnostic tests and should have been introduced to that usage. The description of the kappa statistic is marred by a nomenclature change during the explanation. The remainder of the text covers the standard elementary materials, but includes ROC curves, logistic regression, analysis of survival data (Le's interest), and study design.
The subject index is adequate, but there are many typographic errors in both figure legends and labels. At the Arabian Gulf University in Bahrain, we taught biostatistics to our first-year medical students using Essentials of Medical Statistics by Kirkwood and Basic and Clinical Biostatistics by Dawson-Saunders and Trapp. Would I recommend a change to Le's Introductory Biostatistics? Not until there is convincing evidence of active student feedback.
References