Clinical Chemistry
Clinical Chemistry 44: 622-631, 1998;
(Clinical Chemistry. 1998;44:622-631.)
© 1998 American Association for Clinical Chemistry, Inc.


Laboratory Management

A robust approach to reference interval estimation and evaluation

Paul S. Horn1,a, Amadeo J. Pesce2, and Bradley E. Copeland2

Departments of
1 Mathematical Sciences and
2 Pathology and Laboratory Medicine, University of Cincinnati, Cincinnati, OH 45221.
a Address correspondence to this author at: Department of Mathematical Sciences, University of Cincinnati, PO Box 210025, Cincinnati, OH 45221-0025. Fax 513-556-3417; e-mail paul.horn@uc.edu.


   Abstract
We propose a new methodology for the estimation of reference intervals for data sets with small numbers of observations or for those with substantial numbers of outliers. We propose a prediction interval that uses robust estimates of location and scale. The SAS software can be readily modified to do these calculations. We compared four reference interval procedures (nonparametric, transformed, robust with a nonparametric lower limit, and transformed robust) for sample sizes of 20, 40, 60, 80, 100, and 120 from $\chi^2$ distributions of 1, 4, 7, and 10 df. $\chi^2$ distributions were chosen because they simulate the skewness of distributions often found in clinical chemistry populations. We used the root mean square error as the measure of performance and used computer simulation to calculate this measure. The robust estimator showed the best performance for small sample sizes. As the sample size increased, the performance values converged. The robust method for calculating upper reference interval values yields reasonable results. In two examples using real data for haptoglobin and glucose, the robust estimator provides slightly smaller upper reference limits than the other procedures. Lastly, the robust estimator was compared with the other procedures in a population where 5% of the values were multiplied by a factor of 5. The reference intervals were calculated with and without outlier detection. In this case, the robust approach consistently yielded upper reference interval values that were closer to those of the true underlying distributions. We propose that robust statistical analysis can be of great use for determinations of reference intervals from limited or possibly unreliable data.


   Introduction
The concept of a reference interval in medicine is based on determining a set of values within which some percentage, 95% for example, of the values of a particular analyte in a healthy population would fall. This interval is then used for medical decision making. Recommendations on how to obtain such reference intervals have focused on the types of statistics best used to calculate such a reference interval. These parameters then determine the number of individual specimens required to describe the reference interval with a high degree of confidence. Laboratories are mandated by the College of American Pathologists and the Joint Commission for the Accreditation of Health Organizations and Health Care Finance Administration to determine reference intervals for the populations they serve. Currently, NCCLS guidelines recommend samples of 120 individuals for parametric and 200 individuals for nonparametric interval determination. Very often it is not possible to obtain the suggested number of 120 individuals of a specific group to define a reference interval. In some cases, only 20 or 40 individuals in a particular group may be available for study, which is not an ideal sample because of the potential for large errors in the resulting estimates. In our own experience, we found it virtually impossible to obtain sufficient numbers to determine the reference interval for the metabolism of the drug lidocaine into its metabolite methylxylidide because it was difficult to get volunteers who were both healthy and willing to have lidocaine injected into them. In addition, the cost of each test result was on the order of $100 or more for the test reagents. To use this assay to make life-or-death decisions in liver transplant patients, it was very important that we obtain reliable results for decision making from a limited number of test specimens.
The question then arises as to the best statistical method to calculate the reference interval when limited sample numbers are available. The purpose of this presentation is to show the usefulness of robust statistical analysis for obtaining a good estimate of reference intervals with a small number of samples. In this presentation, we present the theoretical background for using this approach. We look at cases where 20 samples are available.


   current approaches
There are two traditional approaches to the derivation of reference intervals. The first is nonparametric and is based on the sample quantiles. For example, if the (central) 90% reference interval is required, then the 5th and 95th sample quantiles are used. A better approach is to use the distribution-free quantile estimator described by Harrell and Davis (1). This estimator is essentially a bootstrapped (2) version of the traditional sample quantile. The estimator of the $p$th quantile is as follows:

$$\hat{Q}_p = \sum_{i=1}^{n} w_{n,i}\, x_{(i)}$$

where

$$w_{n,i} = I_{i/n}\{p(n+1),\ (1-p)(n+1)\} - I_{(i-1)/n}\{p(n+1),\ (1-p)(n+1)\}$$

Here $I_x(\alpha, \beta)$ is the incomplete beta function and $x_{(1)} \le \dots \le x_{(n)}$ are the observed order statistics. The Harrell and Davis estimator has been recommended as the nonparametric method of choice for the derivation of reference intervals (3). Therefore, it is this version of the nonparametric method that will be examined in this study.
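The weighted-order-statistic form of the Harrell and Davis estimator can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' SAS code: the helper names (`betainc`, `_betacf`, `harrell_davis`) are hypothetical, and the regularized incomplete beta function is evaluated with a standard continued-fraction expansion, which is an implementation choice not described in the article.

```python
from math import exp, lgamma, log, log1p


def _betacf(a, b, x, max_iter=300, eps=1e-13):
    """Continued fraction used by the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd coefficients of the continued fraction for this m.
        coeffs = (m * (b - m) * x / ((qam + m2) * (a + m2)),
                  -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)))
        for aa in coeffs:
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = exp(lgamma(a + b) - lgamma(a) - lgamma(b)
                + a * log(x) + b * log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def harrell_davis(data, p):
    """Harrell-Davis estimate of the p-th quantile (0 < p < 1)."""
    xs = sorted(data)
    n = len(xs)
    a, b = p * (n + 1), (1.0 - p) * (n + 1)
    # Beta CDF at 0, 1/n, ..., 1; successive differences are the weights.
    cdf = [betainc(a, b, i / n) for i in range(n + 1)]
    return sum((cdf[i] - cdf[i - 1]) * xs[i - 1] for i in range(1, n + 1))
```

For a central 90% interval one would call `harrell_davis(data, 0.05)` and `harrell_davis(data, 0.95)`. Because every order statistic receives some weight, the estimate varies smoothly with the data, unlike a single sample quantile.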

The second approach to deriving reference intervals is based on transforming the data to achieve normality, computing the appropriate quantile estimators using normal theory, and back-transforming to the original scale. The transformation used in this study is described by Harris and Boyd (3). Briefly, an initial transformation removes skewness:

$$y = \begin{cases} (x^{\lambda} - 1)/\lambda, & \lambda \ne 0 \\ \log(x + c), & \lambda = 0 \end{cases}$$

This transformation was introduced by Box and Cox (4). Here, the maximum likelihood estimator of $\lambda$, $\hat{\lambda}$, is computed from the original x-data. If $|\hat{\lambda}| < 0.10$, then the $\log(x + c)$ transformation is used, and $\hat{c}$, the maximum likelihood estimator of $c$, is then computed.
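As a rough illustration of how $\hat{\lambda}$ might be obtained, the sketch below maximizes the usual normal-theory profile log-likelihood over a grid. It assumes the standard unshifted Box–Cox family, $y = (x^{\lambda} - 1)/\lambda$ with $\log x$ at $\lambda = 0$, and strictly positive data; the shift $c$ is not estimated here, and the function names and the grid are hypothetical choices, not taken from the article.

```python
from math import exp, log
from statistics import pvariance


def box_cox(x, lam):
    """Unshifted Box-Cox transform of a single positive value."""
    return log(x) if lam == 0 else (x ** lam - 1.0) / lam


def inverse_box_cox(y, lam):
    """Back-transform a value from the Box-Cox scale to the original scale."""
    return exp(y) if lam == 0 else (lam * y + 1.0) ** (1.0 / lam)


def profile_loglik(data, lam):
    """Normal-theory profile log-likelihood of lambda (constants dropped)."""
    n = len(data)
    y = [box_cox(x, lam) for x in data]
    jacobian = (lam - 1.0) * sum(log(x) for x in data)
    return -0.5 * n * log(pvariance(y)) + jacobian


def box_cox_mle(data, grid=None):
    """Grid-search approximation to the maximum likelihood estimate of lambda."""
    if grid is None:
        grid = [i / 100 for i in range(-200, 201)]  # -2.00, -1.99, ..., 2.00
    return max(grid, key=lambda lam: profile_loglik(data, lam))
```

A grid search is crude but transparent; a numerical optimizer could replace it. Reference limits computed on the transformed scale would be mapped back with `inverse_box_cox`.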

Once the initial transformation is fit, a second transformation is derived to remove any remaining kurtosis. The y values in the previous equation are standardized to have zero mean and unit variance. Then a constant, K, is determined so that:

has kurtosis = 0. The power of the transform that is actually used is $(K + 1)/2$ (5). The z-data are then tested for normality by using the Anderson–Darling statistic at significance level 0.15 (6). If the null hypothesis of normality is not rejected, then the traditional normal quantile estimates are used on the z-data, namely:

$$\bar{z} \pm z_{(1 - \alpha/2)}\, s_z \qquad (1)$$
where $\bar{z}$ and $s_z$ are the sample mean and SD of the z-data, and $z_{(1 - \alpha/2)}$ is the appropriate standard normal quantile. Therefore, for a 90% reference interval, $\alpha = 0.10$ and $z_{(0.95)} = 1.645$ are used. These two estimates are then back-transformed to the y-data scale and, finally, the original x-data scale. On the other hand, if normality is rejected, then the nonparametric (Harrell and Davis) reference interval is used.

We end this section by noting that the reference interval can be viewed as a prediction interval based on the random sample $X_1, \dots, X_n$ for the next observation, $X_{n+1}$. If the underlying population is normal, then the random variable:

$$\frac{X_{n+1} - \bar{X}}{s\sqrt{1 + 1/n}}$$

has a Student's t-distribution with $(n - 1)$ df. Thus, the appropriate $(1 - \alpha)$ 100% reference interval is equal to:

$$\bar{x} \pm t_{n-1}(1 - \alpha/2)\, s\sqrt{1 + 1/n} \qquad (2)$$
where $t_{n-1}(1 - \alpha/2)$ is the appropriate quantile from a Student's t-distribution with $(n - 1)$ df.

Clearly, for large samples the reference intervals defined in Eqs. 1 and 2 are approximately equal. The 90–95% reference intervals defined by Eq. 2 are ~8% wider than those defined by Eq. 1 for n = 20 and only ~1% wider for n = 100. However, we will use the reference interval defined by Eq. 1 on the transformed data, even for small samples, because it is more prevalent in the clinical chemistry literature. We did examine the reference interval based on Eq. 2, as well as the interval based on the uniform minimum variance unbiased estimators of the quantiles. Neither of these performed well enough to replace the interval defined by Eq. 1 for this study.
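As a quick check of these percentages for the 90% interval, one can compare the half-width of the t-based prediction interval with that of the normal-quantile interval. The quantiles $t_{19}(0.95) \approx 1.729$ and $t_{99}(0.95) \approx 1.660$ are standard table values, not figures taken from the article:

$$\frac{t_{19}(0.95)\sqrt{1 + 1/20}}{z_{(0.95)}} \approx \frac{1.729 \times 1.025}{1.645} \approx 1.08, \qquad \frac{t_{99}(0.95)\sqrt{1 + 1/100}}{z_{(0.95)}} \approx \frac{1.660 \times 1.005}{1.645} \approx 1.01$$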


   robust prediction intervals and quantile estimators
As noted in the previous section, the $(1 - \alpha)$ 100% prediction interval for the next observation, $X_{n+1}$, given an observed random sample $X_1 = x_1, \dots, X_n = x_n$, has the form given by Eq. 2. Horn (7) pointed out that Eq. 2 can be written as:

$$\bar{x} \pm t_{n-1}(1 - \alpha/2)\sqrt{s^2 + s^2/n} \qquad (3)$$
In this way, the two components of variation are $s^2$, the variance of the unknown observation $X_{n+1}$, and $s^2/n$, the variance of $\bar{X}$, the estimated center of the interval.

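Horn (7) proposed a prediction interval replacing the three estimates, $\bar{x}$, $s^2$, and $s^2/n$, by robust estimates of location and scale. Specifically, the $(1 - \alpha)$ 100% biweight prediction interval for symmetric populations is defined as follows:

$$T_{bi}(c_1) \pm t_{n-1}(1 - \alpha/2)\sqrt{s_{bi}^2(c_2) + S_T^2(c_1)} \qquad (4)$$
where $T_{bi}(c_1)$ is the biweight location estimator with tuning constant $c_1$, $S_T^2(c_1)$ is the biweight estimator of the variability of $T_{bi}(c_1)$, and $s_{bi}(c_2)$ is the biweight estimator of spread with tuning constant $c_2$ (8). Briefly, $T_{bi}$ is the solution to the equation:

$$\sum_{i=1}^{n} \psi(u_i) = 0 \qquad (5)$$
where

$$u_i = \frac{x_i - T_{bi}}{c \cdot s}, \qquad \psi(u) = \begin{cases} u(1 - u^2)^2, & |u| \le 1 \\ 0, & |u| > 1 \end{cases}$$
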
The term $\psi(u)$ may be rewritten as $\psi(u) = u \cdot w(u)$, where $w(\cdot)$ is a weight function. Making this substitution in Eq. 5 and solving yields $T_{bi} = \sum_i [x_i\, w(u_i)] / \sum_i w(u_i)$. Thus, $T_{bi}$ is a weighted mean with weights that decrease as $u_i$ goes from 0 to ±1; equivalently, as an observation $x_i$ goes from the center $T_{bi}$ to $T_{bi} \pm c \cdot s$, its weight decreases. If an observation is more than $c \cdot s$ from the center $T_{bi}$, it gets weight zero. For example, if $c = 6$ and the scale $s$ is the sample SD, then observations >6 SD from the center get weight zero.

Equation 5 defines a class of different estimators; each estimator is the solution based on a specific $\psi$ function. For example, if $\psi(u) = u$, all observations get equal weight, and the solution is $T = \bar{x}$. However, in the case of biweight $\psi(\cdot)$, the solution $T_{bi}$ is computed iteratively, starting with the sample median. The iteration is necessary because $T_{bi}$ is a weighted mean with weights that depend on (a previously computed) $T_{bi}$.
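The iteration described above can be sketched as follows. This is a simplified illustration rather than the authors' procedure: it assumes biweight weights $w(u) = (1 - u^2)^2$ for $|u| < 1$, and it holds the scale fixed at MAD/0.6745 instead of the biweight scale estimate used in the article. The function name and the convergence settings are hypothetical.

```python
from statistics import median


def biweight_location(data, c=3.7, tol=1e-8, max_iter=100):
    """Iteratively reweighted mean with biweight weights, started at the median."""
    med = median(data)
    # Simplifying assumption: fixed scale MAD/0.6745 throughout the iteration.
    scale = median(abs(x - med) for x in data) / 0.6745
    if scale == 0:
        return med  # more than half the data are identical; nothing to iterate
    t = med
    for _ in range(max_iter):
        u = [(x - t) / (c * scale) for x in data]
        # Weight falls smoothly from 1 at the center to 0 at |u| >= 1.
        w = [(1.0 - ui * ui) ** 2 if abs(ui) < 1.0 else 0.0 for ui in u]
        t_new = sum(wi * x for wi, x in zip(w, data)) / sum(w)
        if abs(t_new - t) < tol:
            return t_new
        t = t_new
    return t
```

With the sample `[4.0, 4.5, 5.0, 5.5, 6.0, 50.0]`, the value 50.0 lies beyond $c \cdot s$ from the center and receives zero weight, so the estimate settles near 5, whereas the ordinary mean is 12.5.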

A popular class of estimators of spread is based on variance estimates of robust estimators of location. For estimators based on Eq. 5, the asymptotic variance is simply $E(\psi^2)/[E(\psi')]^2$, where $E(\cdot)$ denotes mathematical expectation. The variance estimate, $S_{\psi}^2$, simply replaces mathematical expectation with empirical averaging. For the biweight $\psi$ function, we use $S_T^2(c)$ to denote this estimate of $\mathrm{Var}[T_{bi}(c)]$ based on the tuning constant, $c$. Because this variance estimate is essentially a standard error squared, it goes to 0 at the rate $1/n$. Thus, a reasonable estimate of spread is $s_{bi}(c) = \sqrt{n}$ times the square root of the variance estimate of $T_{bi}(c)$.

The actual value used for $s_{bi}$ in the iteration of $T_{bi}$ is slightly different from that given above. We follow the modified formula given by Kafadar (8). Specifically, $s_{bi}$ is computed by using the biweight function, $\psi(\cdot)$, but with the sample median used for location and MAD/0.6745 (where MAD is the median absolute deviation about the median) used as an estimate for scale. (The factor of 0.6745 is included so that MAD/0.6745 is consistent for $\sigma$ in the gaussian case.) The $s_{bi}$ used in Eq. 4 is computed in the same manner; the only difference is the value of the tuning constant. For details, see Horn (7) and Kafadar (8). Simple SAS code, which can be modified for most languages, is included in the Appendix to this report.

The tuning constant $c_1$ is set equal to 3.7, which means that, for the purposes of location estimation, observations are down-weighted (smoothly) the further they lie from the center (i.e., the current value of $T_{bi}$ in the iteration procedure). Any observations that are more than ~3.7 SD from the center get zero weight. The tuning constant $c_2$, on the other hand, is a function of the coverage of the prediction interval, $(1 - \alpha)$. Specifically, $c_2 = [0.58173 - 0.607227(1 - \alpha)]^{-1}$ for $0.05 \le \alpha \le 0.5$. Thus, for 90% reference intervals, $\alpha = 0.10$ and $c_2 = 28.4$, and for 95% reference intervals, $\alpha = 0.05$ and $c_2 = 205.4$ (7).
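The expression for $c_2$ is easy to evaluate directly. The small helper below is a hypothetical convenience function that transcribes the formula and enforces the stated range of $\alpha$:

```python
def c2_tuning_constant(alpha):
    """Tuning constant c2 for a (1 - alpha) prediction interval."""
    if not 0.05 <= alpha <= 0.5:
        raise ValueError("formula is stated only for 0.05 <= alpha <= 0.5")
    return 1.0 / (0.58173 - 0.607227 * (1.0 - alpha))
```

For $\alpha = 0.10$ this evaluates to approximately 28.4, the value quoted for 90% reference intervals. The denominator becomes very small as $\alpha$ approaches 0.05, so the result there is sensitive to rounding in the two coefficients.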

We intend to examine the performance of this robust prediction interval after the Box–Cox transformation to symmetry. Because it was designed to accommodate possibly heavy-tailed distributions, the power transform to remove any residual kurtosis is not required.

Another candidate for a robust reference interval uses the robust quantile estimator for skewed populations as its upper endpoint (9). This quantile estimator is based on the robust prediction interval described above. The idea is to examine only data points greater than the sample median. Then a symmetric pseudo-sample is created by including all data points greater than the sample median and their pseudo-values that are equidistant less than the median. For example, if n = 20, and the data are ordered $x_1 < \dots < x_{20}$, then the median is $M = (x_{10} + x_{11})/2$, and the symmetric pseudo-sample is:

$$2M - x_{20},\ 2M - x_{19},\ \dots,\ 2M - x_{11},\ x_{11},\ \dots,\ x_{19},\ x_{20}$$


From this sample, the appropriate symmetric prediction interval is computed as before, and the upper endpoint is used as the upper limit (quantile) on the reference interval. See Horn (9) for details.
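The reflection step that builds the symmetric pseudo-sample can be sketched as below. The function name is hypothetical; values equal to the median (as happens for the middle observation when n is odd) are excluded here, which is an assumption of this sketch rather than something the text specifies.

```python
from statistics import median


def symmetric_pseudo_sample(data):
    """Reflect the observations above the median about the median."""
    m = median(data)
    upper = sorted(x for x in data if x > m)
    # Each upper value x gets a mirror image 2M - x on the other side of M.
    reflected = [2 * m - x for x in reversed(upper)]
    return reflected + upper
```

For example, `symmetric_pseudo_sample([1, 2, 3, 10])` has median 2.5 and returns `[-5.0, 2.0, 3, 10]`. The symmetric prediction interval would then be computed from this pseudo-sample, keeping only its upper endpoint.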

The analogous lower robust quantile is not used, because in most cases the underlying populations are positively skewed, and thus the median will be greater than the mode. Reflected pseudo-samples in these cases, although symmetric, will be indicative of underlying bimodal populations (9). Thus, for the lower endpoint, we will use the nonparametric estimator (Harrell and Davis) because it also does not require transformation of the data.


   simulation and assessment
To evaluate the four reference interval procedures (nonparametric, transformed, robust with nonparametric lower limit, and transformed robust), a simulation study was run. Random samples of size 20, 40, 60, 80, 100, and 120 were generated from each of four $\chi^2$ distributions with df 1, 4, 7, and 10, respectively. The usual measure of performance is the root mean square error (RMSE) for each of the endpoints that constitute the reference interval. Specifically, the RMSE of the upper endpoints of a particular $(1 - \alpha)$ 100% reference interval, for example, is as follows:

$$\mathrm{RMSE} = \sqrt{E\left\{\left[\hat{F}^{-1}(1 - \alpha/2) - F^{-1}(1 - \alpha/2)\right]^2\right\}}$$

where $\hat{F}^{-1}(1 - \alpha/2)$ is the estimate of the true endpoint (quantile) $F^{-1}(1 - \alpha/2)$. This value is estimated via simulation by:

$$\widehat{\mathrm{RMSE}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\hat{F}_i^{-1}(1 - \alpha/2) - F^{-1}(1 - \alpha/2)\right]^2}$$

where $\hat{F}_i^{-1}(1 - \alpha/2)$ is the endpoint of the reference range derived from the ith random sample, and N is the number of random samples in the simulation; here N = 1000.
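A minimal sketch of this kind of simulation is shown below. It is not the authors' study: a plain order-statistic quantile stands in for the four procedures, the true quantile is passed in by the caller (for example, the tabulated 0.95 quantile of $\chi^2$ with 4 df is about 9.488), and the function names and the seed are hypothetical.

```python
import random
from math import sqrt


def rmse(estimates, truth):
    """Root mean square error of a list of estimates about a known value."""
    return sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


def chi2_sample(df, n, rng):
    """Draw n values from chi-square(df), i.e. Gamma(shape=df/2, scale=2)."""
    return [rng.gammavariate(df / 2.0, 2.0) for _ in range(n)]


def sample_quantile(data, p):
    """Stand-in estimator: the order statistic nearest rank p * (n + 1)."""
    xs = sorted(data)
    k = min(len(xs) - 1, max(0, round(p * (len(xs) + 1)) - 1))
    return xs[k]


def simulate_upper_rmse(df, n, p, true_quantile, n_sim=1000, seed=0):
    """Monte Carlo RMSE of the upper endpoint over n_sim random samples."""
    rng = random.Random(seed)
    estimates = [sample_quantile(chi2_sample(df, n, rng), p)
                 for _ in range(n_sim)]
    return rmse(estimates, true_quantile)
```

Calling `simulate_upper_rmse(4, 20, 0.95, 9.488)` and `simulate_upper_rmse(4, 120, 0.95, 9.488)` illustrates how the endpoint error shrinks as the sample size grows. Substituting another estimator for `sample_quantile` would let the same harness compare procedures.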

The RMSEs of the lower and upper endpoints of 90% and 95% reference intervals are given in Tables 1 and 2, respectively. For the upper endpoint of 90% reference intervals, the robust quantile estimator (untransformed) achieves the smallest RMSE, especially for smaller sample sizes and the more skewed populations (fewer df). However, it is essentially equal in performance to the transformed traditional procedure for n ≥ 40. For the lower endpoint, the RMSEs of the two transformed procedures (traditional and robust) are about equal and slightly better than the RMSE of the nonparametric, which is also used by the untransformed robust procedure. For 95% reference intervals, however, the robust procedure is clearly best, especially for the smaller sample sizes (n ≤ 40). For the larger sample sizes, the transformed procedures again are about equal, with the robust slightly better for the more skewed populations and the traditional (normal theory) procedure slightly better for the more symmetric populations (more df). For the lower endpoints, the transformed procedures again are about equal and only slightly (5–10%) better than the nonparametric procedure.

Traditionally, assessment of reference interval limits has focused on the RMSE of the interval endpoints as described above. This certainly makes sense if the interval endpoints are used as targets for treatment. For example, suppose the endpoint of the 95% reference interval for creatine kinase for middle-aged women is 192 U/L, as derived by a particular laboratory. Physicians who use this laboratory may evaluate their patients who have concentrations in excess of this value to determine the cause. In this case, clearly, the value of the endpoint of the reference interval itself is vital, and its accuracy (RMSE) is vital for assessment of a procedure.

On the other hand, the reference interval is designed to include (or exclude) a specified percentage of the underlying population. It could be argued that, in fact, it is this percentage that should be evaluated. Specifically, we will now consider the RMSE of the percentage as estimated by the lower and upper endpoints of the reference interval. Here, the RMSE of the upper probabilities, for example, is as follows:

$$\mathrm{RMSE} = \sqrt{E\left\{\left[F\left(\hat{F}^{-1}(1 - \alpha/2)\right) - (1 - \alpha/2)\right]^2\right\}}$$

which is estimated from the simulation by:

$$\widehat{\mathrm{RMSE}} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[F\left(\hat{F}_i^{-1}(1 - \alpha/2)\right) - (1 - \alpha/2)\right]^2}$$

where $F[\hat{F}_i^{-1}(1 - \alpha/2)]$ is the actual (unknown, in practice) proportion of the population less than the upper limit of the reference interval from the ith simulated random sample.

The RMSEs for the lower and upper probabilities of 90% and 95% reference interval limits are not presented here because all procedures achieved roughly the same RMSE, although that of the robust upper probability limit was slightly smaller for n = 20. One particularly interesting fact is that the transformed robust procedure, which appeared to perform poorly (especially for small samples) as an upper endpoint estimator, should do so well in terms of the RMSE of the probability. This phenomenon may be explained by the first few terms of the Taylor expansion of the mean square error (MSE) of the probability. Specifically, if we expand $\mathrm{MSE}\{F[\hat{F}^{-1}(p)]\}$ ($p = 1 - \alpha/2$ for brevity) about the true quantile $F^{-1}(p)$, we get the following:

$$\mathrm{MSE}\{F[\hat{F}^{-1}(p)]\} \approx f^2[F^{-1}(p)]\, E\{[\hat{F}^{-1}(p) - F^{-1}(p)]^2\} + f[F^{-1}(p)]\, f'[F^{-1}(p)]\, E\{[\hat{F}^{-1}(p) - F^{-1}(p)]^3\} + \cdots$$

where $f(\cdot) = F'(\cdot)$, the underlying population density.

If we examine only the first (nonzero) term of the Taylor expansion, we see that the MSE (and thus the RMSE) for the probability is proportional to that of its corresponding endpoint. However, the second term shows that the upper limit estimators, which are positively skewed with respect to the true quantile, will benefit in terms of MSE for the probability. This is because, in general, $f'[F^{-1}(p)] < 0$ for the upper limits. (These results are not surprising because the probability contained between an upper quantile and a one-unit shift to the right is less than that of a one-unit shift to the left.)

Although rewarding upper limits that tend to be skewed toward larger values may be the conservative thing to do statistically (e.g., if we state "p% < x", we want at least p%), it could be disastrous in the context of a medical reference interval, where large values of the analyte in question are indicative of a possibly adverse health condition. To equalize the MSE loss, we introduce a factor to be multiplied by the difference $\{F[\hat{F}_i^{-1}(1 - \alpha/2)] - (1 - \alpha/2)\}$ before squaring for those samples where $F[\hat{F}_i^{-1}(1 - \alpha/2)] > 1 - \alpha/2$. We will use as this factor the ratio of probabilities to the left and right of the upper quantile. Thus, for the upper limit, this factor on the true difference in probabilities will be $(1 - \alpha/2)/(\alpha/2)$, when the probability contained by the upper limit of the reference interval exceeds the nominal, target value. The same factor is used for lower limits when their true probabilities are less than the target value.
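The asymmetric weighting for the upper limit can be sketched as follows. The function is a hypothetical helper covering only the upper limit: it takes the achieved coverage probabilities $F[\hat{F}_i^{-1}(1 - \alpha/2)]$ from each simulated sample, which are known only inside a simulation, and inflates the overshoots by $(1 - \alpha/2)/(\alpha/2)$ before squaring.

```python
from math import sqrt


def weighted_rmse_upper(achieved, alpha):
    """Weighted RMSE of upper-limit probabilities about 1 - alpha/2."""
    target = 1.0 - alpha / 2.0
    factor = target / (alpha / 2.0)  # e.g. 0.95 / 0.05 = 19 for a 90% interval
    total = 0.0
    for prob in achieved:
        diff = prob - target
        if diff > 0:
            # The upper limit covers more than the nominal probability.
            diff *= factor
        total += diff * diff
    return sqrt(total / len(achieved))
```

With $\alpha = 0.10$, an upper limit that covers 0.96 of the population is penalized 19 times more heavily than one that covers 0.94, even though both miss the 0.95 target by 0.01.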

The weighted RMSEs of the probabilities for 90% reference intervals, where the above factors premultiply the differences before the squaring operation, are presented in Table 3. Essentially, all of the procedures are equivalent, with a slight edge going to the robust approach for the most skewed population and to the transformed traditional approach for the others. One fact to note is that the transformed robust is worst for n = 20, as it was for the interval endpoints. The results for 95% reference intervals are provided in Table 4. In this situation, however, the robust procedure for the upper limit is clearly superior in every case. Of particular interest is that the robust method does very well compared with the other methods for larger sample sizes. This indicates that the robust upper limit is a reasonable procedure for large as well as small samples.


Table 3. Weighted RMSE of 90% reference interval: upper and lower probabilities.


Table 4. Weighted RMSE of 95% reference interval: upper and lower probabilities


   examples
As a first example, we examine the haptoglobin data as given in Harris and Boyd (3). For these 100 values, the 95% reference intervals (i.e., the 2.5 and 97.5 percentiles) were computed. For each point estimator of a percentile, a 90% confidence interval is provided. The confidence interval for the transformed procedure made use of the formula, (percentile estimate) $\pm\, u_{(1+\beta)/2}\,[(2 + c_{1-\alpha}^2)\, s_x^2/2N]^{1/2}$, where $s_x$ is the sample SD of the transformed data, N is the sample size, $\alpha$ defines the quantiles of interest, and $\beta$ defines the confidence level of the interval for each of the point estimators (10). In our case, $\alpha = 0.025$ and $\beta = 0.90$.

The confidence intervals for the other methods were derived by using the bootstrap methodology (2). Here, 200 samples were drawn with replacement (i.e., resampled from the observed data), yielding 200 reference intervals for each methodology. From these values, the observed 5th and 95th quantiles were used as a 90% confidence interval.
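The percentile bootstrap described here can be sketched as below. The function name, the seed, and the index rounding used to pick the 5th and 95th percentiles of the bootstrap estimates are assumptions of this sketch; `estimator` is any function that maps a sample to a reference limit.

```python
import random


def bootstrap_ci(data, estimator, n_boot=200, level=0.90, seed=0):
    """Percentile bootstrap confidence interval for estimator(data)."""
    rng = random.Random(seed)
    n = len(data)
    # Resample with replacement and re-estimate the limit each time.
    stats = sorted(estimator([rng.choice(data) for _ in range(n)])
                   for _ in range(n_boot))
    lo_idx = round((1.0 - level) / 2.0 * (n_boot - 1))
    hi_idx = round((1.0 + level) / 2.0 * (n_boot - 1))
    return stats[lo_idx], stats[hi_idx]
```

For instance, `bootstrap_ci(values, lambda s: sorted(s)[-2])` would give a 90% interval for the second-largest value of a list `values`. In the setting above, `estimator` would be the upper or lower limit of whichever reference interval procedure is being assessed.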

The results for the haptoglobin data are given in the top of Table 5. All of the methods are reasonably consistent. The transformed methods have a lower quantile estimator about two units larger than that of the nonparametric. The confidence interval for the upper endpoint based on the bootstrapped robust method is ~1% tighter than that based on the transformation approach.


Table 5. 95% reference interval endpoints (with 90% confidence intervals).

As a second example, we compute similar statistics for blood glucose concentrations (mmol/L) in samples obtained in our laboratory from 46 men, ≥80 years of age. The data are as follows:

3.520 3.905 4.070 4.070 4.290 4.345 4.400 4.455 4.565

4.620 4.620 4.675 4.840 4.840 4.895 4.895 4.950 4.950

5.115 5.115 5.225 5.225 5.225 5.335 5.335 5.390 5.390

5.390 5.455 5.555 5.610 5.665 5.720 5.775 5.830 5.830

5.885 5.885 6.215 7.095 7.205 8.140 9.900 10.890 11.605

12.045

The results are given in the bottom of Table 5. We note that, in this case, no suitable transformation to normality was found; therefore, only the nonparametric and robust procedures appear. From Table 5, we see again that the upper quantile estimator provided by the robust procedure is tighter than that of the nonparametric. Note that the confidence intervals of both upper quantile estimators lie entirely in the range defined as diabetic (>7.7 mmol/L or 1.4 g/L) by the American Diabetic Association (11).


   outlying observations
Until now, we have assumed that all of the data come from a homogeneous population and that any large aberrant values are also part of that population. However, in practice, real data are subject to contamination from a variety of sources, such as human error or the presence of disease in an individual. Simulation results for the upper limit of 90% and 95% reference intervals in the presence of outliers are given in Table 6. In this case, the outliers comprise 5% of the sample, and they are derived by multiplying a valid observation by a factor of 5. Except for a few isolated cases, the robust methods are best with respect to RMSE for the upper interval endpoint; the transformed robust method is slightly better for n ≥ 60. All of the methods "broke down," however, in the sense that their RMSEs were at least an order of magnitude larger than those with uncontaminated data (Tables 1 and 2). Nevertheless, the robust methods were more resistant. (Note that, although the RMSEs generally decrease as the sample size increases, the decrease is not exactly monotone, as was the case without outliers. This is because the contamination of the samples introduced more noise to the simulation.)


Table 6. RMSE of upper 90% and 95% reference interval endpoints: 5% outliers (x5).


Table 1. RMSE of 90% reference interval endpoints.


Table 2. RMSE of 95% reference interval endpoints.

The use of outlier detection is not routinely recommended for reference interval analysis because large values from a skewed population may be mislabeled as outliers (11). However, for completeness, Table 7 presents results based on the same data as Table 6 but with a simple outlier detection method on the original data; any value >3.5 SD away from the mean is ignored. One thing is clear from Table 7: the drastic improvement of all the methods. Nevertheless, the robust method maintains its superiority in virtually every situation.
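The contamination scheme and the 3.5 SD screening rule can be sketched as follows. Both functions are hypothetical helpers: which observations are multiplied by 5 is chosen at random here, and the screening is applied in a single pass using the mean and SD of the contaminated sample, which are assumptions about details the text does not spell out.

```python
import random
from statistics import mean, stdev


def contaminate(data, fraction=0.05, factor=5.0, seed=0):
    """Multiply a randomly chosen fraction of the observations by factor."""
    rng = random.Random(seed)
    out = list(data)
    k = round(fraction * len(out))
    for i in rng.sample(range(len(out)), k):
        out[i] *= factor
    return out


def drop_outliers(data, k_sd=3.5):
    """Discard values more than k_sd sample SDs from the sample mean."""
    m, s = mean(data), stdev(data)
    return [x for x in data if abs(x - m) <= k_sd * s]
```

A limitation of this kind of rule is that the outliers inflate the very SD used to detect them, so with several contaminated values in a small sample some may escape screening.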


Table 7. RMSE of upper 90% and 95% reference interval endpoints: 5% outliers (x5)—with outlier detection.

Although not presented here, results for the weighted RMSE of the upper probabilities do not contradict the above results. Without outlier detection, all methods are essentially the same, with the robust method having a slight advantage for the most skewed population. With outlier detection, all methods improve, but the robust method becomes clearly the best in every situation.


   Discussion
The need to derive reference ranges from samples where the number of observed data values is small, for example, 20 ≤ n ≤ 60, clearly exists. We show by the simulation study presented here that the RMSE calculated by the robust quantile estimator was the smallest for upper endpoints calculated on small sample sizes. However, when evaluating the upper and lower probabilities for the 90% and 95% reference intervals, we showed that the losses in over- vs underestimating should not be treated symmetrically. To equalize the MSE loss, a weighting factor was introduced. In this case, the robust statistic was superior for estimating the upper limit of the 95% reference interval and about equal to the transformed traditional interval for estimating the 90% reference interval. When real serum haptoglobin data were examined in this fashion, the robust estimator of the 97.5 percentile limit was smaller than that of the nonparametric estimator and comparable with the estimator based on transformation. A second example, using glucose data from 46 elderly men, showed a similar result. Thus, it is reasonable to propose that robust estimators can provide relevant reference intervals when only small numbers of samples are available. Furthermore, if it is suspected that outliers may exist, then the robust method should do as well as, if not better than, other methods, whether or not outlier detection is used. However, because none of the procedures did particularly well when confronted with severe contamination, we cannot overstate the importance of ensuring the quality of the data and the data collection process when determining reference intervals.

In summation, we recommend that nonparametric, robust, and normal theory (on transformed data) reference intervals be computed in practice. If the methods are in agreement, then any one will do reasonably well for reporting purposes. However, if the methods disagree, then we believe that the tightest interval should be used. The reason we recommend this is that, given the choice between reasonable, though disparate, reference intervals, we would prefer to err on the side of more false positives, rather than false negatives, thus forcing the clinician to further evaluate the patient. Finally, if the sample size is so small that it precludes reasonable nonparametric confidence intervals for the limits, or if a suitable transformation to achieve normality is not possible, then the proposed robust method should be used, at least for the upper endpoint of the reference interval.


   References

  1. Harrell FE, Davis CE. A new distribution-free quantile estimator. Biometrika 1982;69:635-640.
  2. Efron B. The jackknife, the bootstrap, and other resampling plans. CBMS-NSF regional conference series in applied mathematics. Philadelphia: Society for Industrial and Applied Mathematics, 1982:29-36.
  3. Harris EK, Boyd JC. Statistical bases of reference values in laboratory medicine. New York: Marcel Dekker, 1995:1-61.
  4. Box GEP, Cox DR. An analysis of transformations. J R Stat Soc 1964;B26:211-252.
  5. Boyd JC, Lacher DA. A multi-stage gaussian transformation algorithm for clinical laboratory data. Clin Chem 1982;28:1735-1741.
  6. Linnet K. Two-stage transformation systems for normalization of reference distributions evaluated. Clin Chem 1987;33:381-386.
  7. Horn PS. A biweight prediction interval for random samples. J Am Stat Assoc 1988;83:249-256.
  8. Kafadar K. A biweight approach to the one-sample problem. J Am Stat Assoc 1982;77:416-424.
  9. Horn PS. Robust quantile estimators for skewed populations. Biometrika 1990;77:631-636.
  10. International Federation of Clinical Chemistry. Approved recommendation (1987) on the theory of reference values. Part 5. Statistical treatment of collected reference values. Determination of reference limits. J Clin Chem Clin Biochem 1987;25:645-656.
  11. American diabetes data group classifications and diagnosis of diabetes and other categories of glucose intolerance. Diabetes 1979;28:1039-1057.


