Clinical Chemistry 50: 907-914, 2004. First published March 11, 2004; 10.1373/clinchem.2003.023770
(Clinical Chemistry. 2004;50:907-914.)
© 2004 American Association for Clinical Chemistry, Inc.


Laboratory Management

Centile Charts II: Alternative Nonparametric Approach for Establishing Time-Specific Reference Centiles and Assessment of the Sample Size Required

Jenny K. Griffiths1, Terence C. Iles2,a, Martin Koduah2 and Arthur B.J. Nix1

1 Department of Epidemiology, Statistics and Public Health, University of Wales College of Medicine, Heath Park, Cardiff, United Kingdom.
2 School of Mathematics, Cardiff University, Senghennydd Road, PO Box 926, Cardiff CF24 4YH, United Kingdom.

aAuthor for correspondence. Fax 44-29-2087-4199; e-mail iles{at}cardiff.ac.uk.


   Abstract
Background: Reference intervals, and more generally centile estimates, are used to characterize a reference population for the purposes of interpreting an individual patient’s clinical measurement. We describe methods of calculating reference intervals where these centiles vary with a covariate, usually age or time.

Methods: The US Food and Drug Administration and the IFCC have made recommendations on two approaches: the parametric approach, which models the structural characteristics of the data set with a theoretical distribution, and the nonparametric approach, which makes no particular assumption about this structure. In this report we propose a nonparametric procedure that relies on the principles of regression and show how sample size determination can be assessed. We also show how the sample size calculation is influenced by the distribution of the times measured.

Results: We illustrated our method on three data sets and compared the results for our proposed nonparametric method with parametric estimates. We showed that the bias is reduced and that the nonparametric method is less likely to produce fluctuating profiles.

Conclusions: To achieve adequate precision the sample size needs to be larger than 120, as has often been recommended. If there is doubt about the parametric model, then threshold sample sizes may need to be as high as 500.


   Introduction
Reference intervals, and more generally centile charts, are commonly used to characterize the reference population, i.e., those people who are free of any disease and representative of the population. Problems associated with the selection of individuals to characterize a reference population have been well documented by many authors (1). It is not our purpose to consider this issue any further, but we will assume that such a population has been identified in an appropriate way.

The US Food and Drug Administration and IFCC (2)(3) have defined the reference interval as the range of values that lie between the 2.5th and 97.5th percentiles of the reference population. Often these centiles vary with a covariate, usually age or time, and an individual must then be classified at the age or time point relevant to that individual. Where this is done, results are referred to a range of reference centiles as time progresses, and it is not uncommon to see some or all percentiles (e.g., 3, 5, 10, 25, 50, 75, 90, 95, and 97) identified for reference purposes. Because we focus primarily on methodology, we will restrict our investigations to identification of the 2.5th and 97.5th percentiles, unless otherwise stated.

Currently there are two generic approaches for establishing time-specific reference intervals. The parametric approach models the time-specific distributional properties of the response variable. Usually the mean and SD of the response are modeled, either directly or after a suitable transformation of the data has been undertaken. Nonparametric methods, however, make few or no assumptions about the underlying distributional form of the data. The most influential reports on these approaches have been published by authors such as Royston and Wright (4)(5)(6)(7), Cole and Green (8), Rossiter (9), Healy et al. (10), and Tango (11). The approach of Royston and Wright (4)(5)(6)(7) is a parametric regression-based method and has attracted much interest; Rossiter’s (9) and Tango’s (11) approaches are fully nonparametric methods, whereas the approaches of Cole and Green (8) and Healy et al. (10) are hybrid methods that assume some parametric structure but add nonparametric smoothing. In this report we highlight some deficiencies associated with current nonparametric methods and propose an alternative approach that produces centile charts with more desirable characteristics, even when compared with parametric approaches. Our method is very similar in structure to the parametric approach of Royston and Wright (4)(5)(6)(7), and we compare our new method with this approach, using one of the data sets investigated.

Diagnostic tests have been proposed that assess the goodness of fit for a particular data set [see, for example, Royston and Wright (12)], but these do not enable a comparison of methods. For example, it is not possible from a particular data set to assess the effect of sampling error, i.e., the error that derives from the natural variation between different samples. However, appropriately modeled simulation studies can give a broad assessment of the effectiveness of the statistical methods used and hence give an indication of the threshold sample size required. We perform such simulations and conclude that rather large sample sizes are required to produce reliable estimates of the reference interval when either a parametric or a nonparametric procedure is used.

The impact of the sampling plan associated with the time variable is also investigated. We conclude that a nonuniform distribution of times can give estimated reference intervals that are less effective in terms both of bias and precision for extreme times.


   Materials and Methods
motivating examples
In recent years concern has increased with regard to high cholesterol concentrations because high cholesterol is a known risk factor for coronary heart disease. Cholesterol concentrations are known to be gender and age dependent. For the purposes of this report we were able to obtain a data set of 883 cholesterol readings from females who participated in a cross-sectional study in Wales (UK), the study known as Heart Beat Wales. These data were thought to be a representative sample that could be used to establish a centile chart. Because we are only using this data set to illustrate our method, we assume that all of the caveats raised earlier are satisfied. The data are shown in Fig. 1A, together with certain centile plots that will be explained later.



Figure 1. Results for female cholesterol.

(A), female cholesterol plotted against age. Plotted are the 2.5th and 97.5th percentiles using Royston’s method (solid lines) and our proposed nonparametric method (dashed lines). (B), scatter plot of the absolute values of residuals from the quadratic regression for female cholesterol on age. The corresponding nonparametric Lowess fit (solid line) is superimposed for comparison.

Our second data set refers to a survey on newborns that was undertaken at the Princess Mary Maternity Hospital, Newcastle-upon-Tyne, during 1986 by Scott et al. (13), with the aim of discovering whether early postnatal scanning might show any evidence of maldevelopment of the urinary tract. The data can be seen in Fig. 2, together with certain centile plots. Essentially, after the data were cleaned, there were 543 recordings of kidney length and birthweight. The exercise was intended to indicate whether a kidney was of an abnormal size compared with the baby’s birthweight. More information on the study can be found in Scott et al. (13) and Rossiter (9).



Figure 2. Logarithm of the maximum kidney size plotted against birthweight.

The lines indicate the 3rd, 10th, 25th, 75th, 90th, and 97th percentiles obtained with our proposed nonparametric method.

Our third data set (Fig. 3A) shows the relationship between the logarithm of maternal serum α-fetoprotein (AFP) concentrations and gestational age. AFP is known to vary with fetal gestational age as well as with the health status of the fetus; therefore, the measurement aids in the screening process for certain chromosomal anomalies.



Figure 3. Results for AFP.

(A), scatter plot of log(AFP) against gestational age. The least-squares regression line (solid line) is superimposed. (B), scatter plot of the absolute values of residuals from the regression of log(AFP) on gestational age. The corresponding nonparametric Lowess fit (solid line) is superimposed for comparison.

nonparametric method
It is convenient and informative in our development of a nonparametric approach for establishing time-specific reference intervals to highlight first the main steps of the parametric approach proposed by Royston (4). Assuming that the data had been "cleaned", Royston proposed the following method, later refined by Altman (14). In this report we refer to this method as the Royston–Altman approach because these were the authors of the original reports. However, Royston and Altman (15) subsequently developed the idea of fractional polynomials, and these were applied to reference centiles by Royston and Wright (16). We have not investigated these refinements in this report. The Royston–Altman approach is as follows:

(i) The regression function, which is the estimated mean response as a function of time, is identified by use of a modified backward stepwise selection procedure. This procedure fits polynomial models up to a cubic. Paired comparisons of cubic against quadratic, quadratic against linear, and linear against a constant are made in that order. If the cubic is not better than the quadratic, then the quadratic is compared with the linear, and so forth, the comparison criterion being based on an appropriate P value. At the end of the procedure, a simple polynomial model describing the mean response is determined.

(ii) The absolute residuals are obtained from the fitted model in (i), and the regression function for these absolute residuals identified. This was suggested by Altman (14) as a means of identifying the time dependence of the SD. He suggested regression models for this regression that include terms up to the cubic.

(iii) The standardized residuals are the residuals from (i) divided by the corresponding SD determined in (ii). These are combined and tested to determine whether they can be assumed to come from a gaussian distribution.

(iv) If a gaussian distribution is not achieved, the response data y are transformed using the transformation ln(y + c) for some c, and the above steps repeated.

After the process converges, the standardized residuals are assumed to be gaussian, and the percentiles are calculated in the usual way (4). It should be noted that a further refinement would be to use a weighted regression in (i) with the SD profile from (ii). This would need another iterative procedure over and above that attributable to the transformation, although very few iterations would be required for most SD profiles.
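The core of steps (i) and (ii) can be sketched in a few lines of Python. This is a simplified illustration, not the authors' implementation: the function name `fit_parametric_limits` and its parameters are hypothetical, the polynomial degrees are fixed by the caller instead of being chosen by backward stepwise selection, and the normality test and ln(y + c) transformation of steps (iii) and (iv) are omitted. The factor √(π/2) used to turn mean absolute residuals into an SD is the standard gaussian identity.

```python
import numpy as np

def fit_parametric_limits(t, y, mean_degree=2, sd_degree=1, z=1.96):
    """Return a function mapping new times to (lower, upper) reference limits.

    Hypothetical helper: degrees are supplied by the caller, whereas the
    method in the text selects them stepwise by P value.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)

    # Step (i): polynomial regression for the mean response.
    mean_coef = np.polyfit(t, y, mean_degree)
    resid = y - np.polyval(mean_coef, t)

    # Step (ii): regress the absolute residuals on time, then scale by
    # sqrt(pi/2) so the fitted curve estimates the SD under gaussian errors.
    sd_coef = np.polyfit(t, np.abs(resid), sd_degree) * np.sqrt(np.pi / 2)

    def limits(t_new):
        t_new = np.asarray(t_new, dtype=float)
        mu = np.polyval(mean_coef, t_new)
        sd = np.polyval(sd_coef, t_new)
        # Gaussian 2.5th and 97.5th percentiles: mean -/+ 1.96 SD.
        return mu - z * sd, mu + z * sd

    return limits
```

Calling `fit_parametric_limits(t, y)(t_grid)` then returns the two limit curves on a grid of times; the weighted-regression refinement mentioned above would re-fit step (i) with weights proportional to 1/SD².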

The key features of the above approach are to identify the mean response and scale parameter (SD) by regression and finally to identify the appropriate percentiles from the combined set of standardized residuals. To justify combining the standardized residuals requires the assumption that the probability densities describing the data patterns at different times differ only by scale and shift. These key features are sensible if it is reasonable to assume that there is no measurement error in the "time" variable, and as such should be embodied in a nonparametric approach.

proposed nonparametric method
To parallel the parametric method of Royston (4) we propose the following nonparametric approach to determine a reference interval:

(i) A locally linear nonparametric regression is used, with smoothing function hbase (described later), to determine the time-specific mean response.

(ii) From the fitted mean response in (i) the absolute residuals are determined, and another locally linear regression is performed with smoothing function hSD.

(iii) Using the scale parameter obtained in (ii), the standardized residuals are obtained from the model fitted in (i) by division, in the same way as Royston’s approach (4).

(iv) From the combined set of standardized residuals the appropriate centiles are estimated directly from the order statistics.

(v) Having identified the centiles, the process is reversed to establish the centile plots in the original scale of measurement.

Local linear regression is a nonparametric technique that uses a "moving window" approach in which, for each window, a weighted least-squares line is fitted, and the predicted value at the center of the window is determined. The smoothing function effectively determines the width of this window. The profile of the predicted values with time defines the nonparametric fit. The various steps that are needed to implement this method are explained in full in Bowman and Azzalini (17). The above procedure provides a method that has two smoothing functions, hbase and hSD. These are often taken to be of constant value, that is, independent of time. The quality of the fitted regression models depends on their departure from the true model. The integrated, or totaled, squared difference between the true model and the fitted model is referred to as the integrated square error. For a good quality of fit the average, or mean, of the integrated square error should be small. Mathematically one could define criteria, such as minimizing the mean integrated square error (17), that might determine optimum values for these functions. However, such a "black box" approach desensitizes the procedure and does not allow clinical input, a point returned to later. In addition, if the pattern of data points is nonuniform in the time domain and/or there is a time dependency in the SD of the response, then adaptive techniques may need to be considered. Here the values of the smoothing functions hbase and hSD are explicitly time dependent. Note that a kernel density estimate should not be used to identify the centiles in step (iv) because it will produce biased estimates. For further details see the Technical Appendix (available as a Data Supplement accompanying the online version of this article at http://www.clinchem.org/content/vol50/issue5/).
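Steps (i) to (v) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, a gaussian kernel is assumed for the window weights, both bandwidths are constants, and `np.quantile` (linear interpolation between order statistics) stands in for the order-statistic estimate of step (iv).

```python
import numpy as np

def local_linear(t, y, t_eval, h):
    """Locally linear fit with an (assumed) gaussian kernel of bandwidth h."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    t_eval = np.atleast_1d(np.asarray(t_eval, dtype=float))
    fitted = np.empty(t_eval.shape)
    for i, t0 in enumerate(t_eval):
        d = t - t0
        w = np.exp(-0.5 * (d / h) ** 2)            # window weights
        s0, s1, s2 = w.sum(), (w * d).sum(), (w * d * d).sum()
        # Intercept of the weighted least-squares line centred at t0,
        # i.e. the predicted value at the centre of the window.
        fitted[i] = ((s2 * (w * y).sum() - s1 * (w * d * y).sum())
                     / (s0 * s2 - s1 ** 2))
    return fitted

def nonparametric_centiles(t, y, t_eval, h_base, h_sd, probs=(0.025, 0.975)):
    """Centile curves at t_eval; one column per requested probability."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    mean_at_data = local_linear(t, y, t, h_base)            # step (i)
    abs_resid = np.abs(y - mean_at_data)
    scale_at_data = local_linear(t, abs_resid, t, h_sd)     # step (ii)
    std_resid = (y - mean_at_data) / scale_at_data          # step (iii)
    q = np.quantile(std_resid, probs)                       # step (iv)
    mean_eval = local_linear(t, y, t_eval, h_base)          # step (v)
    scale_eval = local_linear(t, abs_resid, t_eval, h_sd)
    return mean_eval[:, None] + scale_eval[:, None] * q[None, :]
```

Because the centiles are read from the pooled standardized residuals, no gaussian assumption and no √(π/2) rescaling of the absolute residuals is needed: any constant factor in the scale curve cancels when step (v) reverses step (iii). The sketch does not guard against a fitted scale that reaches zero or becomes negative where data are sparse.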

simulation study and method assessment
As mentioned earlier, it is generally the case that the true response model for a given application is unknown, hence the plethora of methods to establish reference intervals. We propose the following simulation strategy to compare the properties of estimated parametric and nonparametric reference intervals from any set of clinical data. First, a parametric model that adequately produces a scatter graph with similar time dependence of the response variable in terms of location and scale should be determined. This is a task that is not difficult to accomplish because many biological marker values vary only slowly with time. Simulation techniques are then used to compare the statistical methods under investigation. We illustrate this procedure with the female cholesterol plotted against age (Fig. 1A) and the log(maternal serum AFP) plotted against gestational age (Fig. 3A). The literature [see, for example, Cuckle et al. (18)] indicates that log(AFP) varies linearly with gestational age and has a constant SD over time. For female cholesterol there is clearly a shallow curve to the mean response with an indication that the SD about this line increases with age.

Plots of the absolute residuals against time are shown in Figs. 1B and 3B and confirm a near-constant SD for the log(AFP) data and an SD for the female cholesterol data that increases approximately linearly with time. It should be noted that Figs. 1B and 3B are of raw residuals. For their use in the parametric method investigated in this report, they should be converted to SD units by multiplication by √(π/2), as described by Altman (14).
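The conversion of absolute residuals to SD units rests on a standard property of the gaussian distribution, written out here for reference. The symbols ε (a residual) and σ (its SD) are notation introduced for this derivation only.

```latex
% For a gaussian residual \varepsilon \sim N(0, \sigma^2):
\mathrm{E}\,|\varepsilon|
  = 2\int_{0}^{\infty} x\,\frac{1}{\sigma\sqrt{2\pi}}\,
    e^{-x^{2}/(2\sigma^{2})}\,dx
  = \sigma\sqrt{\frac{2}{\pi}}
\quad\Longrightarrow\quad
\sigma = \sqrt{\frac{\pi}{2}}\;\mathrm{E}\,|\varepsilon|
  \approx 1.2533\;\mathrm{E}\,|\varepsilon| .
```

A regression curve fitted to the absolute residuals therefore estimates σ√(2/π) at each time, and multiplying it by about 1.25 gives the SD profile, provided the residuals are close to gaussian.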

In this report we discuss the results of four simulated models based on these two studies. The parametric forms of the models identified by an initial parametric analysis of the data are shown in Table 1 as models 1 (for AFP) and 2 (for cholesterol). Models 3 and 4 are based on the data for model 1 but consider different sampling plans and error distributions, as indicated in Table 1. These modifications are discussed further in the Results.


Table 1. Mean response and SD profiles for simulation models 1–4 described in the text.

The quality of fit is assessed in this report by a percentile inclusion probability (PIP) introduced by Koduah et al. (19). This entails determining the reference limits for each simulated data set. Using these estimated limits, we determine the distribution of the reference limits at each of 50 equally spaced time values spanning the range of the data. We next compare the results with the true percentiles from the theoretical model. We then evaluate the proportion of times the estimated percentiles fall in the range (0.95–0.99) of the true percentiles for the upper reference limit and the range (0.01–0.05) for the lower reference limit. Koduah et al. (19) suggested that these PIPs should be at least 0.95 for the method to be classified as adequate. We have used this criterion for assessment rather than the ratio suggested by Linnet (20) because the PIP relates more directly to the use that is made of a reference interval. Moreover, Linnet’s ratio criterion has been shown by Koduah et al. (19) to be deficient with respect to biased estimators. PIP values can also be used to give an indication of threshold sample size. By plotting these probabilities against sample size, threshold sample sizes can be determined beyond which acceptable PIPs are achieved.
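The PIP calculation for an upper reference limit can be sketched with a small simulation. Everything numerical here is a placeholder: the linear mean, the constant SD of 0.125, and the time range are illustrative values, not the Table 1 models, and the estimator is a plain straight-line fit with a pooled SD, standing in for either method studied in this report. The function name `pip_upper` is hypothetical.

```python
import numpy as np

def pip_upper(n, n_sim=200, seed=0):
    """PIP of an estimated 97.5th percentile at 50 equally spaced times.

    Hypothetical true model: y = a + b*t + sigma*gaussian noise, with times
    drawn uniformly on [0, 50]. All parameter values are illustrative.
    """
    rng = np.random.default_rng(seed)
    a, b, sigma = 1.0, 0.02, 0.125
    t_grid = np.linspace(0.0, 50.0, 50)
    true_p95 = a + b * t_grid + 1.6449 * sigma   # true 95th percentile
    true_p99 = a + b * t_grid + 2.3263 * sigma   # true 99th percentile
    hits = np.zeros(t_grid.size)
    for _ in range(n_sim):
        t = rng.uniform(0.0, 50.0, n)
        y = a + b * t + sigma * rng.standard_normal(n)
        coef = np.polyfit(t, y, 1)
        sd_hat = (y - np.polyval(coef, t)).std(ddof=2)
        upper = np.polyval(coef, t_grid) + 1.96 * sd_hat
        # Count a hit when the estimate lies between the true 95th and
        # 99th percentiles at that time point.
        hits += (upper > true_p95) & (upper < true_p99)
    return hits / n_sim
```

Plotting `pip_upper(n)` against the time grid for several values of `n` gives profiles of the kind used to read off a threshold sample size; the lower limit is handled the same way with the (0.01, 0.05) window.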


   Results
female cholesterol data
In her analysis of these data, Griffiths (21) used the Royston–Altman method described earlier. This provides the opportunity for a comparison with our proposed nonparametric method. Her derived model was as follows:

for the regression of mean response with a constant SD of 0.1659 in this domain.

Minimizing the mean integrated square error produced constant optimal smoothing function values of ~6 years for both hbase and hSD. We feel that these optimum smoothing values should be considered only a guide. Their values should be modified if by doing so features emerge that have clinical relevance. For this particular data set we had no local clinical guidance; we therefore used the value of 6 years for both hbase and hSD. The resulting estimated 2.5th and 97.5th centiles are shown for both approaches in Fig. 1A. As can be seen, the centile plots are very similar, particularly so for the 2.5th centile. There is more of a difference in the 97.5th centile plot, with the nonparametric approach indicating a slight shoulder effect in the late teens/early twenties and again in the late forties/early fifties. There is some clinical evidence that might support the observed patterns. Hall et al. (22) report an association between increased cholesterol and hormonal changes, particularly for those women taking a contraceptive pill and for those receiving hormone replacement therapy. It seems reasonable to conclude, therefore, that the shape of the nonparametric 97.5th centile curve may relate to an underlying structure present in the data rather than to an undersmoothed centile plot.

A final observation is that Griffiths’ (21) implementation of Royston’s parametric approach (4) had 22 values below the 2.5th centile and 29 above the 97.5th centile. With 883 observations one would expect 22 in each region; there are thus more than expected above the 97.5th centile. By construction our method returns the correct proportion for each region. It is true, however, that there is no inherent control over the temporal pattern associated with these points. If the distribution over time of such points is problematic, then further adjustment of the value of hSD will help, again emphasizing the need to have flexibility in the final choice for hbase and hSD.
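The expected count quoted above is simple arithmetic, shown here with a rough binomial yardstick for how far an observed count lies from it. The helper name is hypothetical, and the binomial SD is an assumption made only for illustration: it treats the points as independent of the fitted centile curve, which is not strictly true when the curve is estimated from the same data.

```python
import math

def expected_tail_count(n, p):
    """Expected number of points beyond a centile with tail probability p,
    together with the binomial SD of that count (illustrative yardstick)."""
    return n * p, math.sqrt(n * p * (1.0 - p))

# Female cholesterol data: 883 observations, 2.5% expected in each tail.
exp_chol, sd_chol = expected_tail_count(883, 0.025)
# Distance of the 29 points observed above the 97.5th centile, in SD units.
excess_in_sd = (29 - exp_chol) / sd_chol
```

Here `exp_chol` is 22.075, which rounds to the 22 quoted in the text.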

kidney data
These data provide the opportunity to compare our proposed nonparametric method with the nonparametric approach proposed by Rossiter (9). Rossiter’s method fits a bivariate kernel density with a logistic kernel. This enables the conditional distribution to be identified for each time point, and the appropriate time-specific percentiles can then be determined. Rossiter’s estimated centile curves, together with the raw data, are shown in her Fig. 1. It is clear from the scatter plot in our Fig. 2 that the data are not uniformly distributed over time and that an adaptive technique should therefore be considered. The smoothing function here has the value of 600 g throughout the central range of the data, increasing to 1000 g at the extremes. The centile plots are approximately linear and parallel throughout the main body of the data, but they display slightly more curvature in the extremes. Notwithstanding this curvature in the extremes, they exhibit much less tendency to fluctuate over the range of the data than do Rossiter’s centile plots.

Close comparison of our Fig. 2 with Rossiter’s Fig. 1 (9) also indicates that her 3rd and 97th centile plots lie outside of our 3rd and 97th plots, supporting the point made by Bowman and Azzalini (23) that bias in the estimation of extreme percentiles is to be expected when nonparametric kernel density methods are used. For this data set there are 543 values; one would therefore expect 16 values below the 3rd percentile and 16 above the 97th percentile. By construction this is achieved by our method, but for Rossiter’s (9) approach the values are 8 and 10, respectively, indicating that her estimated centiles are too extreme.

afp data
To illustrate the determination of sample size and assessment of the method of analysis, we consider the AFP data. Here only the upper 97.5th reference limit is discussed because the results and conclusions for the lower 2.5th reference limit are virtually identical. The PIP was assessed on 1000 simulations. Plots of the PIPs for models 1 and 2, using both statistical methods, are shown in Figs. 4 and 5 for the three sample sizes of 100, 300, and 500 (bottom to top, respectively). As expected, the PIPs increase with sample size for both methods. The Royston–Altman method (dashed lines in Figs. 4 and 5) clearly outperforms the nonparametric approach, although a good PIP value is obtained for both methods for sample sizes of 500. Clearly, for either method sample sizes <300 are inadequate. One general feature of each plot is the decrease in the PIP value toward the extremes of the reference interval. This is to be expected because prediction at the extremes of the reference range is less precise, although it does raise the question of the range of values of the time variable over which the reference range should be used.



Figure 4. PIP plotted against gestational age for the log(AFP) data using model 1.

Solid lines represent the nonparametric method; dashed lines represent the Royston–Altman parametric method. Sample sizes used are 100, 300, and 500.



Figure 5. PIP plotted against age (years) for the female cholesterol data using model 2.

Solid lines represent the nonparametric method; dashed lines represent the Royston–Altman parametric method. Sample sizes used are (bottom to top) 100, 300, and 500.

For model 2, other features emerge. In Fig. 5 it is clear that the PIP plot has two peaks for the Royston–Altman method. The reason for this pattern is that on occasion a linear model is selected, and this usually intersects with the true quadratic model at two points, which are offset from the center but within the range of the data. At these points of intersection the bias is zero and hence the PIP is larger. It is also evident in these plots that the Royston–Altman method has poorer PIP properties than when their approach is applied to the linear mean response models. The picture for the nonparametric approach is similar, although as a consequence of its model-independent nature, the characteristics of the PIP profiles are similar whichever model is used. With the more complex model it is clear that the PIP is only in excess of 0.95, the nominal value, for the central 80% of the reference interval. Again this raises serious questions about the time span over which such reference intervals should be used even with sample sizes of 500.

To this point we have only considered the PIP properties of reference intervals when the data have been gathered in such a way as to produce a uniform distribution of results over the age covariate. This pattern of data collection is typical for planned studies. However, reference intervals are also calculated from data that are routinely collected, such as data obtained from a screening program, where the distribution of data values reflects that of the population. The AFP case considered here is an illustration. It can be seen that most women presented for screening in weeks 15 and 16 of gestation rather than at later gestational ages.

We investigated the impact of this truncation on the construction of reference ranges for such data patterns by simulating model 1 again but with a truncated gaussian distribution of gestational ages. The mean was chosen to be 111 days, the observed mode, with the SD set at 5.5 days. This is referred to as model 3 in Table 1. This assumed distribution of ages reproduces the data pattern reasonably well. As expected there is a substantial decrease in performance for later gestational ages, reflecting the scarcity of data in this region. For a sample size of 500, the PIP profiles for the parametric approach are similar in shape to those in Fig. 5 but with a decrease from 0.97 to 0.85 at a gestational age of 125 days. The nonparametric profile is also similar in shape but with a value on the extreme left of 0.75, a plateau of 0.9, and a value on the extreme right of 0.42.
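A nonuniform sampling plan of this kind can be generated by rejection sampling, as in the sketch below. The mean of 111 days and SD of 5.5 days come from the text; the truncation bounds of 105 and 140 days are hypothetical, chosen only to make the example run, because the bounds used for model 3 are not stated here. The function name is likewise hypothetical.

```python
import numpy as np

def sample_truncated_gaussian(n, mean, sd, low, high, seed=0):
    """Draw n values from a gaussian restricted to [low, high] by rejection."""
    rng = np.random.default_rng(seed)
    kept = np.empty(0)
    while kept.size < n:
        draw = rng.normal(mean, sd, size=2 * n)
        # Keep only draws inside the truncation interval.
        kept = np.concatenate([kept, draw[(draw >= low) & (draw <= high)]])
    return kept[:n]

# Gestational ages in days; the bounds 105 and 140 are assumed, not sourced.
ages = sample_truncated_gaussian(500, mean=111.0, sd=5.5, low=105.0, high=140.0)
```

With an SD of 5.5 days, very few of the 500 simulated ages fall beyond about 125 days, which is the scarcity that drives the drop in PIP at later gestational ages.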

It is clear that the adequacy of an estimated reference interval depends on both the sample size used and the distribution of the data with respect to the time covariate. The simulation models we have considered all lie within the family of models used by the Royston–Altman method. As a result it is to be expected that this parametric approach would have better PIP properties than the nonparametric equivalent. For PIP values to be >95% across most of the desired reference interval, sample sizes in excess of 500 are required, and the observed times should be uniformly distributed across the range. For sample sizes of 500, the nonparametric approach is only marginally less efficient.

To complete the simulation study, we consider model 1 again, but with a gaussian mixture distribution modeling the response variability. This modification to model 1 is referred to as model 4 in Table 1. The two component gaussian distributions are N(–0.0419, 0.1006²) and N(0.1676, 0.0503²) with mixing rates of 0.8 and 0.2, respectively. These parameters have been chosen to give a mean of zero and an SD of 0.125, the same parameter values as for model 1. We have not attempted to make the mixture distribution extreme but merely to introduce a distribution that is only moderately nongaussian. Scatter plots of 500 points from models 1 and 4 were inspected and found to be visually similar. The nonparametric PIP profile was found to be similar to that for 300 points shown in Fig. 4. However, the parametric PIP profile was noticeably different; the PIP value was reduced to ~0.4 across the entire age range. Further investigation revealed that this was attributable to a negative bias in the sampling distribution, the mean of which was below the true 95th percentile. Thus, >50% of the estimates of the 97.5th percentile were below the true 95th percentile.
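The stated mean and SD of the mixture can be checked directly from the component parameters. The sketch reads the second parameter of each component as a squared SD (0.1006² and 0.0503²), which is the reading under which the quoted SD of 0.125 is recovered; the variable names are illustrative.

```python
import math

weights = (0.8, 0.2)          # mixing rates
means = (-0.0419, 0.1676)     # component means
sds = (0.1006, 0.0503)        # component SDs (assumed reading of the text)

# Mean of a mixture: weighted average of the component means.
mix_mean = sum(w * m for w, m in zip(weights, means))

# Variance of a mixture: weighted average of E[X^2] minus the squared mean.
mix_var = sum(w * (s ** 2 + m ** 2)
              for w, m, s in zip(weights, means, sds)) - mix_mean ** 2
mix_sd = math.sqrt(mix_var)
```

To four decimal places `mix_mean` is 0 and `mix_sd` is 0.1250, so the first two moments of model 4 match those of model 1 and only the shape of the error distribution differs.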

We believe this example, which is not an extreme one in comparison with those seen in clinical studies, shows how sensitive the parametric approach can be to model assumptions. As a consequence, if it is possible to obtain ≥500 data values, uniformly distributed over time, there is much to commend the nonparametric approach.


   Discussion
In this report we have proposed a nonparametric method to identify time-specific reference centiles that follows more closely than other published nonparametric approaches the methodology that underpins parametric methods, namely regression-based methods as described by Royston and coworkers (4)(5)(6)(7)(12)(15)(16). We have applied our approach to three data sets, and our overall conclusions are as follows.

For the female cholesterol data the parametric approach of Royston (4) and Altman (14) gives reasonable centile curves, although the method seems not to respond to minor local structural variations for the 97.5th centile. There is some evidence that hormonal changes in the late teens/early twenties and again in the late forties could give rise to increased cholesterol values. Our model seems to have responded to these potential changes, and overall there does not appear to be a tendency to produce fluctuating profiles.

For the kidney data, the pattern of centile curves produced by our method seems not to display the fluctuating character obtained when the method proposed by Rossiter (9) is used. The nature of our proposed method ensures that the correct number of observations lies between appropriate centile curves, whereas the method proposed by Rossiter has a tendency to make the fitted centile curves too extreme; hence there are too few points below the 3rd percentile and above the 97th percentile.

The AFP data set, which is much less demanding statistically, is modeled reasonably well by both parametric and nonparametric methods. In making these comparisons we are not suggesting that our method is uniformly better for all applications, but when there are sufficient data our proposed method highlights the structure present without exhibiting undue fluctuations. In addition, our method suffers less from the potential to produce biased centile estimates.

In our companion paper (19), we defined a PIP criterion for assessing the effectiveness of any statistical method for establishing reference intervals in the univariate setting. This has been extended to time-specific reference intervals in this report. We have explored the impact that the sampling pattern associated with the time variable has on PIP values as well as the effect of a nongaussian error law to describe response variation. Our conclusion is that the parametric method is more efficient when the model underpinning the data generation is included within the family of models used in the fitting process. However, the parametric method is less capable when the model used for generating the data falls outside of the family assumed by the method. Unacceptable PIP values are seen in the extremes of the reference interval, a consequence of increased prediction error for such regions. In some of the cases this reduced the effective time range over which the reference interval could be reliably used by up to 20%. For both the parametric and nonparametric approaches it is necessary to have data sets of at least 500 values if the PIP is to be adequate over most of the time range, even when the correct statistical models are used. These conclusions have to be changed significantly when nonuniformly distributed time values are considered because the PIP values at the extremes of the reference interval are noticeably lower than is the case for uniformly distributed time values. This suggests the need for a well-planned study if an acceptable reference interval is to be established across the whole time range.

Investigation of the impact of a perturbation of the gaussian nature of the variability in response produces surprising results. The PIP for the parametric approach is very low, ~40%. With the PIP defined on the interval (0.95, 0.99) for the upper 97.5th percentile, it is clear that, on average, 60% of estimators fell below the true 95th percentile. An upper limit set below the true 95th percentile flags more than 5% of reference individuals rather than the nominal 2.5%, at least doubling the referral rate for the upper reference limit. The nonparametric approach gives considerably higher PIP values. We conclude, therefore, that the parametric method should be used only if there is a high degree of certainty about the underlying distributional properties governing data collection. If there is any doubt, a nonparametric method is advised, with a sample size of 500 giving a PIP value of at least 0.95.
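The behavior described above can be illustrated with a small univariate simulation. This is a hedged sketch, not the simulation design of this report: it assumes that the PIP for the upper limit is the proportion of estimated 97.5th centiles lying between the true 95th and 99th centiles (the interval (0.95, 0.99) quoted above), it uses an exponential distribution as an arbitrary example of nongaussian data, and the function names gaussian_limit, empirical_limit, and pip are hypothetical.

```python
import numpy as np


def gaussian_limit(samples):
    """Parametric upper limit per replicate, assuming gaussian data:
    sample mean + 1.96 sample SD."""
    return samples.mean(axis=1) + 1.96 * samples.std(axis=1, ddof=1)


def empirical_limit(samples):
    """Nonparametric upper limit per replicate: the empirical 97.5th centile."""
    return np.quantile(samples, 0.975, axis=1)


def pip(limits, true_cdf, band=(0.95, 0.99)):
    """Proportion of estimated limits whose true cumulative probability
    falls strictly inside the band (assumed PIP definition)."""
    coverage = true_cdf(np.asarray(limits, dtype=float))
    return float(np.mean((coverage > band[0]) & (coverage < band[1])))


def exponential_cdf(x):
    """True distribution function of the unit exponential."""
    return 1.0 - np.exp(-x)


# 1000 replicate studies of 500 values each, drawn from a unit exponential
# distribution (an illustrative nongaussian choice, not the perturbation
# examined in this report).
rng = np.random.default_rng(1)
samples = rng.exponential(1.0, size=(1000, 500))

pip_parametric = pip(gaussian_limit(samples), exponential_cdf)
pip_nonparametric = pip(empirical_limit(samples), exponential_cdf)
print(f"PIP, gaussian-based limit: {pip_parametric:.2f}")
print(f"PIP, empirical centile:    {pip_nonparametric:.2f}")
```

Replacing the exponential sampler and distribution function with gaussian ones gives a way to compare the two estimators when the parametric assumption holds.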


   Acknowledgments
 
We thank Prof. John Matthews of the University of Newcastle-upon-Tyne for making available the data on kidney sizes and the Biochemistry Department at the University Hospital of Wales for providing the AFP data set.


   References
 

  1. Harris EK, Boyd JC. Statistical bases of reference values in laboratory medicine. New York: Marcel Dekker, 1995:16-17.
  2. Solberg HE. International Federation of Clinical Chemistry (IFCC) approved recommendation (1987) on the theory of reference values. Determination of reference limits. J Clin Chem Clin Biochem 1987;25:645-656.
  3. Solberg HE. Establishment and use of reference values. In: Burtis CA, Ashwood ER, eds. Tietz fundamentals of clinical chemistry. Philadelphia: WB Saunders, 2001:251-261.
  4. Royston P. Constructing time-specific reference ranges. Stat Med 1991;10:675-690.
  5. Wright EM, Royston P. Calculating reference intervals for laboratory measurements. Stat Methods Med Res 1999;8:93-112.
  6. Wright EM, Royston P. Simplified estimation of age-specific reference intervals for skewed data. Stat Med 1997;16:2785-2803.
  7. Royston P, Wright EM. How to construct ‘normal ranges’ for fetal variables. Ultrasound Obstet Gynecol 1998;11:30-38.
  8. Cole TJ, Green PJ. Smoothing reference centile curves: the LMS method and penalized likelihood. Stat Med 1992;11:1305-1319.
  9. Rossiter JE. Calculating centile curves using kernel density estimation methods with application to infant kidney lengths. Stat Med 1991;10:1693-1701.
  10. Healy MJR, Rasbash J, Yang M. Distribution-free estimation of age-related centiles. Ann Hum Biol 1988;15:17-22.
  11. Tango T. Estimation of age-specific reference ranges via smoother AVAS. Stat Med 1998;17:1231-1243.
  12. Royston P, Wright EM. Goodness-of-fit statistics for age-specific reference intervals. Stat Med 2000;19:2943-2962.
  13. Scott JES, Hunter EW, Lee REJ, Matthews JNS. Ultrasound measurement of renal size in the newborn. Arch Dis Child 1990;65:361-364.
  14. Altman DG. Construction of age-related reference centiles using absolute residuals. Stat Med 1993;12:917-924.
  15. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling. Appl Stat 1994;43:429-467.
  16. Royston P, Wright EM. A method for estimating age-specific reference intervals (‘normal ranges’) based on fractional polynomials and exponential transformation. J R Stat Soc A 1998;161:79-101.
  17. Bowman AW, Azzalini A. Applied smoothing techniques for data analysis. New York: Oxford Science Publications, 1997:48-52.
  18. Cuckle AS, Wald NJ, Thompson SS. Estimating a woman’s risk of having a pregnancy associated with Down’s syndrome using her age and serum α-fetoprotein level. Br J Obstet Gynaecol 1987;94:387-402.
  19. Koduah M, Iles TC, Nix ABJ. Centile charts I: new method of assessment for univariate reference intervals. Clin Chem 2004;50:910-915.
  20. Linnet K. Two stage transformation systems for normalization of reference distributions evaluated. Clin Chem 1987;33:381-386.
  21. Griffiths JK. Some statistical aspects of constructing time-specific reference ranges [MPhil Thesis]. Cardiff, United Kingdom: University of Wales College of Medicine, 1999.
  22. Hall R, Anderson J, Smart GA, Besser M. Fundamentals of clinical endocrinology. New York: Pitman, 1980:324.
  23. Bowman AW, Azzalini A. Applied smoothing techniques for data analysis. New York: Oxford Science Publications, 1997:25-27.



