Endocrinology and Metabolism |
1 Garvan Institute of Medical Research and Department of Endocrinology, Sydney, Australia; 2 Australian Sports Drug Testing Laboratory, National Measurement Institute, Sydney, Australia; 3 ANZAC Research Institute, Concord Hospital, University of Sydney, Sydney, Australia; 4 Kolling Institute of Medical Research, University of Sydney, Royal North Shore Hospital, Sydney, Australia.
a Address correspondence to this author at: Pituitary Research Unit, Garvan Institute of Medical Research, 384 Victoria St, Darlinghurst 2010 Australia. Fax 0612 9295 8481; e-mail K.Ho{at}garvan.org.au.
| Abstract |
|---|
Methods: We measured IGF-I, IGF binding protein 3 (IGFBP-3), acid labile subunit (ALS), and the collagen markers N-terminal propeptide of type I procollagen (PINP), C-terminal telopeptide of type I collagen (ICTP), and N-terminal propeptide of type III procollagen (PIIINP) in serum samples obtained on multiple occasions (median 3 per participant) over a 2- to 3-week period from 1103 elite athletes (699 men, 404 women) aged 22.2 (5.2) years [mean (SD)]. We estimated between-subject and within-subject variances by mixed-effects ANOVA.
Results: Within-subject variance accounted for 32% to 36% and 4% to 13% of the total variance in IGF markers and collagen markers, respectively. The within-subject CV ranged from 11% to 21% for the IGF axis markers and from 13% to 15% for the collagen markers. The index of individuality for the IGF axis markers was 0.66–0.76, and for the collagen markers, 0.26–0.45. For each marker, individuals with initial extreme measured values tended to regress toward the population mean in subsequent repeated measurements. We developed a Bayesian model to estimate the long-term probable value for each marker.
Conclusions: These results indicate that in healthy individuals the within-subject variability was greater for IGF-I than for the collagen markers, and that where a single measurement is available, it is possible to estimate the long-term probable value of each of the markers by applying the Bayesian approach. Such an application can increase the reliability and decrease the cost of detecting GH doping.
| Introduction |
|---|
The utility of a marker, both clinically and for a GH doping test, depends on its sensitivity and ability to discriminate a normal value from an abnormal value. The sensitivity and discriminatory values are, in turn, a function of the stability of these markers over time, which can be assessed by within-subject variability and analytic imprecision. Conceptually, the variation of a biochemical marker can be partitioned into 2 major sources, between-subject and within-subject variation, and the latter further partitioned into normal biological variation and analytical (random) imprecision of the method of measurement.
Despite the importance of IGF axis markers in clinical and scientific practice, the sources and magnitude of these markers' variability have not been well documented. One recent study in normal subjects reported considerable short-term within-subject differences in IGF-I and indicated the need for caution in the use of risk ratios based on single measurements in the clinical setting (17). Most studies to date on short- to long-term within-individual variability of bone turnover markers were based on older women (18)(19)(20)(21). No data are available, however, on the stability and reproducibility of the IGF axis and collagen markers in young, healthy athletes; such data are required for the application of these markers to a GH doping test.
In this study, we sought to determine the short-term within-subject variability of the markers IGF-I, IGFBP-3, ALS, PINP, ICTP, and PIIINP in a large cohort of >1000 elite athletes. The within-subject variability was partitioned into biologic and analytic variability and was compared to the between-subject variability to assess the implications of this variability in the clinical setting and the use of these markers to detect GH doping in sport.
| Materials and Methods |
|---|
measurements and laboratory analysis
Blood samples were collected from volunteers on a casual basis, that is, at random with regard to the time of day, food intake, exercise, and competition, as described (22), representing the out-of-competition setting. The day and time of collection and the time since the athlete had last exercised or competed were recorded. We collected 3 venous samples on average from each athlete over a 2- to 3-week period. The median number of samples per subject was 3 (range 1–10).
We stored serum samples at –80 °C before analysis. The stability of these analytes after storage at –80 °C has been indicated by previous studies (24)(25). We measured IGF-I by RIA after acid-ethanol extraction (26) and IGFBP-3 and ALS using polyclonal antibodies (27)(28). The intraassay CVs were IGF-I 5.9%, IGFBP-3 4.6%, and ALS 5.4%. The markers ICTP (intra- and interassay CVs <10%), PINP (intra- and interassay CVs <9% and <12%, respectively), and PIIINP (intra- and interassay CVs <7% and <12%, respectively) were measured in serum by use of competitive RIAs (Orion Diagnostica) in the same assay batch.
data analysis
The study design—collection of multiple measurements per subject over a relatively short time—allows comparison of the magnitude of variation that exists between individuals relative to the magnitude of variation within each individual. Accordingly, the total variance of each marker was partitioned into 2 broad components, between-subject and within-subject variance, with the latter being further partitioned into 2 subcomponents, 1 due to biological variation and 1 due to analytic variation. The within-subject variability was subsequently transformed into the original unit of measurement to yield the SD and then expressed as the CV relative to the mean. Estimates of variance components were obtained using a compound symmetric covariance structure in the mixed-effects ANOVA, using the nlme package within the R language (29). Full details of the analysis are shown in the Data Supplement (see Data Analysis) that accompanies the online version of this article at http://www.clinchem.org/content/vol54/issue8.
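As an illustration of the variance partition described above, the following minimal sketch estimates between- and within-subject variance from repeated log-transformed measurements. It uses the classical method-of-moments (one-way random-effects ANOVA) estimator, not the REML fit from the nlme package used in the study, so results may differ slightly from a mixed-model fit. The function names and the example values are hypothetical, and the log-to-CV conversion is the standard lognormal identity, assumed here rather than taken from the Data Supplement.

```python
import math
from statistics import mean


def variance_components(groups):
    """Method-of-moments estimates for a one-way random-effects layout.

    groups: one list of (log-transformed) measurements per subject.
    Returns (between_subject_variance, within_subject_variance).
    Requires at least 2 subjects and at least 1 subject with 2+ samples.
    """
    groups = [g for g in groups if len(g) > 0]
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    subject_means = [mean(g) for g in groups]
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, subject_means) for x in g)
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, subject_means))

    ms_within = ss_within / (n_total - k)
    ms_between = ss_between / (k - 1)

    # Effective number of samples per subject when the design is unbalanced.
    n0 = (n_total - sum(len(g) ** 2 for g in groups) / n_total) / (k - 1)

    between_var = max((ms_between - ms_within) / n0, 0.0)  # truncate at zero
    return between_var, ms_within


def cv_from_log_variance(log_var):
    """CV on the original scale implied by a variance on the natural-log scale."""
    return math.sqrt(math.exp(log_var) - 1.0)


# Hypothetical example: three subjects with 3, 2, and 3 samples (log scale).
example = [[4.9, 5.1, 5.0], [5.4, 5.6], [4.5, 4.7, 4.6]]
between_var, within_var = variance_components(example)
within_cv = cv_from_log_variance(within_var)
```

Splitting the within-subject variance further into biological and analytic parts would additionally need replicate assay data or a known assay CV, which this sketch does not model.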
In addition, we computed the coefficient of reliability (R), which is defined as the ratio of the between-subject variance to the total variance (the sum of between- and within-subject variances) (30), for each marker. This coefficient can be viewed as the degree to which individuals' marker values remain relatively consistent over repeated measurements, or as a measure of the correlation between probable values and measured values. We also computed the index of individuality as described in the online Data Supplement (see Data Analysis). The presence of within-subject variability raises the issue of regression to the mean (RTM) (31) and the probable value for an individual. Therefore, in a further analysis, we assessed the RTM effect by comparing the change between consecutive measurements, and we estimated the probable value as described (32) using a Bayesian model, further detailed in the online Data Supplement (see Data Analysis).
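The two summary indices can be sketched in a few lines. The coefficient of reliability follows the definition given above. The index of individuality is defined in the Data Supplement, which is not reproduced here, so the sketch assumes the commonly used form, the ratio of within-subject SD to between-subject SD; the function names are illustrative.

```python
import math


def coefficient_of_reliability(between_var, within_var):
    """R = between-subject variance / total variance."""
    return between_var / (between_var + within_var)


def index_of_individuality(between_var, within_var):
    """Assumed definition: within-subject SD divided by between-subject SD."""
    return math.sqrt(within_var / between_var)


# Variance shares of 64% (between) and 36% (within), as for an IGF axis marker.
r = coefficient_of_reliability(0.64, 0.36)
ii = index_of_individuality(0.64, 0.36)
```

Under this assumed definition, a 64%/36% split gives R = 0.64 and an index of 0.75, which falls inside the ranges reported for the IGF axis markers.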
| Results |
|---|
In all markers, the within-subject SDs were positively correlated with their means (r = 0.35–0.57), which violated the assumption of homogeneity of variances. Therefore, subsequent analyses were performed on natural logarithmic–transformed values. All log-transformed IGF axis and collagen markers were normally distributed, with no appreciable difference in the between-subject variances across age groups.
variance components
Between-subject variance formed a major component of the variance of the markers, accounting for 64% to 68% of the total variance of IGF axis markers and 87% to 96% of total variance of collagen markers (Table 2). Thus, within-subject variance accounted for between 4% and 36% of the total variance of the markers and was higher for the IGF axis markers (32%–36%) than for the collagen markers (4%–13%). Most of the within-subject variance was attributable to biological variability (80%–95% of the within-subject variance).
The short-term within-subject SD was relatively greater for IGF-I (CV 20.7%) than for IGFBP-3 and ALS (CV 11.3% and 11.6%, respectively) and ranged from 13% to 15% for the 3 bone turnover markers (Table 3). The CV due to analytic variability, which ranged from 3% to 6.1%, was smaller than the CV for biological variability (10%–20%). The coefficient of reliability was higher for the collagen markers (R 0.87–0.96) than for IGF axis markers (R 0.64–0.68) (Table 4), indicating that the within-subject variability was smaller relative to between-subject variability for the collagen markers. The index of individuality for all IGF axis markers was >0.60, whereas the index of individuality for all collagen markers was <0.6. None of the markers had an index of individuality >1.4 (Table 4).
regression to the mean effects
To assess the effect of RTM, we conducted an analysis of changes for subjects who had exactly 3 measurements. In this analysis, for each marker we categorized the first measurement into 3 groups (bottom 10%, middle 80%, and top 10% of the distribution); then, in each category, we calculated the mean of the absolute change in each marker's values between the 3 measurements (Table 5). As expected from the regression-to-the-mean effect, individuals with an initial measurement in the bottom 10% for the marker subsequently increased on average; on the other hand, those in the top 10% subsequently decreased. For example, among individuals with initial IGF-I <107 µg/L (n = 76), there was a mean increase of 10.5 (SD 26.9) µg/L between measurements 2 and 1, which then regressed toward the mean with a mean change of –0.8 µg/L between measurements 3 and 2. On the other hand, among individuals with initial IGF-I >230 µg/L (n = 74), a decrease of 65.3 (SD 62.6) µg/L between measurements 2 and 1 was observed, and this decrease was less in the subsequent measurements.
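The RTM effect in such a tail analysis can be reproduced with simulated data in which no subject's underlying value changes at all. The sketch below is purely illustrative: the SDs, the number of subjects, and the seed are arbitrary assumptions and are not the study data. Each simulated subject has a fixed true value plus independent within-subject noise at each measurement.

```python
import random
from statistics import mean


def simulate_rtm(n_subjects=2000, between_sd=1.0, within_sd=0.75, seed=0):
    """Mean change (measurement 2 minus 1) in the bottom and top 10% of measurement 1."""
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_subjects):
        true_value = rng.gauss(0.0, between_sd)  # stable subject-level value
        first.append(true_value + rng.gauss(0.0, within_sd))
        second.append(true_value + rng.gauss(0.0, within_sd))

    # Rank subjects by their first measurement and take the two 10% tails.
    order = sorted(range(n_subjects), key=lambda i: first[i])
    tail = n_subjects // 10
    bottom, top = order[:tail], order[-tail:]

    def mean_change(indices):
        return mean(second[i] - first[i] for i in indices)

    return mean_change(bottom), mean_change(top)


bottom_change, top_change = simulate_rtm()
```

With these settings the bottom group rises and the top group falls on average, even though the simulated true values are constant, which is the pattern expected from within-subject variability alone.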
In an additional analysis, changes between measurement 2 and measurement 1 (Δ2–1) were correlated with changes between measurement 3 and measurement 2 (Δ3–2). In each subject, Δ2–1 was classified as decreased if the percent decrease was more than 1 within-subject SD, increased if the percent increase was more than 1 within-subject SD, or unchanged if the change was within 1 SD (Supplemental Table 6). The common trend emerging from this analysis was that many subjects whose marker values decreased between the first 2 measurements (negative Δ2–1) actually increased in subsequent measurements (positive Δ3–2), and vice versa. For example, for IGF-I, 22% (168/751) of subjects experienced decreased Δ2–1, and among these subjects, 18% (31/168) subsequently experienced increased IGF-I values (Δ3–2). On the other hand, 13% (96/751) of subjects experienced increased Δ2–1 IGF-I, and of these 55% (53/96) subsequently decreased between measurements 3 and 2 [Δ3–2 –12.4% (21%)].
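The classification rule for consecutive changes can be written down directly. The following is a sketch only: it assumes that the threshold is the within-subject SD expressed as a percentage (for example, the within-subject CV) and that the same rule is applied to both changes. The function names and the example values are hypothetical.

```python
from collections import Counter


def classify_change(pct_change, within_sd_pct):
    """Label a percent change relative to 1 within-subject SD (in percent)."""
    if pct_change < -within_sd_pct:
        return "decreased"
    if pct_change > within_sd_pct:
        return "increased"
    return "unchanged"


def cross_tabulate(changes, within_sd_pct):
    """Count subjects by (class of change 2-1, class of change 3-2).

    changes: iterable of (pct_change_2_1, pct_change_3_2), one pair per subject.
    """
    return Counter(
        (classify_change(d21, within_sd_pct), classify_change(d32, within_sd_pct))
        for d21, d32 in changes
    )


# Hypothetical percent changes for four subjects, with a 20.7% threshold.
table = cross_tabulate([(-30.0, 25.0), (5.0, -3.0), (40.0, -28.0), (-22.0, 1.0)], 20.7)
```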
estimates of probable values
Using the Bayesian model as described in the online Data Supplement, we estimated probable values of each marker (Fig. 1). As expected from the model, individuals with observed marker values higher than the population mean had lower probable values than those actually measured, whereas individuals with measured values lower than the population mean had higher probable values. For example, an individual with measured IGF-I of 60 µg/L (about 2SD less than population mean) can be expected to have a probable value of 84 µg/L, whereas for an individual with measured IGF-I 250 µg/L (about 2SD greater than population mean), the expected probable value is 200 µg/L. Individuals with measured values closer to the population mean had estimated probable values closer to the measured values. As expected, for markers with higher within-subject variability (such as IGF-I), the observed and estimated probable measurements were further apart than for those markers with lower within-subject variability (such as PINP). The application of the concept of probable value in the detection of abnormal values is further illustrated in the online Data Supplement.
| Discussion |
|---|
Although markers of the IGF system have been used extensively in the clinical setting, few studies have examined their reliability. In a study where IGF-I measurements were repeated after 2 weeks in 84 normal subjects, substantial within-subject variation between measurements was observed, resulting in changes from one quartile to another for 40% of the subjects, although a formal estimate of variability was not provided (17). In the current study, the within-subject variability for IGF-I was also high (within-subject CV 21%), whereas the within-subject CVs of IGFBP-3 and ALS were lower (11% and 12%, respectively). The underlying reasons for the relatively high within-subject variation in IGF-I in this study are unknown, but it is interesting that most of the within-subject variation was due to normal biological variation, and the imprecision of measurement accounted for only 6%. The estimates of the short-term within-subject variability in serum bone turnover markers in this study (within-subject CV 13%–15%) are similar to previous estimates in older women and compare favorably to the greater variability observed in other markers such as the C- and N-terminal telopeptides of type I collagen, CTx and NTx (20)(21)(33).
Increased serum concentrations of IGF axis and collagen markers in response to GH administered to young recreational athletes (12)(13)(14)(15) have indicated the potential for these markers in detecting GH abuse in sport. Use of these markers in a GH-doping test requires estimation of between-subject variability in elite athletes to define demographically relevant reference ranges, and we have previously demonstrated the influence of demographic factors and sport type on between-subject variation in the same cohort of elite athletes (23). There has been no previous formal evaluation of within-subject variability of the GH-responsive markers, although others have suggested that a single measurement of serum concentrations of IGF-I and IGFBP-3 may not be sufficiently reliable for an antidoping test (34). Concentrations of the markers IGF-I, ICTP, and PIIINP in 47 athletes measured over a 6-month period, entailing periods of training, competition, recovery, and rest, have been reported to be fairly stable; however, within-subject variability was not formally assessed (35).
Although the index of individuality has classically been interpreted in the context of assessing the utility of the population-based reference interval in relation to the interpretation of an individual's measurements (36), we consider that it can also be used in the reverse sense to judge how closely a single measurement is likely to estimate the mean (true or set point) value of an analyte in the individual. Thus, an index of individuality <0.6 indicates that the population-based reference range is of limited value for the interpretation of measurements in an individual, because test results that are abnormal for the individual can be undetected by the population-based reference range. In such a situation, however, a single measurement will be a much better estimate of the true value for the individual than the mean for the population at large. As the index of individuality increases >0.6, the probability that an abnormal result in an individual will fall outside the population-based reference range progressively increases until at index values above 1.4 it increases to >95%. Obviously, for an analyte with high index of individuality (e.g., where the variability of measurements within an individual becomes just as large as or larger than the variability between individuals), the utility of a single measurement in an individual is more limited, and in this case the Bayesian approach presented here is useful in estimating the individual's true value because it combines the mean and variance of the individual and the population from which the individual was drawn.
The phenomenon of regression to the mean was observed in this study, in which individuals with extreme measured values at the first measurement on average tended to regress toward the mean in subsequent measurements, such that lower/higher measured values at measurement 1 tend to be higher/lower, respectively, at subsequent measurements in the absence of any biological effect. This phenomenon was largely attributable to the within-subject variability of each marker, which has important implications in clinical practice, because measurements of IGF axis and bone turnover markers can potentially be used to make a diagnosis of GH deficiency or detect GH doping in sport.
The RTM phenomenon has practical importance for the application of a GH diagnostic test. First, a single measurement may result in false-positive or false-negative classification for an individual. Second, the substantial within-subject variability observed in the IGF axis markers suggests that any diagnosis or antidoping test based on these markers should be based on multiple measurements rather than a single measurement. In practice, however, multiple measurements are not always possible, and the Bayesian approach described in this report can be a useful means to approximate the probable value of a marker for an individual.
The present results should be interpreted within the context of a number of strengths and weaknesses. The study was based on a large sample size with multiple measurements, which allows more accurate and reliable delineation of between- and within-subject variability than studies with small sample sizes. The study participants were all healthy, without any diseases that are likely to affect bone metabolism, and were drawn from all ethnicities. Because there were no significant differences in the estimates of within-subject variability between sexes, age groups, ethnicities, or sport types, the present results could be applied to all young, healthy adults, provided that they are subject to the same method of analysis. Nevertheless, the current study did not attempt to control for other variables that are known to affect IGF axis and collagen markers such as exercise and training (37)(38), food intake (39), and diurnal variation (3)(40). These factors may have contributed to greater within-subject variability in our study. In the clinical setting, controlling for such factors and thus reducing the variability, for example by measuring samples collected in the fasting state or at the same time of day, may be feasible, but in the doping situation would be impractical. Our estimates of within-subject variability using samples collected at random with regard to variables such as time of day and exercise are more likely to reflect the variability in the doping assessment context. Furthermore, in this study, participants were required to declare that they had not taken GH; however, there was no independent verification of the truthfulness of their statement. Assuming that a small minority of subjects may have taken growth hormone, we would expect their markers to be increased, but this would have a negligible effect on the reliability or intrasubject variability of these markers.
This study addresses the challenge of detecting abnormal values with certainty, posed by the presence of within-subject variability in the IGF axis and collagen markers. Even if an individual's marker value is slightly higher than a threshold of abnormality, it may not be possible to ascertain whether the individual's value is abnormal because of within-subject variability. However, the Bayesian approach presented in this report can be used to estimate the probability of abnormality for a given measured value. In essence, the Bayesian approach addresses the following question: "Given a measured marker value for an individual, what is the individual's probable value?" The RTM effect predicts that an individual's extreme value will regress toward the population average; therefore, the probable value can be estimated from the weighted average of the population mean and the individual's value, with the weight being the coefficient of reliability. However, this probable value is also subject to sampling variation within the individual. Assuming that the distribution of this further variation is normal, it is possible to estimate the probability that a measured value is above a certain threshold of abnormality. The present study provides all essential parameters required for the estimation.
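The weighted-average estimate described above can be sketched as follows. The full model is given in the Data Supplement, which is not reproduced here, so this sketch makes two labeled assumptions: all values are on the natural-log scale used in the analysis, and the uncertainty around the probable value is taken from the standard normal-normal conjugate result (posterior SD = within-subject SD × √R). The function names and the numerical inputs are hypothetical and are not the study's estimates.

```python
import math
from statistics import NormalDist


def probable_value(measured, population_mean, reliability):
    """Shrink a single measurement toward the population mean.

    The weight on the measured value is the coefficient of reliability R.
    """
    return reliability * measured + (1.0 - reliability) * population_mean


def probability_above(measured, population_mean, reliability, within_sd, threshold):
    """Probability that the individual's underlying value exceeds a threshold.

    Assumes a normal distribution centered on the probable value with
    SD = within_sd * sqrt(reliability) (normal-normal posterior; an assumption).
    """
    center = probable_value(measured, population_mean, reliability)
    posterior_sd = within_sd * math.sqrt(reliability)
    return 1.0 - NormalDist(mu=center, sigma=posterior_sd).cdf(threshold)


# Hypothetical inputs on the log scale (µg/L before transformation).
log_measured = math.log(250.0)
log_population_mean = math.log(140.0)
estimate = math.exp(probable_value(log_measured, log_population_mean, 0.65))
p_abnormal = probability_above(log_measured, log_population_mean, 0.65,
                               within_sd=0.2, threshold=math.log(230.0))
```

Because the weight is R, a marker with high reliability (such as the collagen markers) is shrunk only slightly, whereas a marker with lower reliability (such as IGF-I) is pulled further toward the population mean.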
In conclusion, these results indicate that the within-subject variability in healthy individuals was greater for IGF-I than for the collagen markers. Where a single measurement is available, our findings demonstrate that it is possible to estimate the long-term probable value of each of the markers by applying the Bayesian approach. This modeling strategy not only enhances the reliability and reduces the cost of GH doping tests based on the use of these markers, but has diagnostic applicability beyond doping in sports.
| Acknowledgments |
|---|
Financial Disclosures: None declared.
Acknowledgments: We thank Drs. Kin-Chuen Leung and Graham Trout for their contributions to the project and Kevin Hardman, Sri Meka, and James Modzelewski for their technical contribution. We thank Nguyen D. Nguyen for his help in data analysis and graphical presentation and James Boyd for helpful editorial input in the finalization of the manuscript.