National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, GA 30341.
a Address correspondence to this author at: Centers for Disease Control and Prevention, 4770 Buford Hwy., NE, MS F-18, Atlanta, GA 30341-3724. Fax 770-488-4609; e-mail cfp8{at}cdc.gov
| Abstract |
|---|
Methods: Using gas chromatography/mass spectrometry, 13 laboratories participated in a 2-day analysis of 8 serum and 11 EDTA-plasma specimens. Results were analyzed for imprecision, recovery, and differences among laboratories and methods.
Results: The mean among-laboratory imprecision (CV) was 19% and 21% for serum and plasma samples, respectively, and 9.3% and 7.8% for serum and plasma samples with added MMA, respectively. The mean within-laboratory (among-run) CV was 13% for both serum and plasma samples and 5.2% and 4.9% for serum and plasma samples with added MMA. Within-method imprecision was the same as or higher than among-method imprecision. The mean among-laboratory recovery of MMA was 105% and 95% in serum and plasma, respectively. Most laboratories showed a proportional bias relative to the consensus mean of up to 15%. Two laboratories reported results that on average were almost 30% higher than the consensus mean.
Conclusions: No method differences were found, but significant among-laboratory imprecision was found in the present study. Improvements are needed to reduce the analytical imprecision of most laboratories, and attention must be focused on calibration issues. Differences among laboratories can be improved by introducing high-quality reference materials and by instituting external quality assessment programs.
| Introduction |
|---|
To assess the comparability of results among laboratories and within laboratories (among-run), CDC invited national and international clinicians and laboratorians who routinely measure MMA in serum or plasma to participate in a round-robin interlaboratory comparison study. Each participant was asked to analyze aliquots of 8 serum and 11 EDTA-plasma specimens on 2 days.
| Materials and Methods |
|---|
10 000 nmol/L) on 2 days. The participants included
clinical research facilities, academic laboratories, one clinical
reference laboratory, and one government laboratory. Laboratories 2 and
5 analyzed serum specimens only.
specimens
Under a CDC agreement with the Emory University Hospital Blood
Collection Service (including an omnibus informed consent and Human
Subject Review protocol), one unit of blood was collected from four
apparently healthy subjects and centrifuged after 2 h. One-half of
the serum volume from each subject (100 mL) was pooled and then
redivided into four equal aliquots; one aliquot was retained as native
sample, and MMA was added to the other three aliquots, corresponding to
an increase in concentration of 500, 2000, and 10 000 nmol/L. Thus, we
had five serum specimens without added MMA and three serum specimens
with added MMA. One unit of potassium EDTA blood was collected
from eight apparently healthy subjects and centrifuged after 30 min.
Two of the plasma specimens were pooled and then redivided into two
equal aliquots; one aliquot was retained as native sample, and MMA was
added to the other aliquot, corresponding to an increase in
concentration of 5000 nmol/L. Three CDC in-house human EDTA-plasma
quality-control pools were also included (120, 1800, and 10 130
nmol/L), two of which contained added MMA. Thus, we had eight plasma
specimens without added MMA and three plasma specimens with added MMA.
Both serum and plasma specimens were stored for a maximum of 1 month at
-70 °C before they were shipped on dry ice to participating
laboratories. An aliquot of each specimen was analyzed for MMA by the
CDC NHANES laboratory, using a gas chromatography/mass spectrometry
(GC/MS) assay (8)(9) with plasma quality-control
pools at three concentrations (120, 1800, and 10 130 nmol/L).
statistical methods
We tested for outliers by calculating the all-laboratory consensus
mean ± 2.58 SD (99% probability) for each sample and comparing
each individual result with this range. Results for three plasma
specimens from laboratory 6 and for two plasma specimens from
laboratory 11 were outside of this range and thus were excluded from
all further calculations.
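The outlier screen described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `flag_outliers`, the dictionary layout, and the use of the sample SD (n - 1 denominator) are assumptions, because the text does not say how the SD was computed.

```python
from statistics import mean, stdev


def flag_outliers(results, k=2.58):
    """Return the labs whose result lies outside consensus mean +/- k * SD.

    results: dict mapping a laboratory id to its result for ONE sample.
    k:       width of the acceptance range in SD units (2.58 in the text).
    Assumption: the SD is the sample SD over all laboratories, including
    the laboratory being tested.
    """
    values = list(results.values())
    consensus = mean(values)
    sd = stdev(values)
    low, high = consensus - k * sd, consensus + k * sd
    return sorted(lab for lab, x in results.items() if not low <= x <= high)
```

Because the suspect value is included in the mean and SD, a single result among 13 laboratories can deviate by at most about 3.3 SD, so only gross deviations are flagged by this rule.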
Evaluations of among-laboratory imprecision and of differences among laboratories and methods were based on the mean between day 1 and day 2 results, whereas evaluations of among-run (within-laboratory) imprecision and of recovery were based on the individual results of day 1 and day 2. The following measures of imprecision were evaluated: (a) among-laboratory, among-method; (b) among-laboratory, within-method; and (c) within-laboratory, among-run. We tested whether the variance or the relative SD (CV) was more constant across the whole concentration range of all samples and found that the variance increased with increasing MMA concentration, whereas the CV decreased with increasing MMA concentration. However, if we subdivided the samples into two groups, samples without added MMA and samples with added MMA, the variance still increased with increasing MMA concentration within each group, but the CV corrected for the increasing variance and was relatively constant within each group. Thus, we calculated for each sample the among-laboratory CV of the participating laboratories. We expressed the imprecision as the mean CV (SD) for each sample group (native serum, native plasma, serum with added MMA, and plasma with added MMA). We calculated the among-laboratory, within-method imprecision for only two method groups because one group was represented by one laboratory only.
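As a rough sketch of the calculation described above, the among-laboratory CV can be computed per sample and then summarized as the mean CV (SD) per sample group. The function names and the input layout are hypothetical, and the sample SD is assumed.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Relative SD (CV) in percent for one sample across laboratories."""
    return 100.0 * stdev(values) / mean(values)


def group_imprecision(samples):
    """Summarize one sample group (e.g. native serum).

    samples: dict mapping a sample id to the list of per-laboratory
             results (each the mean of the day 1 and day 2 values).
    Returns (mean CV, SD of the CVs) across the samples in the group.
    """
    cvs = [cv_percent(v) for v in samples.values()]
    spread = stdev(cvs) if len(cvs) > 1 else 0.0
    return mean(cvs), spread
```

Working with the CV rather than the variance follows the reasoning in the text: within each group the variance grows with concentration, whereas the CV stays roughly constant.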
In the absence of target values for the samples analyzed and because all laboratories used a GC/MS method, we arbitrarily considered the consensus mean of all laboratories as a point of reference. Possible systematic biases were assessed by computing 95% confidence intervals (mean difference ± confidence limit) for the mean differences between the consensus results and each laboratory's results (10)(11). We assessed limits of agreement by calculating the central 0.95 intervals (mean difference ± 2 SD). The mean differences and the mean between the consensus results and each laboratory's results were correlated to test for a relationship between these two variables. To assess the mean proportional bias between the consensus results and each laboratory's results, we calculated the relative ratios of the consensus and test-laboratory results.
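The bias measures described above might be computed along the following lines. This is a sketch under stated assumptions: differences are taken as laboratory minus consensus (so a positive value means the laboratory reads high), the proportional bias is the mean of the laboratory/consensus ratios, and the critical t value for the confidence interval is supplied by the caller (for example, 2.179 for 12 degrees of freedom). The function name `bias_summary` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def bias_summary(lab, consensus, t_crit):
    """Compare one laboratory with the consensus across paired samples.

    lab, consensus: equal-length lists of results for the same samples.
    t_crit:         two-sided 95% critical t value for n - 1 df.
    """
    diffs = [x - c for x, c in zip(lab, consensus)]
    n = len(diffs)
    d_mean, d_sd = mean(diffs), stdev(diffs)
    half_width = t_crit * d_sd / sqrt(n)
    ratios = [x / c for x, c in zip(lab, consensus)]
    return {
        # systematic bias and its 95% confidence interval
        "mean_diff": d_mean,
        "ci95": (d_mean - half_width, d_mean + half_width),
        # central 0.95 interval (mean difference +/- 2 SD)
        "limits_of_agreement": (d_mean - 2 * d_sd, d_mean + 2 * d_sd),
        # mean proportional bias in percent
        "proportional_bias_pct": 100.0 * (mean(ratios) - 1.0),
    }
```

A confidence interval that excludes zero suggests an apparent systematic bias, whereas the limits of agreement describe how far individual results are expected to fall from the consensus.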
We calculated recoveries individually for each sample containing added MMA: recovery (%) = 100 × (specimen with added MMA - specimen without added MMA)/added concentration of MMA. Recovery results were reported as the mean (SD) of the added MMA concentrations over the 2 days of analysis.
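The recovery calculation above is simple enough to express directly. The sketch below uses hypothetical function names and assumes all concentrations are in nmol/L; the factor of 100 converts the fraction to a percentage.

```python
from statistics import mean, stdev


def recovery_percent(spiked, native, added):
    """Recovery (%) of added MMA for one specimen pair.

    spiked: measured concentration of the specimen with added MMA
    native: measured concentration of the same specimen without added MMA
    added:  nominal concentration of MMA that was added
    """
    return 100.0 * (spiked - native) / added


def recovery_summary(recoveries):
    """Mean and SD of the individual recoveries (e.g. over both days)."""
    return mean(recoveries), stdev(recoveries)
```

For instance, a laboratory that measures 5200 nmol/L in the spiked pool and 200 nmol/L in the native pool, with 5000 nmol/L added, would have a recovery of 100%.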
To test for methodological differences, we grouped laboratories by method of analysis (GC/MS using cyclohexanol/HCl, silylation, or ethylchloroformate as derivatization reagent) and performed a two-way ANOVA with laboratory and analytical method as variables using the SAS GLM procedure and the Bonferroni test (to correct for multiple comparisons). P < 0.05 was considered statistically significant.
| Results |
|---|
imprecision
The mean among-laboratory CVs are similar for serum and plasma
specimens without added MMA but are approximately one-half for serum
and plasma specimens with added MMA (Table 2). For the GC/MS method using cyclohexanol/HCl for
derivatization, the among-laboratory, within-method CV was
almost the same as the among-laboratory, among-method CV. However, for
the GC/MS method using silylation for derivatization, the
among-laboratory, within-method CV was higher than the
among-laboratory, among-method CV. Although differences among
laboratory means were statistically significant for all four sample
groups, for serum and plasma without added MMA they were highly
significant (P = 0.0001).
Approximately one-half of the laboratories showed among-run CVs higher
than 10% for serum and plasma specimens without added MMA (Table 3). For serum and plasma specimens with added MMA, only ~10%
of the laboratories obtained among-run CVs higher than 10%.
recovery
The mean among-laboratory differences between the three serum
samples containing added MMA and the native serum sample were not
significantly different from the expected results of 500, 2000, and
10 000 nmol/L, respectively (P = 0.85, 0.19, and 0.25,
respectively). The mean among-laboratory difference between the plasma
sample containing added MMA and the native plasma sample was
significantly different from the expected result of 5000 nmol/L
(P = 0.01); it was 5.4% lower than expected.
Recoveries were 85–115% for laboratories other than laboratory 7
(127% ± 8.4%; Table 3). The mean among-laboratory recovery was
higher in serum (105%) than in plasma (95%), but in both matrices
approached complete recovery.
differences among laboratories and methods
Fig. 1 shows the performance of each laboratory for native serum and
plasma samples (Fig. 1A) and for serum and plasma samples with added
MMA (Fig. 1B). We obtained very similar graphs when we examined serum
specimens only or plasma specimens only (data not shown).
Among-laboratory differences were assessed with native samples only to
avoid unbalanced results by a few samples with high concentrations.
Because of the relatively small number of samples in this study, we
reported the median, minimum, and maximum differences between the
consensus results and each laboratory's results in addition to the
mean differences (Table 4). The 95% confidence intervals of the mean differences showed
an apparent positive bias for laboratories 1, 4, 6, and 13, and an
apparent negative bias for laboratories 3, 5, 7, and 9–12.
Laboratories 2 and 8 showed no apparent bias with respect to the
consensus mean.
The central 0.95 interval (mean difference ± 2 SD) gives an indication of the agreement between the consensus results and each laboratory's results for MMA measurements in serum and plasma. Ninety-five percent of MMA determinations by laboratories 1 and 6 were 16.7–77.4 nmol/L and 6.7–78.9 nmol/L higher than concentrations determined by the consensus mean. This corresponds to a proportional bias of 29.7% and 27.3%, respectively. Ninety-five percent of the results from all other laboratories were up to ~15% lower or higher than the consensus mean. When we correlated the mean differences and the mean between the consensus results and each laboratory's results, we found a relationship between these two variables for laboratories 7 (r² = 0.801), 12 (r² = 0.628), and 13 (r² = 0.632). For all other laboratories, r² was <0.4.
When we grouped laboratories by method, we found no significant difference between the results of the method groups.
| Discussion |
|---|
The mean among-laboratory CVs of 19% and 21% for serum and plasma samples without added MMA in this study are in good agreement with the 18% and 17% among-laboratory CVs found by Møller et al. (7) for one serum and one heparin/NaF plasma specimen, respectively. For serum and plasma samples with added MMA, we found smaller among-laboratory CVs (9% and 8%) than did Møller et al. (17% and 11%) (7).
The present study, in which each laboratory and method used its own calibrators, has demonstrated again that among-laboratory, within-method imprecision can exceed among-laboratory, among-method imprecision. The same observation was made in our recently performed homocysteine interlaboratory comparison study (12). Of the nine laboratories performing GC/MS with cyclohexanol/HCl derivatization, eight showed no or an apparent small negative or positive bias relative to the consensus mean, whereas one laboratory reported results on average 30% higher than the consensus mean. Of the three laboratories performing GC/MS with silylation for derivatization, two showed an apparent small negative proportional bias relative to the consensus mean, whereas one laboratory reported results on average 27% higher than the consensus mean. This suggests that differences among laboratories are most likely related to calibration issues. Although calibration curves for MMA are linear over a wide concentration range, the intercept of the calibration curve increases with an increasing calibration range. Thus, the calibration range must be optimized to obtain the most accurate results for the clinically critical MMA concentrations. If the calibration range is too wide, low MMA concentrations might be underestimated. If the calibration range is too narrow, high MMA concentrations might be overestimated. It seems reasonable to strive for highest accuracy at low and slightly increased MMA concentrations (up to 1000 nmol/L) because diagnosis of B12 deficiency is more difficult to determine at these concentrations.
Although we found no correlation between the performance of the different laboratories and the calibration range or the matrix in which calibration was performed, results indicated that it might be inaccurate to use the internal standard d3-MMA as a calibrator. Two of the three laboratories that did not calibrate with MMA reported results that were on average almost 30% higher than the consensus mean. Thus, if the results of those two laboratories were excluded from the consensus mean, laboratories now showing an apparent small negative bias would no longer show this bias.
Objective analysis of whether the imprecision and bias of a
method are satisfactory is difficult to perform. Some have proposed
using the biological variation as the basis for analytical
quality specifications (13). Rasmussen and co-workers
(14, 15) found a $\mathrm{CV}_{\text{within-subjects}}$
and a $\mathrm{CV}_{\text{between-subjects}}$ of 13% and 32%,
respectively, for MMA. A widely held view is that analytical
imprecision ($\mathrm{CV}_A$) should be $<0.25\,\mathrm{CV}_{\text{within-subjects}}$
for optimum performance, $<0.50\,\mathrm{CV}_{\text{within-subjects}}$
for desirable performance, and
$<0.75\,\mathrm{CV}_{\text{within-subjects}}$ for minimum performance
(16). In our analysis, this would require an analytical
imprecision of <3.2%, <6.5%, and <9.7% for optimum,
desirable, and minimum performance, respectively. As shown in Table 3,
none of the laboratories obtained among-run CVs <3.2% for all four
groups of samples, and only two laboratories (laboratories 5 and 7) had
overall among-run CVs <6.5%. The among-run CVs of five laboratories
(laboratories 4, 6, 9, 11, and 12) exceeded the required CV for minimum
performance of 9.7% for at least two groups of samples. The following
laboratories performed best regarding analytical imprecision:
laboratories 3, 5, 7, and 13. Each reached the required CV for minimum
performance for all four groups of samples.
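The imprecision specifications above follow directly from the within-subject CV. The sketch below is illustrative only: the function names and class labels are hypothetical, and the default of 13% is the within-subject CV cited in the text. The computed limits are 3.25%, 6.5%, and 9.75%, which the text reports as <3.2%, <6.5%, and <9.7%.

```python
def imprecision_limits(cv_within=13.0):
    """Upper limits (%) for analytical CV at each performance level."""
    return {
        "optimum": 0.25 * cv_within,
        "desirable": 0.50 * cv_within,
        "minimum": 0.75 * cv_within,
    }


def performance_class(cv_analytical, cv_within=13.0):
    """Return the strictest performance level that cv_analytical meets."""
    limits = imprecision_limits(cv_within)
    for level in ("optimum", "desirable", "minimum"):
        if cv_analytical < limits[level]:
            return level
    return "below minimum"
```

Applied to an among-run CV of 13%, the mean reported for samples without added MMA, this classification returns "below minimum", whereas a CV of about 5% would be classed as "desirable".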
The bias of a method ($B_A$) should be
$<0.125\,(\mathrm{CV}_{\text{within-subjects}}^{2} + \mathrm{CV}_{\text{between-subjects}}^{2})^{1/2}$
for optimum performance,
$<0.25\,(\mathrm{CV}_{\text{within-subjects}}^{2} + \mathrm{CV}_{\text{between-subjects}}^{2})^{1/2}$
for desirable performance, and
$<0.375\,(\mathrm{CV}_{\text{within-subjects}}^{2} + \mathrm{CV}_{\text{between-subjects}}^{2})^{1/2}$
for minimum performance (16). For our analysis, this
would mean a bias of <4%, <8%, and <12% for optimum, desirable,
and minimum performance, respectively. As shown in Table 4, only four
laboratories met the requirement for minimum performance (laboratories
2, 4, 8, and 10) compared with the consensus mean. It must be
cautioned, however, that the consensus mean may itself be biased and
that until high-quality reference methods for MMA are available, little
can be said about a laboratory's or a method's bias.
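For readers who want to check the bias limits, the arithmetic can be worked through with the within-subject and between-subject CVs of 13% and 32% cited above. The unrounded values below are our own calculation; the text reports them as <4%, <8%, and <12%.

```latex
\begin{align*}
\left(\mathrm{CV}_{\text{within-subjects}}^{2} + \mathrm{CV}_{\text{between-subjects}}^{2}\right)^{1/2}
  &= \left(13^{2} + 32^{2}\right)^{1/2} = \sqrt{1193} \approx 34.5\% \\
B_A^{\text{optimum}}   &< 0.125 \times 34.5\% \approx 4.3\% \\
B_A^{\text{desirable}} &< 0.25 \times 34.5\% \approx 8.6\% \\
B_A^{\text{minimum}}   &< 0.375 \times 34.5\% \approx 13.0\%
\end{align*}
```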
This international round robin for serum and plasma MMA showed no method differences, but it did show significant among-laboratory imprecision among some of the most experienced laboratories using different GC/MS methods. The analysis for analytical quality specifications has shown that there is an urgent need to improve analytical imprecision. In addition, the analysis for among-laboratory imprecision suggests that most of this variation can be attributed to calibration issues. Differences among laboratories can be improved by introducing high-quality reference materials and instituting more external quality assessment programs.
| Acknowledgments |
|---|
| References |
|---|