Automation and Analytical Techniques
1 Inorganic Toxicology and Nutrition Branch, Division of Laboratory Sciences, National Center for Environmental Health, Centers for Disease Control and Prevention, Atlanta, GA.
a Address correspondence to this author at: Centers for Disease Control and Prevention, 4770 Buford Hwy, Atlanta, GA 30341. Fax 770-488-4139; e-mail CPfeiffer{at}cdc.gov.
| Abstract |
|---|
Methods: Ten laboratories experienced in clinical vitamin B6 analysis participated in a 3-day analysis of 69 serum and 3 aqueous specimens for pyridoxal 5'-phosphate (PLP). Laboratories used either HPLC-based or enzymatic assays. Results were analyzed for imprecision, recovery, and bias relative to consensus means.
Results: Among laboratories, mean within-day CVs (3 specimens × 3 measurements/day) were 0.6%–37% and between-day CVs (20 specimens × 1 measurement/day × 3 days) were 1.4%–26%. Mean recoveries of added PLP were 53%–144%, and mean sample pool mixing recoveries were 75%–119%. Consensus means calculated for 20 serum specimens gave mean relative measurement biases of −10.0% to 24.3% among participating laboratories over a range of 15.8–319 nmol/L PLP. Measurement imprecision and biases were evaluated against empirically derived performance criteria based on biological variation. Three of 10 laboratories met optimum imprecision requirements and had 90% or more of measurements satisfy optimum criteria for biases among methods. All 10 laboratories met minimum imprecision requirements, but 25%–53% of the results reported by 4 of the 7 suboptimal laboratories failed to satisfy the minimum criteria for bias.
Conclusion: Agreement among vitamin B6 methods is good, but large differences in laboratory proficiency exist, pointing to the need for vitamin B6 reference materials and external quality assurance programs.
| Introduction |
|---|
Because much interest currently surrounds the role vitamin B6 plays in reducing the risk of coronary heart disease(16)(17)(18) and stroke(19)(20)(21), an objective comparison of the various methods used for clinical vitamin B6 measurements would be prudent. In response, the US Centers for Disease Control and Prevention invited domestic and international clinicians and laboratorians who routinely measure vitamin B6 in plasma or serum, as well as manufacturers of commercially available assays, to participate in a study designed to assess the performance and comparability of methods used for measuring vitamin B6 in serum.
| Materials and Methods |
|---|
70 serum specimens for vitamin B6 over 3 days. The minimum requirement for participation was the ability to measure the PLP concentrations of these specimens, although measurement of other B6 vitamers and metabolites was encouraged. This invitation was extended to a total of 25 laboratories.
serum samples
Serum samples distributed to participants in this study were prepared from serum pools obtained from Solomon Park Research Institute and from pools generated in house. All in-house serum collection was conducted in accordance with an internal review board–approved human subjects protocol. In-house serum pools were created from sera collected from on-site volunteers by venipuncture into evacuated collection tubes (Vacutainer SST™ Tubes; Becton Dickinson). Approximately 30 mL of serum was collected from each volunteer. Sera were processed according to the instructions provided by the collection tube manufacturer, with all sample processing conducted under yellow lighting at room temperature. Once processed, a 350-µL aliquot of each serum sample was reserved for screening purposes, and the bulk of each sample was stored immediately at −80 °C. Serum samples were screened for their PLP concentrations in house by a validated method(7), and pools were proposed based on these preliminary values. Serum samples were then thawed, combined in various proportions, and filtered through sterile gauze to remove fibrin and other particulates. Pools were then dispensed while being stirred and immediately stored at −80 °C. Serum pools from Solomon Park Research Institute were received as blinded samples in 1-mL aliquots. Samples were drawn at random from these pools and screened in house by a validated method(7) for their PLP concentrations and homogeneity among aliquots (2 specimens/day × 6 days).
Participating laboratories received a total of 69 serum samples divided into 3 groups (days) according to the schedule appearing in Table 1. Each sample consisted of a 1-mL aliquot in a 2-mL cryogenic polypropylene vial (Sarstedt) labeled with an encrypted 7-digit identifier. Each day also included an aqueous sample consisting of 40.4 nmol/L PLP (Sigma-Aldrich) dissolved in distilled deionized water (AquaSolutions) filtered through a 0.45 µm filter. The aqueous sample was prepared by serial dilution of a 100 mg/L aqueous stock solution prepared from dry PLP, and its concentration was verified spectrophotometrically by absorbance measurements at 388 nm (ε = 5070 L · cm⁻¹ · mol⁻¹)(22) made on intermediate dilutions (10, 20, and 30 mg/L) of the stock solution in 25 mmol/L phosphate buffer (pH 7.0). Samples were frozen and stored at −70 °C before distribution and were shipped within 48 h over dry ice to participants. Participants were instructed to treat all serum and aqueous samples as unknown specimens. Laboratories were also instructed to analyze the samples in groups (days) as indicated; however, samples within a specific group could be analyzed in any order. Beyond these instructions and some general specimen handling and storage recommendations, participants were free to analyze the samples as determined by their methods. Laboratories were given 3 weeks from the date that samples were received to report their results.
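The spectrophotometric check described above can be illustrated with a minimal Beer-Lambert sketch. The molar absorptivity (5070 L · cm⁻¹ · mol⁻¹ at 388 nm) comes from the text; the molar mass of anhydrous PLP (247.14 g/mol) and the 1-cm path length are assumptions made here for illustration, and the function names are hypothetical.

```python
# Assumptions (not stated in the text): anhydrous PLP molar mass and a 1-cm cuvette.
PLP_MOLAR_MASS_G_PER_MOL = 247.14
PATH_LENGTH_CM = 1.0
# Molar absorptivity at 388 nm, pH 7.0, as given in the text.
EPSILON_388 = 5070.0  # L · cm^-1 · mol^-1


def expected_absorbance(mass_conc_mg_per_l):
    """Absorbance predicted by A = epsilon * path * c for a PLP dilution in mg/L."""
    molar_conc = mass_conc_mg_per_l / 1000.0 / PLP_MOLAR_MASS_G_PER_MOL  # mol/L
    return EPSILON_388 * PATH_LENGTH_CM * molar_conc


def concentration_nmol_per_l(absorbance):
    """Molar concentration (nmol/L) implied by a measured absorbance."""
    return absorbance / (EPSILON_388 * PATH_LENGTH_CM) * 1e9


# Predicted readings for the three intermediate dilutions named in the text.
for dilution in (10.0, 20.0, 30.0):
    print(dilution, round(expected_absorbance(dilution), 3))
```

Under these assumptions the 10, 20, and 30 mg/L dilutions should give absorbances that scale linearly with concentration, which is what makes them usable as a verification of the stock solution.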
statistical analysis
All statistical analyses were performed with SAS 9.0 (SAS Institute Inc.). Except in the calculation of consensus means for each of the serum pools (for which observations had to meet certain statistical criteria to be included), all calculations described here were performed using all reported observations.
Within-day imprecision was evaluated by determining the CVs for each of pools 21, 22, and 23. Mean within-day imprecision was calculated as the mean CV (SD) with 95% confidence limits (CLs) for these pools. Mean between-day imprecision was calculated in an analogous manner, using the CVs calculated for each of pools 1–20.
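As an illustration of the imprecision summary, the following sketch computes a per-pool CV and then the mean CV (SD) with 95% CLs across pools. The study used SAS; this Python version is only a sketch, and it assumes that the CLs are based on the t distribution, which the text does not state. The critical value is passed in explicitly (about 4.303 for 3 pools, 2.093 for 20 pools); the function names are hypothetical.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation (%) of replicate results for one pool."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def mean_cv_with_cl(cvs, t_crit):
    """Mean CV, SD of the CVs, and (lower, upper) 95% CLs across pools.

    t_crit is the two-sided 95% t critical value for len(cvs) - 1 degrees
    of freedom (an assumed CL method, not confirmed by the source).
    """
    mean_cv = statistics.mean(cvs)
    sd_cv = statistics.stdev(cvs)
    half_width = t_crit * sd_cv / math.sqrt(len(cvs))
    return mean_cv, sd_cv, (mean_cv - half_width, mean_cv + half_width)


# Hypothetical within-day data: 3 pools, 3 measurements each.
pools = [[48.0, 50.0, 52.0], [99.0, 100.0, 101.0], [20.0, 20.5, 21.0]]
print(mean_cv_with_cl([cv_percent(p) for p in pools], t_crit=4.303))
```

With only 3 pools the critical value is large, so the confidence interval for the mean within-day CV is much wider than that for the between-day CV based on 20 pools.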
Analytical recovery was evaluated both by the addition of exogenous PLP at 2 concentrations and by mixing serum pools. Recoveries of exogenous PLP were calculated on each day as follows:
[Equation 1: image missing from source]
Serum pool mixing recovery was calculated on each day as follows:
[Equation 2: image missing from source]
Mean recoveries were expressed as the mean (SD) with 95% CLs of the individual recoveries calculated for each day.
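The recovery equations themselves are not legible in this copy of the article. The sketch below therefore uses the conventional definitions, which are an assumption and may differ in detail from the authors' Equations 1 and 2: addition recovery as the measured increase divided by the amount added, and mixing recovery as the measured value of the mixed pool divided by the value expected from its component pools. All names are hypothetical.

```python
def addition_recovery_percent(spiked, unspiked, added):
    """Recovery (%) of exogenous analyte: measured increase over amount added."""
    return 100.0 * (spiked - unspiked) / added


def mixing_recovery_percent(measured_mix, component_values, fractions):
    """Recovery (%) of a mixed pool relative to the proportion-weighted
    mean of the measured component pools."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mixing fractions must sum to 1")
    expected = sum(c * f for c, f in zip(component_values, fractions))
    return 100.0 * measured_mix / expected


# Hypothetical example: a pool measured at 49.0 nmol/L, spiked with 32.1 nmol/L,
# and then measured at 78.0 nmol/L.
print(addition_recovery_percent(78.0, 49.0, 32.1))
# Hypothetical 1:1 mixture of pools measured at 40 and 80 nmol/L.
print(mixing_recovery_percent(57.0, [40.0, 80.0], [0.5, 0.5]))
```

A recovery of 100% is quantitative under both definitions; values above or below 100% indicate over- or under-recovery, respectively.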
In the absence of PLP reference values for the serum pools used in this study, consensus means were calculated for each pool and used as surrogates. Consensus means were determined by a recursive testing procedure for outlying observations. The means and SDs for serum pools 1–20 were first calculated using all reported observations. Using a 99% probability (mean ± 2.576 SD), outlying observations were rejected and the consensus means and SDs were recalculated. This process was repeated until all observations fell within the 99% probability limits of their respective consensus means, or until 3 iterations of this process had been completed, whichever occurred first.
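The recursive procedure can be sketched as follows. This is an illustrative Python version, not the SAS code used in the study; it assumes that the sample SD is used and that one "iteration" means one round of rejection followed by recalculation. The function name and the example data are hypothetical.

```python
import statistics


def consensus_mean(observations, z=2.576, max_iterations=3):
    """Iteratively reject observations outside mean ± z*SD for one pool.

    Returns (consensus mean, SD, rejected observations). Stops when no
    observation is rejected or after max_iterations rejection rounds.
    """
    kept = list(observations)
    rejected = []
    for _ in range(max_iterations):
        if len(kept) < 3:  # too few points left to test meaningfully
            break
        mean = statistics.mean(kept)
        sd = statistics.stdev(kept)
        lower, upper = mean - z * sd, mean + z * sd
        outliers = [x for x in kept if not lower <= x <= upper]
        if not outliers:
            break
        rejected.extend(outliers)
        kept = [x for x in kept if lower <= x <= upper]
    return statistics.mean(kept), statistics.stdev(kept), rejected


# Hypothetical pool: 12 plausible results and 1 extreme value.
pool = [48, 49, 50, 51, 52, 48, 49, 50, 51, 52, 50, 50, 500]
print(consensus_mean(pool))
```

Because the limits are recalculated after every rejection round, a single extreme value that inflates the initial SD can mask smaller outliers until a later iteration, which is why more than one round can be needed.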
Absolute laboratory measurement bias was calculated by determining the differences between the consensus means and the results from each laboratory for serum pools 1–20 over all 3 days. Relative bias was assessed in an identical manner by use of relative ratios. Overall absolute and relative biases were reported as the means (SDs) with 95% CLs.
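A minimal sketch of the bias calculation follows. The sign convention (laboratory result minus consensus mean, so that a positive bias means the laboratory reads high) is an assumption, as are the function names and the example values.

```python
import statistics


def absolute_bias(result, consensus):
    """Difference between a laboratory result and the pool consensus mean."""
    return result - consensus


def relative_bias_percent(result, consensus):
    """Bias expressed as a percentage of the pool consensus mean."""
    return 100.0 * (result - consensus) / consensus


def summarize_bias(results, consensus_means):
    """Mean and SD of absolute and relative bias over paired observations."""
    abs_biases = [absolute_bias(r, c) for r, c in zip(results, consensus_means)]
    rel_biases = [relative_bias_percent(r, c) for r, c in zip(results, consensus_means)]
    return {
        "absolute": (statistics.mean(abs_biases), statistics.stdev(abs_biases)),
        "relative": (statistics.mean(rel_biases), statistics.stdev(rel_biases)),
    }


# Hypothetical laboratory results against hypothetical consensus means (nmol/L).
print(summarize_bias([55.0, 90.0, 21.0], [50.0, 100.0, 20.0]))
```

Averaging signed biases allows positive and negative deviations to cancel, so a laboratory can show a small mean bias while individual observations deviate substantially.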
performance criteria based on biological variation
Between-day imprecision and relative bias were evaluated against empirically derived performance criteria based on biological variation. Analytical variability (CVA) was assessed using the following equation(23), with the mean between-day CV taken to be equivalent to CVA:
[Equation 3: image missing from source]
Proportional bias (Ba) was assessed using the following equation(23):
[Equation 4: image missing from source]
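Because the two equations are not legible in this copy, the sketch below uses the widely cited three-tier model for specifications based on biological variation. It is offered as an assumption about their form, not as a transcription of them. In that model, analytical imprecision is compared with 0.25, 0.50, and 0.75 times the within-subject variation (CVI) for optimum, desirable, and minimum performance, and bias is compared with 0.125, 0.250, and 0.375 times the square root of (CVI² + CVG²), where CVG is the between-subject variation. The CVI and CVG values in the example are placeholders, not values from this study.

```python
import math

# Assumed tier multipliers (three-level biological-variation model).
IMPRECISION_FACTORS = (("optimum", 0.25), ("desirable", 0.50), ("minimum", 0.75))
BIAS_FACTORS = (("optimum", 0.125), ("desirable", 0.250), ("minimum", 0.375))


def _classify(value, scale, factors):
    """Return the strictest tier whose limit (factor * scale) is satisfied."""
    for tier, factor in factors:
        if value <= factor * scale:
            return tier
    return "below minimum"


def classify_imprecision(cv_a, cv_i):
    """Tier met by an analytical CV (%) given within-subject variation (%)."""
    return _classify(cv_a, cv_i, IMPRECISION_FACTORS)


def classify_bias(relative_bias, cv_i, cv_g):
    """Tier met by a relative bias (%), judged on its magnitude."""
    return _classify(abs(relative_bias), math.hypot(cv_i, cv_g), BIAS_FACTORS)


# Placeholder biological variation: CVI = 20%, CVG = 15% (hypothetical).
print(classify_imprecision(4.0, 20.0))
print(classify_bias(-7.0, 20.0, 15.0))
```

Applying `classify_bias` to each individual observation, rather than to a laboratory's mean relative bias, gives the proportion of results that meet each tier.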
| Results |
|---|
The PLP concentrations reported by each participating laboratory for the serum pools are presented in the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol51/issue7/ (Table S1 for between-day serum pools, Table S2 for within-day serum pools, Table S3 for aqueous reference pool). The results reported for laboratories H2, H6, H7, E1, and E2 represent the mean of 2 replicates, whereas all other results are from single measurements. In addition to PLP, results for 4-pyridoxic acid were reported by laboratories H4, H6, and H7. Laboratory H4 also reported values for pyridoxal, pyridoxamine, and pyridoxine. Meaningful interlaboratory statistical analysis of these results was not possible because of the small number of laboratories reporting values for these analytes; consequently, these data were not included in this report.
imprecision
Laboratory within- and between-day imprecision is summarized in Table 3. Mean within-day CVs of 0.6%–37% and between-day CVs of 1.4%–26% were observed among participants. The performance criteria met by the mean between-day CVs for each laboratory appear in Table 3 (if different from its corresponding mean, the criterion met by the upper 95% CL of the between-day CV appears in parentheses). All 10 laboratories met at least minimum performance requirements, with 4 HPLC laboratories (H1, H2, H6, and H8) achieving optimum status.
recovery
Standard addition recoveries for PLP ranged from 56% to 128% at an added concentration of 32.1 nmol/L, and from 53% to 144% at an added concentration of 64.2 nmol/L (Table 4). Recoveries observed within each laboratory were generally better at the higher of the 2 added concentrations. Of all 10 participants, only 2 laboratories (H3 and H8) achieved a mean standard addition recovery within 10% of quantitative recovery (100%) at the lower added concentration, but 5 laboratories (H2, H3, and H6–H8) were within 10% of quantitative recovery at the second concentration. Mean mixing recoveries ranged from 75% to 119% (Table 4). Six laboratories (H1–H3 and H6–H8) had mean mixing recoveries within 10% of quantitative recovery.
consensus means
Because the serum pools used in this study lacked PLP reference values and a true reference method for determining PLP does not exist, consensus means were calculated from the results submitted by participating laboratories for pools 1–20 and used as surrogates. Preliminary inspection of these results indicated the possible existence of extreme outlying observations (Table S1 of the online Data Supplement). To identify outlying observations and thus eliminate the bias they may introduce into the consensus mean calculations, an iterative procedure of recursive testing of outlying observations was used. To illustrate the process of identifying outlying observations and calculating consensus means, the normalized differences of the observations from their respective consensus means are plotted in Fig. 1. Of the 20 pools for which consensus means were calculated, 8 pools had all observations fall within the 99% probability limits with the initial mean calculation, and 7 of the remaining 12 pools required only 1 iteration of the consensus means calculation procedure. Of the 10 participating laboratories, 5 using HPLC methods (H1–H3, H6, and H8) and 1 using an enzymatic assay (E1) reported results that satisfied the 99% probability inclusion criteria for the consensus mean calculations of all serum pools (Table S1 of the online Data Supplement). A total of 25 observations from the remaining 4 laboratories were rejected over the course of the consensus mean calculations; 16 observations were rejected in the first iteration, 6 in the second, and 3 in the third (Fig. 1). After 3 iterations of the consensus mean calculation procedure, only pool 2 had 1 remaining outlying observation based on 99% probability limits. PLP consensus means for the 20 serum pools ranged from 15.8 to 319 nmol/L (mean, 84.2 nmol/L; median, 55.2 nmol/L; Table S1 of the online Data Supplement). Between-laboratory variability (CV) within each consensus mean ranged from 12% to 37% (mean, 21%; median, 21%) after exclusion of outlying values.
bias
Mean absolute and relative measurement biases among participants ranged from −7.7 nmol/L to 18.6 nmol/L and from −10.0% to 30.6%, respectively (Table 5). Relative bias was also evaluated against performance criteria related to biological variation(23). When applied to the mean relative bias, all 10 laboratories satisfied the requirements of minimum performance, with 8 of the 10 also meeting the requirements of desirable performance and 6 of the 10 achieving optimum performance (Table 5). Use of the mean relative bias in this case can be somewhat misleading, however, because Eq. 4 does not take into consideration the error associated with the mean relative bias. As a consequence, the mean relative bias of a method may satisfy a certain performance criterion although many of the individual observations made with the method may not. A more meaningful way to assess performance bias is to look at the relative bias of each individual observation and to determine the proportion of results from a laboratory that satisfy the performance criteria. These results appear in Table 5 as cumulative percentages of the observations from each laboratory that were deemed to meet the performance criteria stated above. Laboratories H2, H6, and H8 had 90% or more of their results satisfy optimum performance criteria, and 100% of the results from these 3 laboratories satisfied desirable performance criteria. All observations from H1 and H3, as well as the above-mentioned laboratories, met minimum performance criteria for bias. For laboratory E1, all but 2 observations also satisfied the minimum performance criteria (98%). The 4 remaining laboratories (H4, H5, H7, and E2) had results that failed to meet the minimum bias requirement, with the proportion of results failing to achieve the minimum varying from 25% to 53%.
| Discussion |
|---|
Imprecision varied greatly among laboratories. Within-day CVs of 0.4%–37% and between-day CVs of 1.4%–26% were observed among participants. Cases were also observed in which the mean within-day CV for a laboratory apparently exceeded its mean between-day CV; these likely were artifacts arising from the relatively small number of pools used to assess within-day imprecision (n = 3) compared with between-day imprecision (n = 20). In comparison, data from several published methods for measuring PLP in human plasma or serum suggest that within- and between-day CVs are typically 0.5%–5.9% and 3.6%–11.8%, respectively, for HPLC methods(2)(7)(8)(9)(13)(14) and 3.0%–11% and 7.9%–13%, respectively, for enzymatic methods(3)(4)(5). It should be noted, however, that the number of within- or between-day measurements used for determining CVs in these studies was considerably greater (typically 10–20 per specimen in each case) than the number of measurements used in the present study. The 4 HPLC laboratories that achieved optimum imprecision (H1, H2, H6, and H8) had mean within-day CVs (0.6%–3.1%) and between-day CVs (1.4%–5.6%) that were consistent with the lower ends of the HPLC ranges found in the literature. Laboratory H3, which demonstrated desirable imprecision, had a within-day CV that was also consistent with the aforementioned laboratories (2.0%), but its between-day CV was toward the high end of the HPLC range (12%). Although the remaining laboratories using HPLC-based and enzymatic assays achieved either desirable (H7, E1, and E2) or minimum (H4 and H5) imprecision, their mean within- and between-day CVs exceeded the expected imprecision based on methods in the literature, suggesting that quality control issues may exist in these laboratories.
Serum pool mixing recoveries were generally quantitative and consistent among laboratories. The mean between-laboratory serum pool mixing recovery was 99.1%, and 5 HPLC laboratories (H1–H3, H6, and H8) had serum pool mixing recoveries within 3% of being quantitative (100%). Mixing recoveries that appeared to deviate significantly from 100% (laboratories H4, E1, and E2) suggest that these assays may be subject to concomitant interferences, particularly if the confidence interval associated with the recovery is relatively small.
Recoveries in standard addition experiments were more variable than mixing recoveries and tended to be <100%. The mean between-laboratory recovery was 91.2% when an effective PLP concentration of 32.1 nmol/L was added to a serum pool with an endogenous PLP concentration of 49 nmol/L. The mean between-laboratory recovery improved to 97.6% when the amount of PLP added was increased to 64.2 nmol/L. These recoveries seem to be consistent with the literature; a selection of published methods for PLP suggests that analyte addition recoveries of 88%–103% (between-method mean of 95%) are typical for HPLC methods(2)(7)(9)(10)(11)(12)(13)(25) and that recoveries of 83%–125% (between-method mean of 99%) can be expected from enzymatic methods(3)(4)(5). Because the recovery calculation was based on a gravimetrically determined amount of PLP and the actual amount of the PLP addition was not measured in each laboratory, actual recoveries in each laboratory may vary slightly depending on calibration accuracy. Results from the analysis of the aqueous specimen seem to support this (Table S3 of the online Data Supplement).
The 4 laboratories that met optimum performance criteria for measurement imprecision (H1, H2, H6, and H8) also met desirable bias performance criteria for a high proportion (100% for H2, H6, and H8; 97% for H1) of their PLP measurements. Three of these 4 laboratories also met optimum performance criteria for a high proportion of measurements (90% for H2 and H6; 95% for H8). Information on method bias in the literature is sparse, however, making comparisons with other studies difficult. Bias in PLP measurements is most frequently evaluated through the direct comparison of 2 methods, typically as part of a method validation. Consequently, the bias of a method is usually expressed relative to measurements made with another method, as opposed to more objective benchmarks such as reference values or consensus means. Although no PLP reference value is given for the plasma specimen used in the interlaboratory comparison study conducted by Reynolds(15), the extended high range observed for the HPLC values (481700 nmol/L) compared with those obtained with enzymatic assays (5365 nmol/L) implies that PLP measurements made by HPLC in the study may be inherently biased high. This appears not to be the case, however, for the HPLC laboratories in the present study that satisfy optimum performance criteria for bias. Slightly negative mean absolute (−6.1 nmol/L to −0.8 nmol/L; between-method mean of −3.4 nmol/L) and relative (−6.1% to −0.9%; between-method mean of −4.8%) biases were observed for these 3 laboratories.
Four HPLC laboratories (H1, H2, H6, and H8) were found to have very similar characteristics in terms of measurement imprecision and bias. This is interesting from the standpoint that, although these 4 methods are based on common instrumental principles, they are markedly different in terms of sample preparation, chromatographic separation schemes, and in particular, PLP derivatization approaches. This seems to suggest that the derivatization approaches used by these laboratories [bisulfite(11), chlorite(7), and semicarbazide(13)(14)] to increase the fluorescence sensitivity of PLP were all equally effective for quantifying PLP in this study and that quality control within the laboratory is the single most important issue. Although a cursory review of the mean bias and imprecision statistics for laboratory H3 suggests that its overall performance was inferior to the performance of the above-mentioned laboratories, closer inspection of these data suggests otherwise. The results for days 2 and 3 from laboratory H3 correlated extremely well within each specimen pool, whereas the results from day 1 were consistently biased low relative to days 2 and 3, suggesting calibration inconsistencies between days. Interestingly, the results from day 1 for laboratory H3 appear to be consistent with those reported for laboratories H1, H2, H6, and H8 [mean (SD) absolute bias of 2.7 (3.8) nmol/L, mean relative bias of 4.6 (4.7)%], whereas the results from days 2 and 3 appear to be biased high. The HPLC laboratories that demonstrated relatively high degrees of imprecision and bias in their assays (H4, H5, and H7) may have issues in terms of quality control and, possibly, nonspecific analyte interferences.
Imprecision in the laboratories using enzymatic methods was generally higher than that observed in the HPLC laboratories, a difference that is most likely a result of fundamental differences in analytical methodologies. All results from laboratory E1 met the inclusion criteria for the consensus mean calculations and the minimum performance standard for relative bias. This was not the case for laboratory E2, although its mean between-day imprecision was quite similar to that observed for laboratory E1. The presence of extreme outlying observations with good reproducibility (Table S1 of the online Data Supplement) suggests that concomitant analyte interferences may exist in the assay used by laboratory E2.
The overall results of this study suggest that quality control and calibration accuracy are the 2 largest issues that laboratories performing vitamin B6 assays face. Regular studies such as this, or the establishment of an external quality assurance program with a wide participation base, would greatly assist laboratories in identifying and resolving issues with their methods. The development of standard reference materials for vitamin B6 would also be a more objective means of assessing method bias. The development and validation of true reference methods for vitamin B6 that are capable of confirming the identities of analytes, such as isotope-dilution liquid chromatography–tandem mass spectrometry, should also be pursued.