UK Prospective Diabetes Study (UKPDS) Group, Diabetes Research Laboratories, Radcliffe Infirmary, Woodstock Road, Oxford OX2 6HE, UK.
1 Department of Clinical Biochemistry, Medical School Buildings, Aberdeen Royal Infirmary, Foresterhill, Aberdeen, AB9 2ZB, UK.
a Author for correspondence. Fax 00-44-1865-723884; e-mail semanley{at}drl.ox.ac.uk
| Abstract |
|---|
| Introduction |
|---|
The procedures we have developed are illustrated with data for glycohemoglobin and plasma triglyceride obtained over 15 years from the UK Prospective Diabetes Study (UKPDS) (2), a randomized intervention trial designed to investigate whether intensive, as opposed to conventional, therapy for glycemic control can reduce morbidity and mortality in patients with type 2 diabetes. This trial was approved by the Central Oxford Research Ethics Committee and fulfills the criteria of the Helsinki Declaration, 1975 and 1983.
| Methods |
|---|
Biochemistry methodology for 15 UKPDS analytes has previously been described. Glycohemoglobin was measured as hemoglobin A1c (Hb A1c) from 1978 to 1984 by isoelectric focusing (IEF), as Hb A1 from 1984 to 1989 by electroendosmosis (EEO) (4), and as Hb A1c from 1989 until 1996 by HPLC with a Bio-Rad Diamat Automated Glycosylated Hemoglobin Analyzer (Bio-Rad Laboratories) (3). Fasting triglyceride was measured between 1980 and 1985 with an enzymatic UV kit on a Pye UNICAM AURA spectrophotometer (AT1 Unicam) and from 1988 until the present with an enzymatic colorimetric kit (GPO-PAP, Boehringer Mannheim) on a Cobas FARA analyzer (Roche Diagnostica) with no correction for free glycerol.
strategy for evaluation of biochemical data
Accepted internal laboratory quality-control and external
quality-assurance procedures have been used throughout the UKPDS trial.
To maintain comparability of data when improved assays were introduced,
formal laboratory comparisons of biochemical methods were undertaken,
and statistical overview techniques were used to detect unforeseen
shifts (Table 1).
QC and quality assurance (QA)
For each laboratory assay run, commercially available QC sera were
measured at low, medium, and high concentrations. From 1986, these
results were entered into an in-house computer program (QSTAT)
that determined whether the results were acceptable by modified
Westgard rules (5). These rules used the deviation of
sequential QC sera results from the mean of 30 run-in measurements for
each QC concentration. A single QC value more than 3 SD from the mean,
or two QC values more than 2 SD from the mean, indicated that reassay
was necessary (Fig. 1).
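The rejection rule described above can be sketched in a few lines of Python. This is an illustration only and not the QSTAT program, whose internals are not described here. The function name `qc_flags` is hypothetical, and reading "two QC values" as two consecutive results for the same QC concentration lying more than 2 SD from the mean on the same side is an assumption taken from the conventional Westgard 2-2s rule.

```python
from statistics import mean, stdev


def qc_flags(run_in, results):
    """Flag QC results that would call for reassay.

    run_in  -- run-in measurements for one QC concentration (the text uses 30)
    results -- sequential QC results for the same concentration
    Returns a list of booleans, True where reassay is indicated.
    """
    centre = mean(run_in)
    sd = stdev(run_in)  # sample SD of the run-in measurements
    z = [(x - centre) / sd for x in results]

    flags = []
    for i, zi in enumerate(z):
        # Rule 1: a single value more than 3 SD from the run-in mean.
        beyond_3sd = abs(zi) > 3
        # Rule 2 (assumed reading): this value and the previous one are
        # both more than 2 SD from the mean, on the same side.
        two_beyond_2sd = i > 0 and (
            (zi > 2 and z[i - 1] > 2) or (zi < -2 and z[i - 1] < -2)
        )
        flags.append(beyond_3sd or two_beyond_2sd)
    return flags
```

In practice the function would be applied separately to the low, medium, and high QC concentrations of each run.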
The central laboratory participated in appropriate external QA schemes. These were available for most analytes in the later stages of the study. Reports from these schemes were inspected to compare the performance of the methods used by the UKPDS laboratory with those of other laboratories and with available reference methods.
introduction of improved analytical methods
Analytical methods were updated during the study as improved
technologies became available. Formal laboratory comparisons of any new
method with the previous one involved the assay of at least 200 samples
in parallel on several days and over a representative range of values.
Descriptive statistics were prepared from the data for each method, and
the results for different methods were compared with appropriate
statistical tests, including paired t-tests, the Mann-Whitney
U-test, or the Wilcoxon signed rank test for differences in
central tendency and the Kolmogorov-Smirnov test for differences between
distributions. Scattergrams (Fig. 2A) and difference plots (Fig. 2B),
as outlined by Bland and Altman (6), were inspected to identify
differences between
methods and to determine whether any differences were related to an
offset or to the concentration of the analyte. An appropriate equation
relating the two methods was calculated: e.g., a linear or quadratic
regression model with a logarithmic or square root transformation of
the data if required. For further statistical analyses this equation
was used to realign the previous data to the current method (Fig. 3).
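A minimal sketch of the comparison steps above, assuming the simplest case of an untransformed linear regression, might look like the following. The function names are hypothetical, and the 1.96 SD limits of agreement follow the usual Bland and Altman convention rather than anything stated in this paper.

```python
from statistics import mean, stdev


def bland_altman(previous, current):
    """Return the mean difference (bias) and 95% limits of agreement
    for paired results from the previous and current methods."""
    diffs = [c - p for p, c in zip(previous, current)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)


def fit_line(previous, current):
    """Ordinary least-squares line: current = slope * previous + intercept."""
    mx, my = mean(previous), mean(current)
    sxx = sum((x - mx) ** 2 for x in previous)
    sxy = sum((x - mx) * (y - my) for x, y in zip(previous, current))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


def realign(values, slope, intercept):
    """Express results from the previous method on the current method's scale."""
    return [slope * v + intercept for v in values]
```

A difference plot is then simply the paired differences plotted against the paired means. If the differences grow with concentration, a transformation or a quadratic term, as mentioned in the text, would be worth considering before fitting.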
maintaining comparability of data with a reference population
Representative data from a suitable population can be used to
confirm the comparability of measurements over time. This approach
assumes that the distribution of biochemical variables measured in
random samples drawn at different times from a large representative
population will not vary substantially, although the population
demographics may nevertheless vary with time (e.g., the change in
population cholesterol that occurred during the MRFIT trial). The
reference population we used consisted of the newly diagnosed patients
entering the study each year. Recruiting a control population alongside
the study population would now be preferable, even though this would
have financial, workload, and recruitment implications. To ensure that
there were no major potentially confounding changes in population
characteristics, such as body weight, the relevant clinical and
biometric data were analyzed. Data from this population for each
analytical method used during the study were compared with data from
the current analytical method, to determine whether there were any
systematic differences between a previous assay method and the current
method. Box-and-whisker plots (Fig. 2C) showing the median
and interquartile range (box), and 10th and 90th centiles (whiskers)
for each analytical method were examined and compared with the median
and 95% confidence interval of the current method. Appropriate
statistical tests were applied to compare each method with the current
method to determine whether there were significant differences between
the methods.
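The summary statistics behind such a box-and-whisker plot can be computed per analytical method as sketched below. The function name `box_summary` is hypothetical, and the centile interpolation is whatever Python's `statistics.quantiles` uses by default, which may differ slightly from the software used in the study.

```python
from statistics import median, quantiles


def box_summary(values):
    """Median, interquartile range, and 10th/90th centiles for one method."""
    deciles = quantiles(values, n=10)   # 9 cut points: 10th ... 90th centile
    quartiles = quantiles(values, n=4)  # 3 cut points: 25th, 50th, 75th
    return {
        "p10": deciles[0],
        "q1": quartiles[0],
        "median": median(values),
        "q3": quartiles[2],
        "p90": deciles[8],
    }
```

Calling `box_summary` once for each method's reference-population data gives the box (q1 to q3) and whiskers (p10 to p90) to set against the median of the current method.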
In large trials, it is necessary to decide whether small but
statistically significant differences are of clinical importance
(7). For Hb A1c and triglycerides, we used the
criteria that realignment is not necessary for differences that do not
reach statistical significance (P >0.01) or for differences between
the medians of <5% (Fig. 4). Such criteria are based on differences
between normal and
pathological populations; assay performance can also be used to assist
in deciding whether it is necessary to realign data from structured
laboratory comparisons of analytical methods.
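The two criteria above can be combined into a single check, sketched here. The function name and arguments are hypothetical, the P value is assumed to come from whichever statistical test was applied, and expressing the median difference relative to the current method's median is an assumption.

```python
def needs_realignment(p_value, median_previous, median_current,
                      alpha=0.01, min_relative_diff=0.05):
    """Return True only when the difference is both statistically
    significant (P <= alpha) and at least 5% of the current median."""
    relative_diff = abs(median_previous - median_current) / abs(median_current)
    return p_value <= alpha and relative_diff >= min_relative_diff
```

For example, a highly significant difference between medians of 7.1 and 7.0 would not trigger realignment, because the difference is under 5% of the current median.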
In short-term trials, it may be sufficient to use Cusum charts
(8) (a plot against time of the daily cumulative sum of
the difference between the measurement and the established mean,
keeping the sign of the difference) to detect changes across time
within an analytical method. The quangle, or quality-control angle,
chart (9) is more suitable for looking at data over a
longer time period (Fig. 5). For a series $X_1 \ldots X_n$, with $T$ the
target and neutral value,
the quangle after $r$ steps can be calculated by taking $a$ as
the length of each segment and the angular scale as the angle
corresponding to 1 unit, such that

[Equation 1]

The angular scale is chosen to be equal to $b/a$ (9).
Changes in the direction of the line (inflexion points) on a quangle plot indicate a change in the mean value of accumulating data. If the mean continually increases, the quangle will continue to change direction. Because the quangle is independent of the choice of a target value, time periods of different behavior are easier to separate than with Cusum methods. The shape of the quangle plot is a useful visual aid where more than one change has occurred over the time period. A quangle plot was prepared for each variable, and the dates of changes of analytical methods were marked on the plot. Laboratory records of QC data on either side of any additional obvious inflexion points were checked for discontinuities. When an unexpected inflexion point was confirmed by a simultaneous change in QC data, the data were divided into two groups at this point and considered as different analytical methods for purposes of adjustment.
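For comparison, the Cusum described earlier (the running sum of signed differences between each measurement and the established mean) can be sketched as follows. The function name `cusum` is hypothetical. The quangle itself is not sketched here because it depends on Equation 1.

```python
from itertools import accumulate


def cusum(values, established_mean):
    """Cumulative sum of signed deviations from the established mean.

    A sustained upward or downward slope in the returned series suggests
    that the running mean has shifted away from the established mean.
    """
    return list(accumulate(v - established_mean for v in values))
```

Plotting the returned series against time gives the Cusum chart. Because every point depends on the chosen mean, a poorly chosen value tilts the whole chart, which is the dependence on a target value that the quangle avoids.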
| Results |
|---|
introduction of improved analytical methods
Measurements of glycohemoglobin by the EEO method (Hb
A1) were compared with measurements of the same
samples by the HPLC method on the Bio-Rad Diamat analyzer (Hb
A1c) in 296 samples across a range of samples from
apparently healthy subjects and diabetic patients (Fig. 2, A and B).
This showed that it was necessary to realign the previous EEO data to
the HPLC method by a linear regression equation:

[Equation 2]
maintaining comparability of clinical and biochemical data
Data were obtained from newly diagnosed patients at the
recruitment and randomization visits between 1977 and 1991. Weight,
height, age, and initial fasting plasma glucose were inspected and
showed no systematic changes over time. Data from these patients were
then used to confirm that other longitudinal data remained comparable.
Data from this reference population for the three different assay
methods for measuring glycohemoglobin (IEF, EEO, and HPLC) are
illustrated in Fig. 2C
with the shaded areas showing data before
realignment to the HPLC method [the median (95% confidence interval)
of the HPLC method is indicated for reference]. These data confirmed
that the formulas obtained from the direct laboratory comparisons were
robust.
A quangle plot of the plasma triglyceride data identified a
discontinuity during 1982 (Fig. 5). Laboratory QC data confirmed a step
change in all three concentrations of QC during 1982 that had
previously been undetected by visual inspection of plots of consecutive
QC values (Levey-Jennings charts). The data from the reference
population were divided at this inflexion point, and an adjustment was
made as if they were different analytical methods. In total, five
inflexion points were identified on the quangle plots for the 15 UKPDS
analytes, only two of which were confirmed by QC records. Where the
inflexion point was confirmed by QC records, the decision to realign the
data for these two analytes was made by the predetermined rules for
statistical significance vs clinical difference. The relevant mathematical
transform was calculated, and a graph of centiles for adjusted data was
plotted against those for the current method to check that the
transformation was valid across the measured range (Fig. 3).
| Discussion |
|---|
In clinical trials, analytical methods need to be updated when improved techniques become available. The alternative would be to constrain a trial to obsolete technology, which would ultimately render the data irrelevant to modern practice and the results unpublishable. Adherence to older methods is not an option because the accuracy and precision of laboratory techniques used in the 1970s are not acceptable in the 1990s. For example, measurement of glycohemoglobin has changed from Hb A1 to Hb A1c with a concomitant reduction in the interassay CV from 10% to 2% and different normal ranges for the analytes (10). Storage of some samples at -70 °C may allow retrospective comparisons, but with large trials not all samples could be reassayed years later. Direct laboratory comparison of samples across the appropriate range is essential when laboratory methods are changed. For the UKPDS, the statistical evaluation procedures described here verified that these comparisons were reliable.
Internal QC and external QA schemes in the laboratory are necessary to monitor the performance of an analytical method. However, currently available external QA schemes do not provide adequate long-term monitoring. QA schemes report the mean ± SD of results from all participating laboratories and also more-specific method means. This norm-referencing is not necessarily stable in the long term as laboratories join and leave the schemes or change their methodologies, so the reported norms and sizes of groups for each method will vary. Existing QA schemes rely on lyophilized nonhuman sera, which may not perform the same as human samples when assayed. However, new external QA schemes (11) aim to compare individual laboratory performance with reference methods for native human blood or serum. These schemes, which are being organized internationally, should allow results from different laboratories and methods to be compared. Networks of reference laboratories are being set up in Europe with guidelines from external assessment to achieve an accuracy-based uniform measurement scheme with traceability to the true value, i.e., the European Reference System for the Medical Laboratory (12).
A primary reference material, which is a mixture of pure Hb A1c and Hb A0, along with a reference method that specifically measures Hb A1c, is being developed by an IFCC working group on calibration of Hb A1c (13). This reference system will in time be used for the primary calibration of routine tests; a common calibrator for Hb A1c has been shown to reduce interlaboratory variation markedly (14). While this system is being developed, the method from the Diabetes Control and Complications Trial (15) will be used as a reference method for international standardization and comparison of results from different methods.
Assurance is needed that measurements carried out 10 years ago are comparable with those carried out today. In the absence of a recognized reference method, one solution is the use of data from a reference population. These data can be taken from measurements in an external population of apparently healthy subjects, or, as described in this paper, an internal population where the distributions of the biochemical characteristics of a large population at entry to the study were assumed to be stable when anthropometric measurements, such as body weight and height, had not changed. One should be aware that this is nevertheless no guarantee of complete stability: Populations change their characteristics with time and age (16). However, a judiciously chosen population will allow comparisons between the study and background demographic change. This procedure allows a longitudinal check that values determined in the early years of the study are not significantly different from those in subsequent years and confirms that no previously undetected anomalies exist in the data.
The quangle plot provides another method for monitoring changes in data, which is probably more useful over long periods than the more commonly used Cusum (8). When examining longitudinal data in a reference population, the quangle can be used to assess comparability of data, and this technique is applicable to either internal or external reference populations. Only two possible discontinuities in the UKPDS data identified by the quangle plot were confirmed by reanalysis of previous QC data. Both occurred early in the study when laboratory QC involved only visual inspection of data (before the advent of computerized Westgard rules) and it is possible that the change in triglyceride measurement in 1982 was caused by a change of reagents.
As the number of samples in a comparison increases, the power of statistical techniques to identify smaller and smaller conventionally statistically significant differences increases. In a large study differences can be identified as statistically significant although they may not be of biological or clinical relevance (17). It is therefore necessary to determine whether any discontinuities detected by quangle from large data sets and confirmed by QC records are of clinical importance to the study. The decision to realign the data should be based on the biological and analytical characteristics of the variable and the nature of the clinical study. This interpretation of clinical laboratory data has already been suggested as the clinical usefulness approach by Petersen et al. (18) along with the assessment of analytical performance and biological within- and between-subject variation.
It is important that laboratory data from clinical trials can be compared with data from methods used in other laboratories. Steps should be taken to use the procedures described in this paper for comparison with national and international standards where available. The main comparative outcomes of long-term trials, such as the UKPDS, are protected from systematic bias by an appropriate experimental design and random allocation to different therapies. However, failure to ensure comparability of data from different analytical methods may mean that substantial and important longitudinal descriptive results remain undetected or that trends over time are not correctly identified. Trial reports should therefore specify what procedures have been used to ensure that clinical and biochemical measurements made over extended periods of time are comparable and, in future, should detail comparisons of laboratory methods with international references.
| Acknowledgments |
|---|
| Footnotes |
|---|
| References |
|---|