Clinical Chemistry 43: 1610-1617, 1997;
© 1997 American Association for Clinical Chemistry, Inc.
Application of the Department of Health and Human Services proposed waived status requirements for in vitro diagnostic testing devices: case study
Sharon S. Ehrmeyer1,a and Ronald H. Laessig2
1 Medical Technology Programs and 2 State Laboratory of Hygiene, Department of Pathology and Laboratory Medicine, University of Wisconsin, Madison, WI 53706.
a Author for correspondence. Fax 608-262-9520; e-mail ehrmeyer{at}facstaff.wisc.edu
Abstract
The CLIA'88 classified all clinical laboratory testing as waived,
moderate, or high complexity. The eight original waived tests were
characterized as simple, accurate, error-free, risk-free, and suitable
for home use by nonlaboratory professionals. The subjective nature of
the classification process was challenged immediately. The Clinical
Laboratory Improvement Advisory Committee asked the CDC and the Health
Care Financing Administration to develop objective criteria that
included assessment of performance by field-test and in-house data. We
examined the efficacy of the CDC protocol with empirical data from the
HemoCue B-Hemoglobin Test System® submission, to assess operator
competency, intra-/interoperator and between-site imprecision, and
accuracy. Non-laboratory-trained operators demonstrated 2–3%
imprecision (40–200 g/L). Accuracy studies yielded a slope of 1.01, an
intercept of 3.53 g/L, and r of 1.00 (52–230 g/L). Results
met the protocol's Tonks' criterion for imprecision (less than
one-fourth of the reference range).
Introduction
US Public Law 100-578 of 1988 (1) (CLIA'88)
resulted in regulations that categorized all test procedures on the
basis of their complexity (2). The legislation exempted
from extensive regulation certain simple procedures, often performed by
nonlaboratory professionals and (or) patients in a home healthcare
setting. Because most of the regulations were waived for these
procedures, the term "waived tests" became part of laboratory
jargon. Through the early drafts of the CLIA regulations, the concept
of waived testing underwent several iterations. The final version of
CLIA'88, published in the February 28, 1992, Federal Register, placed
eight general types of test devices or analytes, including some glucose
and hemoglobin methodologies, into the waived category.
Before implementing CLIA'88, the CDC used a point-scoring system to
classify each in vitro diagnostic test procedure or device into one of
three categories. Approximately 2500 procedures were classified as high
complexity, 7500 as moderate complexity, and 8 as waived. The eight
tests classified initially as waived included those available
over-the-counter (home use), suitable for use by non-laboratory-trained
individuals, and those in which erroneous results posed no risk of
serious harm to the patient. These tests essentially met the initial
definition of waived, meaning free from regulation under CLIA'88. The
waived category, both for the inclusions and exclusions, created
immediate controversy and occupied the discussions of the Clinical
Laboratory Improvement Advisory Committee almost from its initial
meeting (3). Much concern centered on these tests being
used in some point-of-care situations where the exemption from
regulation could be perceived as a substantial benefit. The Clinical
Laboratory Improvement Advisory Committee was concerned with the
subjective nature of the test classification criteria and the
possibility that some tests on the list were not error-free and
that incorrect results would be truly harmful.
Pressure to place tests in the waived category also came from
physicians in office settings, allied health professionals,
health-conscious patients wishing to take responsibility for at least
part of their healthcare, and manufacturers with vested interests in
specific products. The Health Care Financing Administration
(HCFA; footnote 1) and CDC also expressed interest in expanding the waived
category to retain CLIA oversight for all clinical testing
as various members of Congress attempted to repeal CLIA because of its
perceived burdensome requirements, particularly for physician
laboratories (4). In 1993, the CDC distributed revised
draft criteria to be used by manufacturers to demonstrate a product's
suitability for waived classification (5). We and others
helped assess the protocol's viability by submitting to the CDC data
that demonstrated a specific product's compliance. On the basis of
this input and other data, the Department of Health and Human Services
published "Categorization of Waived Tests," CDC's proposed
guidelines for requesting waived status, in the September 13, 1995,
Federal Register (6). This study reports our assessment of
the efficacy of using the CDC protocol to elicit appropriate
information to request waived status. We base the assessment on our use
of the process and the actual data generated, analyzed, and submitted
for approval of the device as a waived device.
Materials and Methods
cdc protocol
Table 1 summarizes key qualitative characteristics for devices to be
placed in the waived category. The CDC protocol further requires
manufacturers to provide operating directions for individuals with
reading and comprehension skills comparable with students in the
seventh grade (ages 12–14 years), the level typical of most newspapers and
magazines prepared for general circulation. In addition, Table 2
summarizes quantitative, field-generated data required from at
least three independent test groups, in nonlaboratory settings, to
demonstrate overall performance, as well as performance between and
among the testing sites. Testers are to rely only on the
manufacturer's written directions, not direct personal coaching.
The protocol's first quantitative requirement is the submission of
data to demonstrate within-site, between-site, and total imprecision in
medically significant concentrations. For the field tests, HemoCue
chose to use 9 sites of 11–24 testers each and had each tester analyze
four concentrations of liquid control materials in duplicate. As a
consequence of the Occupational Safety and Health Act's blood-borne
pathogens rules (7), no patient samples were tested, and
because no actual patient samples or results were involved,
Institutional Review Board approval was not required.
The protocol requires demonstration of accuracy with reference
materials, patient materials, and patient materials containing
interfering substances. The tests may be carried out under controlled,
in-house conditions by trained laboratory personnel. The studies by
HemoCue included use of surplus hospital samples drawn for routine
hemoglobin analysis.
description of hemocue b-hemoglobin test system®
The candidate device for this study, the HemoCue B-Hemoglobin Test
System, is manufactured in Ängelholm, Sweden, and distributed by
HemoCue Inc., Mission Viejo, CA. The system consists of a hand-held
photometer (battery or AC line voltage) and single-use, disposable
cuvettes. The unmeasured sample is introduced into the cuvette by
capillary action. Once a sample (or control fluid) has been introduced
into the cuvette and inserted into the instrument, the hemoglobin value
is displayed 45 s later.
The intended use of the device is to provide reliable hemoglobin
measurements, independent of operator skill, in sites remote from the
centralized hospital laboratory. These include physician offices;
screening at Women, Infants, and Children's Nutritional Programs;
blood donor centers; hospital wards; and, on the basis of a
physician's recommendation, patient self-monitoring.
The device is factory-calibrated. Its performance is verified at
appropriate intervals with a reference cuvette supplied by the
manufacturer. If the reference cuvette value varies by >3.0 g/L from
the target value, the device is considered out-of-tolerance and
unsuitable for use without cleaning and reverification. If cleaning
fails to remedy the problem, the user is instructed to contact the
manufacturer via a toll-free telephone number. At any time, appropriate
conventional liquid controls or calibration verification materials can
be analyzed. While supplemental liquid controls are not required for
waived tests and not recommended for a home-care situation, they can be
incorporated into routine-operation protocols in healthcare settings.
Test and (or) control results, including those from the reference
cuvette, are recorded manually. The Hemoglobin Test System is designed
to be maintenance-free, except for replacing the batteries when needed
and periodically cleaning the cuvette holder. The device performs
internal, electronic self-checks. If a condition is detected that could
compromise the testing process, a code is displayed, the instrument
remains inoperable, and the tester is instructed to call for technical
assistance.
Results
qualitative evaluation criteria
The manufacturer's submission to the CDC supports the protocol's
qualitative criteria (Table 1) by addressing each point in an
explanatory narrative and by including the operator's manual. The HemoCue Reference Guide
(operating instructions) for the B-Hemoglobin Test System was evaluated
by two computerized assessment programs (8) as requiring
grade 8.1 and 6.7 comprehension skills for grammar, sentence structure,
and vocabulary. A group of 12 seventh graders (site 7, Table 4)
participated in the field tests and performed as well as, or
better than, the other eight groups. Each group received the same
written instructions (also included in the CDC submission) and
performed the analysis without further coaching. No trial or repeat
testing was allowed.
outlier rejection
Data from any study, particularly one conducted by
non-laboratory-trained personnel, would be expected to include some
outliers. Although the CDC protocol does not address outliers, we
rejected all data from any operator with one or more missing results
(three instances) or with results reported under the wrong control
(two instances), errors that were obvious on inspection. The remaining
data (from 158 operators) were then evaluated by hemoglobin
concentration and field-test site. The mean ± SD were calculated,
and results exceeding 3 SD from the site mean were rejected. Out of
>600 data pairs, only 2 pairs exceeded 3 SD for their site, and these
also were rejected. All raw data, including the rejected data, were
included in the CDC submission.
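The 3 SD screening step described above can be sketched in a few lines of Python. This is an illustration only: the function name `reject_outliers` and the control results are hypothetical, and the sketch screens single results rather than the duplicate pairs evaluated in the submission.

```python
from statistics import mean, stdev

def reject_outliers(results, k=3.0):
    """Split results into (kept, rejected) using a mean +/- k*SD rule.

    The mean and SD are computed from all results for one control at one
    site, including any candidate outliers (an assumption of this sketch).
    """
    m = mean(results)
    sd = stdev(results)
    kept = [x for x in results if abs(x - m) <= k * sd]
    rejected = [x for x in results if abs(x - m) > k * sd]
    return kept, rejected

# Hypothetical results (g/L) for the lowest control at one site
site_results = [42.0, 41.5, 43.0, 42.5, 41.0, 42.0, 43.5, 41.5, 42.0, 42.5,
                41.0, 43.0, 42.0, 41.5, 42.5, 42.0, 43.0, 41.5, 42.0, 60.0]
kept, rejected = reject_outliers(site_results)
```

With small groups a single extreme value inflates the SD, so a 3 SD rule can only flag results when the site contributes enough data points.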
precision
To quantify performance characteristics, the CDC protocol uses
measurements of imprecision, on the basis of field studies, and
accuracy, on the basis of field and in-house studies. Table 2 lists
requirements for quantitative data to demonstrate within-site,
between-site, and total imprecision at medically significant
(hemoglobin) concentrations. At each site, participants analyzed, in
duplicate, liquid control materials with ~42, ~84, ~198, and
~134 g/L hemoglobin. The protocol does not prescribe a particular data
evaluation approach but requires only that it be statistically valid.
Depending on the design of the study, this could include an analysis of
variance. However, the protocol also lends itself to simplified
statistical calculations, on the basis of mean ± SD, as will be
demonstrated.
operator imprecision
The operator imprecision assessment was based on the mean
(x̄D) ± SD of the difference between
duplicates by site. Table 3 demonstrates that performance characteristics are independent
of site and operator. These data, along with those in Table 6, are the
accuracy measurements specified by the protocol to demonstrate operator
competency.
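As a minimal sketch of this calculation, the mean and SD of the duplicate differences for one site could be computed as follows. The function name `duplicate_difference_stats` and the duplicate pairs are hypothetical and are not taken from the submission.

```python
from statistics import mean, stdev

def duplicate_difference_stats(pairs):
    """Return (mean, SD) of first-minus-second differences between duplicates."""
    diffs = [first - second for first, second in pairs]
    return mean(diffs), stdev(diffs)

# Hypothetical duplicate results (g/L), one pair per operator at one site
pairs = [(42.0, 43.0), (41.0, 41.0), (43.0, 42.0), (42.0, 42.0), (44.0, 42.0)]
mean_diff, sd_diff = duplicate_difference_stats(pairs)
```

A mean difference near zero suggests no systematic drift between the first and second measurement; the SD of the differences reflects operator-level repeatability.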
within-site imprecision
The within-site imprecision was calculated by determining the
mean ± SD, by site, for each of the four controls (Table 4). Each operator contributed two results for each sample. The
within-site CVs ranged from 0.76% [(1.5/195.9)(100)] for the high
control at site 8 to 4.2% for the low-low sample at sites 1 and 4. The
protocol is compatible with the F-test to determine whether
any of the test sites exhibited significantly larger variance. However,
the F-test used with this study's data is potentially
misleading because the large population N value (316 or 318) in the
denominator tends to detect statistically significant, but clinically
irrelevant, small differences in the SDs. Visual inspection of the CVs
at each control indicates no disproportionately large site variances.
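The within-site CV and the variance ratio underlying an F-test can be sketched as below. The helper names and the results for the two sites are hypothetical; the sketch stops at the F statistic and does not compute a P value.

```python
from statistics import mean, stdev, variance

def cv_percent(values):
    """Coefficient of variation in percent: (SD / mean) * 100."""
    return stdev(values) / mean(values) * 100.0

def variance_ratio(values_a, values_b):
    """Larger sample variance divided by the smaller one (the F statistic).

    Assumes both samples have nonzero variance.
    """
    var_a, var_b = variance(values_a), variance(values_b)
    return max(var_a, var_b) / min(var_a, var_b)

# Hypothetical results (g/L) for the high control at two sites
site_a = [196.0, 195.0, 197.0, 196.0, 194.0, 198.0]
site_b = [195.0, 199.0, 193.0, 197.0, 192.0, 200.0]
cvs = {"site A": cv_percent(site_a), "site B": cv_percent(site_b)}
f_stat = variance_ratio(site_a, site_b)
```

Comparing the CVs side by side, as in the visual inspection described above, avoids declaring a site different on the basis of a statistically significant but clinically irrelevant variance ratio.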
between-site imprecision
The protocol requires evaluation of between-site imprecision. In
Table 5, the mean values from each site were used to calculate the
between-site SD and CVs (n = 9) for each of the four samples. The
between-site CVs were 0.8–1.3%. The 95% confidence intervals (CI)
about each mean exhibit large overlap. Only 2 of the 36 site-mean
values exceed the 95% CI, and none exceed the 99% CI. The
t-test of the means seems to be the logical method to test
for site-induced bias. Although the differences in the site-mean values
in some cases (e.g., 43.1 vs 41.9) may be statistically significant,
they are clinically meaningless. The protocol addresses this issue by
allowing discretion as to how the data are submitted.
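A between-site summary of this kind could be sketched as follows. The function name, the site means, and the use of grand mean ± 1.96 SD as the 95% limits are assumptions of this sketch; the text does not state how the CIs in the submission were constructed.

```python
from statistics import mean, stdev

def between_site_summary(site_means, z=1.96):
    """Summarize between-site imprecision from one mean value per site."""
    grand_mean = mean(site_means)
    sd = stdev(site_means)
    low, high = grand_mean - z * sd, grand_mean + z * sd
    return {
        "grand_mean": grand_mean,
        "sd": sd,
        "cv_percent": sd / grand_mean * 100.0,
        "limits": (low, high),
        # Site means falling outside the assumed 95% limits
        "outside": [m for m in site_means if not (low <= m <= high)],
    }

# Hypothetical site means (g/L) for one control, nine sites
site_means = [43.1, 41.9, 42.3, 42.0, 42.6, 41.8, 42.2, 42.4, 42.1]
summary = between_site_summary(site_means)
```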
total imprecision
The final phase of the protocol's field-test requirements focuses
on total imprecision. These data are shown in Table 6. Not unexpectedly, the grand mean for each of the four
concentrations is essentially the same as observed in Tables 4 and 5.
However, because each operator ran duplicates (in the order 1, 2, 3, 4,
1, 2, 3, 4), the absence of improved CVs with the second set of the
duplicate assays suggests consistency in performance, or at least that
the device is reproducible in the hands of inexperienced testers. The
grand CVs were 2.86–1.96% at 44–198 g/L hemoglobin.
The field-test data rely on imprecision measurements to demonstrate the
lack of excessive tester–experience–site–day–instrument influences.
The CDC's protocol includes one quantitative imprecision (i.e.,
performance) specification on the basis of the reference range (RR). It
specifies that the total imprecision, which includes the results of the
precision studies described (between operator, within-site,
between-site), be less than one-fourth the RR. The protocol thus
suggests use of an adaptation of Tonks' formula to evaluate the
magnitude of the imprecision (9). Used in this manner, the
Tonks' approach defines allowable error, expressed as a CV: [(1/4
RR)(1/midpoint RR)(100)] = allowable CV. When values are substituted
for males and an average textbook reference range (130–180 g/L) is
used, the CV is 8.1%, or [(180 − 130 g/L)(1/4)(1/155 g/L)(100)]. For
females the allowable CV is 8.3%. The largest (worst case) observed
intrasite CV (site 9, Table 4) is approximately one-half of these
values, and the grand CVs (Table 6) are considerably less.
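The adapted Tonks' calculation can be checked with a short Python sketch. The function name `tonks_allowable_cv` is hypothetical; the reference range is the 130–180 g/L textbook range for males quoted above.

```python
def tonks_allowable_cv(rr_low, rr_high):
    """Allowable CV (%) = (1/4 of the reference range) / (midpoint of range) * 100."""
    quarter_range = (rr_high - rr_low) / 4.0
    midpoint = (rr_high + rr_low) / 2.0
    return quarter_range / midpoint * 100.0

# Male reference range from the text: (50 / 4) / 155 * 100, about 8.1%
male_allowable_cv = tonks_allowable_cv(130.0, 180.0)
```

Because the allowable CV depends only on the width and midpoint of the reference range, the same function applies to any analyte for which a reference range is supplied.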
accuracy
The CDC protocol implies that the major accuracy studies may be
done in-house by the manufacturer. Ten pools ranging from 52 to 231 g/L
were analyzed 30 times each by HemoCue and by the International
Committee for Standardization in Hematology (ICSH) hemoglobin Reference
Method (Table 7). The protocol for demonstrating accuracy is open to several
analytical approaches: t-tests, regression analysis, or
simple inspection. The HemoCue results, by inspection, are not
substantially different from those obtained with the ICSH Reference
Method. The 95% CIs about the respective means show substantial
overlap. In the worst case, the low range, the HemoCue B-Hemoglobin
Test System is 2.0–3.0 g/L low at values near 54 g/L. Because the
device actually displays results in g/dL, reads only to the nearest
tenth, and truncates (i.e., does not round) the result, this could
account for one-half of the error. By least-squares analysis, HemoCue
vs ICSH, the slope is 1.01, the intercept 3.54 g/L, and the
correlation coefficient (r) is 1.00 (footnote 2).
The total imprecision data in Table 6, collected by non-laboratory-trained
operators in the field studies, also demonstrate accuracy. Because the
four concentrations of control used in the studies had values assigned
by the Reference Method, comparison with the mean values for each
concentration demonstrates accuracy and verifies the validity of the
device's calibration and stability over the study duration.
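For readers who wish to reproduce this type of comparison, an ordinary least-squares fit with a correlation coefficient can be sketched as below. The function name `least_squares` and the paired pool means are hypothetical and are not the Table 7 data.

```python
from math import sqrt

def least_squares(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical pool means (g/L): reference method (x) vs candidate device (y)
reference = [52.0, 75.0, 101.0, 128.0, 150.0, 176.0, 203.0, 231.0]
candidate = [50.0, 74.0, 100.0, 128.0, 151.0, 176.0, 204.0, 232.0]
slope, intercept, r = least_squares(reference, candidate)
```

Ordinary least squares treats the reference values as error-free, which is one reason confidence intervals for slope and intercept, or an alternative such as Deming regression, are often reported alongside the fit.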
patient comparisons
The CDC protocol simply requires that the manufacturer
demonstrates accuracy with patient comparisons, but it does not mandate
a specific experimental protocol or particular statistical analysis.
For patient comparisons, the manufacturer selected three patient
groups: A, patients undergoing known prescription drug therapy; B,
hospitalized patients, drug use unknown; and C, blood donors. The data
are summarized in Table 8. By inspection, comparisons of the mean ± SD for the
candidate and Reference Method and the regression data indicate no
clinically relevant differences for these three populations. The
regression data, group C especially, are very narrowly clustered
between 113 and 177 g/L. In the current study, the linear least-squares
calculation, as reported in the submission, indicated no apparent bias
between methods. However, the protocol does not preclude the use of
other methods, ranging from inspection, to Deming regression, to
Bland–Altman analysis. The submitter is free to make the appropriate
choice.
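As one of the alternatives mentioned, a Bland–Altman summary (mean difference and 95% limits of agreement) can be sketched as follows. This analysis was not part of the submission; the function name and the paired results are hypothetical.

```python
from statistics import mean, stdev

def bland_altman(candidate, reference, z=1.96):
    """Return (bias, (lower, upper)) for paired candidate/reference results.

    bias is the mean candidate-minus-reference difference; the limits of
    agreement are bias +/- z * SD of the differences.
    """
    diffs = [c - r for c, r in zip(candidate, reference)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return bias, (bias - z * sd, bias + z * sd)

# Hypothetical paired patient results (g/L)
candidate = [113.0, 125.0, 138.0, 150.0, 161.0, 177.0]
reference = [114.0, 124.0, 139.0, 150.0, 160.0, 178.0]
bias, limits = bland_altman(candidate, reference)
```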
Table 8. Accuracy information from in-house studies: comparison of
hemoglobin concentrations measured in three patient groups.
interference studies
The CDC protocol requests data on commonly interfering substances.
Interferences were evaluated by analyzing selected samples of fresh,
human whole blood by the HemoCue and ICSH methods. The studies
included: carboxyhemoglobin in supplemented samples and selected
patient samples with leukocytes >250 × 10^9/L,
or with IgG >15 g/L, and (or) with cholesterol >8.0 mmol/L. In each
case, multiple aliquots were analyzed multiple times. At 50 g/L
carboxyhemoglobin, a bias of 1% was introduced in the HemoCue and
2.7% in the ICSH measurements, when assessed against nonsupplemented
samples. The other interferents introduced the following biases, as
measured with split samples: leukocytes (0.89%), IgG (0.04%), and
cholesterol (0.02%).
Discussion
The CDC protocol consists of three basic parts: (a)
qualitative evaluations of ancillary materials such as directions and
operating procedures to illustrate the ruggedness and ease of use of
the device by non-laboratory-trained persons with seventh grade reading
comprehension, (b) demonstrations of performance in the
field by non-laboratory-trained operators typical of potential users
with quantitative evaluation criteria focused primarily on imprecision
measurements, and (c) evidence of accuracy based on in-house
studies, patient specimens, and skilled operators.
A major CDC qualitative evaluation criterion is: "(the device)
contains fail-safe mechanisms that render no result when the test
system malfunctions and initiate fail-safe mechanisms rendering no test
result when the result is outside the reportable range." In the
submission, the manufacturer addresses this criterion by providing
directions, manuals, and an explanation of the design specifications.
The HemoCue device does not have absolute lock-out capability. The
device, however, displays error messages when malfunctions are
detected. The tester is instructed not to use the HemoCue system until
the situation is corrected. The tester is expected to run the control
cuvette on a daily basis, and if the result is out of numerical
tolerance, take appropriate action that is limited to cleaning the
cuvette holder or calling the manufacturer. Clearly this is not
absolutely fail-safe, in that operators could ignore the error messages
or fail to follow the cleaning procedure. As with the rest of the CDC
protocol, we conclude that the fail-safe criterion is open to
interpretation.
We support the concept of engineering absolutely fail-safe systems, but
feel strongly that some reliance on the tester is an accepted part of
healthcare, including the home-care setting. Moreover, for home use,
aggressive, substantive treatment resulting from an abnormal hemoglobin
value cannot begin without a physician's involvement. When trained
professionals use the device, they are expected to follow the
manufacturer's directions and any additional institutional policies
that may include the analysis of hemoglobin controls.
The CDC's protocol-mandated, field-test studies rely primarily on
imprecision data. This strategy is insightful in that it is focused on
the testing process conducted by nonlaboratory operators. However, the
precision protocol seems to be modeled after those used in the 1970s to
describe the performance of continuous-flow methods, specifying
evaluation of within-run, between-run, between-day (including
between-operator), and total imprecision. By contrast, today's testing
devices are precalibrated and use disposable test cuvettes limited to
one sample. The mandate to demonstrate the ruggedness and ease of use
of the device through multiple sample concentrations, testers, sites,
and duplicate results affords evaluators the opportunity to demonstrate
the range of the device's capabilities. The protocol leaves open the
choice of statistical methods, ranging from inspection and mean ± SD to
t-tests, F-tests, and even analysis of
variance. The protocol's total imprecision requirement seems
to suggest combining the CVs from the three types of imprecision
measurements. The worst case would be expected with the lowest control.
The combined imprecision obtained by adding CVs as squares and taking
the square root is 4.8%, well within the suggested tolerance, i.e.,
one-fourth RR or Tonks' performance criterion. The imprecision-based
field studies, focusing on tester, site, and internal duplicates, also
address the critical qualitative areas, i.e., comprehension of
directions, skill required to manipulate the device, and fundamental
reproducibility from site-to-site, tester-to-tester, and over time (a
calibration/accuracy issue).
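Combining CVs "as squares" is a root-sum-of-squares calculation, sketched below. The function name `combined_cv` is hypothetical, and the three component CVs are illustrative values, not the components behind the 4.8% figure, which are not itemized in the text.

```python
from math import sqrt

def combined_cv(component_cvs):
    """Combine independent CV components (%) as the root of the sum of squares."""
    return sqrt(sum(cv ** 2 for cv in component_cvs))

# Hypothetical operator, within-site, and between-site CVs (%) for one control
total_cv = combined_cv([2.0, 3.5, 1.2])

# Compare with the allowable CV of about 8.1% derived from the male reference range
meets_criterion = total_cv < 8.1
```

The calculation assumes the components are independent; if they overlap (for example, if within-site variation already contains operator variation), the combined value overstates total imprecision.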
The protocol's accuracy requirements allow submission of data
generated in-house, under controlled conditions and by skilled
operators. The field-test data (Table 6) also address calibration
accuracy because the Reference Method values for each pool are known.
Focusing first on the field-test sites, the protocol seems to lead to
two conclusions. First, because non-laboratory-trained operators were able to
achieve the correct answers on pooled material, one may infer that the
information provided in the written directions was adequate. Second,
the intrasite means and CVs demonstrate that patients would be
adequately served in terms of both accuracy and imprecision. Citing the
worst possible case, a patient with a value near the lowest control
(41.9 g/L hemoglobin) would be expected to fall within the range
41.9 ± 2.4 g/L 95% of the time, and 41.9 ± 3.6 g/L 99.7%
of the time. Similar arguments on the basis of clinical relevance could
be made for the other concentrations.
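The quoted limits correspond to 2 SD and 3 SD about the mean. In the sketch below, the SD of about 1.2 g/L is an inference from those limits (it also matches a CV of 2.86% at 41.9 g/L), not a value stated explicitly in the text; the function name `expected_range` is hypothetical.

```python
def expected_range(mean_value, sd, k):
    """Return (low, high) limits at mean +/- k * SD."""
    return mean_value - k * sd, mean_value + k * sd

lowest_control_mean = 41.9                      # g/L, from the text
inferred_sd = lowest_control_mean * 2.86 / 100  # about 1.2 g/L (inferred)

range_95 = expected_range(lowest_control_mean, 1.2, 2)   # 41.9 +/- 2.4 g/L
range_997 = expected_range(lowest_control_mean, 1.2, 3)  # 41.9 +/- 3.6 g/L
```

The 95% and 99.7% coverage figures assume approximately normally distributed results.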
The in-house data document fundamental accuracy, well within clinical
needs, by comparing results from a range of pooled specimens and
patient samples with the Reference Method. These controlled experiments
allow for statistical techniques such as regression analysis to
quantify accuracy. The protocol also mandates use of a version of
Tonks' formula to evaluate accuracy. Tonks originally proposed 1/4 of
the normal range as a rule of thumb for assessing interlaboratory
accuracy. However, the protocol's use of the 1/4 RR criterion, across
all potential analytes and without consideration of clinical relevance
or medical need, especially in the settings where the device is being
used, seems questionable at first glance. In the case of the HemoCue
B-Hemoglobin Test System, the criterion is met and appears to be
clinically relevant; at least the criterion is not overly permissive or
restrictive. We feel, however, that future submissions to the CDC
supporting a manufacturer's request for waived status on a particular
analyte should address the issue of medical need and clinical reliance
in the intended-use setting. Depending on the analyte and the
situation, 1/4 RR may be inappropriately large or small.
We conclude that the CDC protocol describes a feasible
approach that sets a reasonable measure for the industry when data are
submitted to support a request for waived status. We are particularly
cognizant that the protocol allows evaluators considerable latitude in
deciding what data to present, how to interpret them, including which
statistical procedures to use, and what constitutes acceptable
performance. Much of the data incorporated into the FDA's 510(k) and
premarket approval process can be used to develop the submission on the
basis of the CDC protocol.
Acknowledgments
We were paid consultants to HemoCue. Our assignment was to take the
raw data from the field studies and the in-house data submitted to the
FDA under the 510(k) process and frame it in the context of the CDC
protocol, thereby creating a submission that was used by the CDC to
grant waived status.
Footnotes
1 Nonstandard abbreviations: HCFA, Health Care Financing Administration; CI, confidence interval; RR, reference range; ICSH, International Committee for Standardization in Hematology. 
2 We recognize that linear regression may overestimate the intercept and that CIs for slope and intercept are accepted practice. The CDC protocol does not include this requirement, and the submission included only the information given above. Because the slope and intercept variation from the ideal case are not medically significant, the data are adequate for the intended use. 
References
1. Public Law No. 100-578, 100th Congress, CLIA 1988. Oct 31, 1988;2903.
2. US Department of Health and Human Services. Medicare, Medicaid and CLIA programs: regulations implementing the Clinical Laboratory Improvement Amendments of 1988 (CLIA). Final rule. Fed Regist 1992;57:7002-186.
3. Schwartz M. Report on criteria for waiver, Clinical Laboratory Improvement Advisory Committee. Washington, DC: US Department of Health and Human Services, Aug 12, 1993:5-8.
4. Auxter S. POL exemption debate to begin on Capitol Hill. Clin Lab News 1995;21:1, 12.
5. Schwartz M. Protocol for requesting waived status, Clinical Laboratory Improvement Advisory Committee. Washington, DC: US Department of Health and Human Services, Aug 12, 1993:3.
6. US Department of Health and Human Services. CLIA Program: categorization of waived tests. Fed Regist 1995;60:47534-43.
7. US Department of Labor, Occupational Safety and Health Administration. Occupational exposure to blood borne pathogens. Final rule. Fed Regist 1991;56:64178-82.
8. Readability statistics. Microsoft® WORD (version 6.0). Redmond, WA: Microsoft Corp.
9. Tonks DB. A study of the accuracy and precision of clinical chemistry determination in 170 Canadian laboratories. Clin Chem 1963;9:217-233.