Technical Briefs
1 Department of Mathematical Sciences, University of Cincinnati, Cincinnati, OH 45221-0025,
2 Department of Pathology and Laboratory Medicine, University of Cincinnati, Cincinnati, OH 45267-0714
a Author for correspondence: fax 513-556-3417, e-mail paul.horn{at}uc.edu
The clinical chemist desiring to set up appropriate reference intervals must consider race or ethnicity as a possible factor on which to subdivide an interval. The effect of ethnicity on various analyte values has not been systematically studied (1)(2). The advent of National Health and Nutrition Examination Survey (NHANES) data (3) has made it possible to determine the effect of ethnicity on reference interval estimation and whether separate reference intervals should be derived for each ethnic group.
The Third National Health and Nutrition Examination Survey (NHANES III) contains data for 33 994 persons ages 2 months and older who participated in the survey. Clinical chemistry measurements were made on several analytes, and health status was determined by a physician. In this report, we use only individuals with the "Excellent" rating of health status, as recommended by the IFCC (4) and NCCLS (5).
The issue of whether to use separate reference intervals has been addressed by Harris and Boyd (6). Their recommendation, which is used by the NCCLS (5), is that two groups should be combined into a single group unless the differences between their means and/or SDs exceed appropriate predetermined thresholds. Treating the known healthy individuals as a "gold standard", we examine the effect of adding ethnicity on these estimators. We used a previously described outlier detection scheme (7) to reduce the effect of atypical observations on these estimators.
The NHANES population was obtained from NHANES 3x Ver. 1.21 (3). The data are the result of a complex survey design involving stratification and clustering that yielded individual weights for each observation. However, for our purposes in this report, we will treat the individuals as coming from a random sample, i.e., individual weights will be ignored. Analyses involving the individual weights will be explored in future work.
We used the variable DMARETHN (Race-ethnicity) to analyze the following three groups: non-Hispanic white (NHW), non-Hispanic black (NHB), and Mexican-American (MA). The variable HSSEX was used to define sex and HSAGEIR to define age at interview. The health status was derived from the variable DMAPEP 13A (physician's impression). Any individual with a missing value in any of these variables was dropped from the analyses. Our analyses were based on individuals 20 years of age and older. The sample sizes of the groups were as follows: for males, 1032, 734, and 1000 for NHW, NHB, and MA, respectively; for females, 1279, 842, and 909 for NHW, NHB, and MA, respectively. The following 33 analytes were examined: albumin, alanine aminotransferase (ALT), alkaline phosphatase (AP), aspartate aminotransferase (AST), blood urea nitrogen (BUN), calcium, chloride, creatinine (CR), urinary creatinine (CR_URIN), γ-glutamyltransferase (GAMMAGT), glucose, granulocytes (GRAN), hematocrit (HEMOCRIT), hemoglobin (HEMOGLOB), potassium (K), lactate dehydrogenase (LDH), lymphocytes (LYMPH), mean corpuscular hemoglobin (MCH), mean corpuscular hemoglobin concentration (MCHCON), mean corpuscular volume (MCV), monocytes (MONO), mean platelet volume (MPV), serum osmolality (OSMO), platelet count (PLATELET), phosphate (PO4), red blood cell count (RBC), red blood cell distribution width (RBCDW), sodium, total bilirubin (TBILIRUB), total serum carbon dioxide (TCO2), total serum protein (TPROTEIN), uric acid (URIC), and white blood cell count (WBC).
Outliers in the NHANES III data set were removed using the method described by Horn et al. (7). Briefly, this method first transforms the data by use of the Box-Cox power transformation, $(x^{\lambda} - 1)/\lambda$, where x is an original observation. The outlier cutoffs are then computed for the transformed data by use of Tukey inner fences. If the transformed data come from a gaussian population, then this technique will identify approximately 0.7% of the data as outliers (7). The parameter λ was also examined because its value indicates the nature of the underlying distribution of the analyte. For example, if λ = 0, then the natural logarithm was used as the transformation, and therefore the original data were log-normally distributed. Clearly, if λ = 1, then the data are left untransformed. Outlier detection was applied in conjunction with either a traditional nonparametric or a robust estimator of the reference interval, using a previously described algorithm (7)(8)(9).
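The transform-then-fence step can be sketched in a few lines of Python. This is a minimal illustration, not the algorithm of Horn et al. (7): it assumes λ is already known (the published method estimates it from the data), it assumes all observations are positive, and it assumes the usual inner-fence definition of 1.5 interquartile ranges beyond the quartiles. The function names are hypothetical.

```python
import math
import statistics


def box_cox(x, lam):
    """Box-Cox power transform of one positive observation.

    Uses the natural log when lam == 0, otherwise (x**lam - 1) / lam.
    """
    if x <= 0:
        raise ValueError("Box-Cox requires positive observations")
    return math.log(x) if lam == 0 else (x ** lam - 1.0) / lam


def tukey_inner_fences(values):
    """Return (lower, upper) inner fences: quartiles -/+ 1.5 * IQR (assumed definition)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def remove_outliers(data, lam):
    """Keep the original observations whose transformed values lie within the fences."""
    transformed = [box_cox(x, lam) for x in data]
    low, high = tukey_inner_fences(transformed)
    return [x for x, t in zip(data, transformed) if low <= t <= high]
```

The fences are computed on the transformed scale, but the returned values are the original observations, so a reference interval can then be estimated in the original units.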
The Harris and Boyd (6) method was used to determine whether to combine samples in the computation of a reference interval. This method uses the following test statistic (5):
$$z = \frac{\bar{x}_1 - \bar{x}_2}{\left[(s_1^2/n_1) + (s_2^2/n_2)\right]^{1/2}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the sample means of the two subgroups, $s_1^2$ and $s_2^2$ are the sample variances, and $n_1$ and $n_2$ are the sample sizes in each subclass, respectively. If there are at least 60 individuals in each group, the z-test above is essentially a nonparametric test (because of the Central Limit Theorem) and may be applied to the original data regardless of whether the values conform to a gaussian distribution. The groups are not combined if the test statistic, z, is too large, or equivalently, if $z^* = z/[(n_1 + n_2)/240]^{1/2}$ is greater than k. Suggested values for k are 3 or 5 (10). [For gaussian populations, it is also suggested that groups not be combined if $\max(s_1, s_2)/|s_2 - s_1|$ is <3, i.e., if the SDs are significantly different (5).]
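As a hedged sketch, the statistic and its sample-size-adjusted form z* can be computed from summary statistics as follows. The function names are hypothetical, and taking the absolute difference of the means (so that the order of the groups does not matter) is an assumption of this sketch.

```python
import math


def harris_boyd_z(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample z statistic from group means, SDs, and sample sizes."""
    return abs(mean1 - mean2) / math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)


def harris_boyd_z_star(z, n1, n2):
    """Rescale z by the combined sample size relative to 240: z* = z / sqrt((n1 + n2) / 240)."""
    return z / math.sqrt((n1 + n2) / 240.0)


def should_partition(z_star, k=3.0):
    """Groups are kept separate when z* exceeds the threshold k (3 or 5 in the text)."""
    return z_star > k


def sds_differ(sd1, sd2):
    """SD criterion for gaussian data: max(s1, s2) / |s2 - s1| < 3."""
    if sd1 == sd2:
        return False  # identical SDs: the ratio is unbounded, so no difference is flagged
    return max(sd1, sd2) / abs(sd2 - sd1) < 3.0
```

For two hypothetical groups of 120 individuals each, with means 10 and 11 and a common SD of 2, z and z* coincide (because n1 + n2 = 240), so the same value is compared directly with k.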
Shown in Table 1 of the data supplement (available with the online version of this Technical Brief at www.clinchem.org/content/vol48/issue10/) are the 95% reference intervals for males and females for the 33 analytes separately for each of the three ethnic groups as well as the combined group. Note that the robust and nonparametric reference intervals are, in most cases, very similar. When there are differences, the robust upper endpoint is smaller than that of the nonparametric. The values of λ for the Box-Cox-Tukey outlier detection, as described by Horn et al. (7), and c for the case λ = 0 are also given. The different values of λ are noteworthy and reflect the varied distributions of analytical values. That is, one cannot simply look at the data as normal (λ = 1) or log-normal (λ = 0) in general. Although these underlying distributions appear nongaussian, we use the Harris and Boyd test for separation as an approximate guide.
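For orientation, a nonparametric 95% reference interval can be sketched as the 2.5th and 97.5th percentiles of the outlier-screened values. The example below assumes the common rank-based convention in which the percentile p sits at rank p(n + 1), with linear interpolation between neighboring order statistics; it does not reproduce the robust estimator or the exact algorithm used for the supplement tables, and the function name is hypothetical.

```python
import statistics


def nonparametric_reference_interval(values):
    """Return (2.5th, 97.5th) percentiles using the rank p * (n + 1) convention.

    statistics.quantiles with n=40 yields cut points at multiples of 2.5%,
    so the first and last cut points bound the central 95% of the data.
    """
    cuts = statistics.quantiles(values, n=40, method="exclusive")
    return cuts[0], cuts[-1]
```

With 119 observations, the ranks 0.025 × 120 = 3 and 0.975 × 120 = 117 are whole numbers, so the interval endpoints are simply the 3rd and 117th ordered values.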
The z* values by analyte for the Harris and Boyd separation test for pairs of ethnic groups are listed in Table 1. One problem of testing pairs of groups is that inconsistencies may arise as the result of intransitivity: the test may suggest that NHW and NHB be partitioned but that MA be combined with both NHW and NHB. Table 1 shows that for males and k = 3, the three groups could be combined for 15 analytes, two of the three combined for 9 analytes, none combined for 1 analyte (creatinine), and that intransitivity occurred for 8 analytes. For k = 5, the numbers of analytes for these four cases were 29, 2, 0, and 2, respectively. For females, the patterns were similar although the analytes were different, especially for k = 3.
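The four outcomes described above follow from counting how many of the three pairwise tests exceed k. The sketch below is one way to express that bookkeeping; the function name, the outcome labels, and the z* values in the usage note are hypothetical and are not taken from Table 1.

```python
def classify_three_groups(z_star_by_pair, k):
    """Classify three groups from their three pairwise z* values.

    z_star_by_pair maps a pair of group labels, e.g. ("NHW", "NHB"), to z*.
    A pair is partitioned when its z* exceeds k.
    """
    if len(z_star_by_pair) != 3:
        raise ValueError("expected exactly three pairwise comparisons")
    partitioned = sum(1 for z in z_star_by_pair.values() if z > k)
    if partitioned == 0:
        return "combine all three"
    if partitioned == 3:
        return "separate all three"
    if partitioned == 2:
        # One pair is compatible and the third group differs from both members.
        return "combine two of three"
    # Exactly one pair is partitioned while the third group matches both members.
    return "intransitive"
```

For example, with hypothetical z* values of 4.2 for NHW versus NHB, 1.0 for NHW versus MA, and 2.0 for NHB versus MA, the result is "intransitive" at k = 3 and "combine all three" at k = 5.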
We have used our previously described methods of determining reference intervals that use outlier detection with either nonparametric or robust estimators of the endpoints. The final values presented in this study are medically conservative in that they will tend to reduce false negatives. Tables 1 and 2 in the data supplement give reasonable estimates of the reference intervals for the different genders and ethnic groups. From these data, the Harris and Boyd separation criteria (10) indicate that some degree of separation of the ethnic groups is necessary (Table 1 in the text).
This study showed that the Box-Cox λ parameter could vary among ethnic groups for the same analyte. However, in the vast majority of cases the values were similar, implying that pooling of the data is possible. Values of λ from -0.1 to 0.1 were set equal to 0, indicating the appropriateness of a log transformation. This was found to be the case in very few of the analytes. Similarly, there were few cases, <25%, where no transformation (λ = 1) was indicated.
In conclusion, taking advantage of the large numbers of individuals evaluated in the NHANES III survey and our previously described analytical procedures, we have shown that use of the Harris and Boyd criteria (10) indicates that separate reference intervals are warranted among the three ethnic groups considered. One possible guideline is to separate or combine the groups based on whether z* is >5 and, if intransitivity exists, base the decisions on whether z* is >3.
References