Molecular Diagnostics and Genetics
1 Gene Express, Inc., Toledo, OH.
2 Pfizer Global Research and Development, Ann Arbor, MI.
3 Innovative Analytics, Inc., Kalamazoo, MI.
4 Division of Pulmonary and Critical Care Medicine, Departments of Medicine and Pathology, University of Toledo Health Sciences Campus, Toledo, OH.
5 Asuragen, Austin, TX.
6 Radiant Research, Lincoln, NE.
aAddress correspondence to this author at: Division of Pulmonary and Critical Care Medicine, Department of Medicine, Rm. 0012, Ruppert Health Bldg., University of Toledo Health Sciences Campus, 3000 Arlington Ave., Toledo, OH 43614. Fax 419-383-6244; e-mail: James.Willey2{at}utoledo.edu.
| Abstract |
|---|
Methods: For each of 15 healthy volunteers, 6 blood samples were obtained, including 3 samples at each of 2 separate visits. Total variation in TA for each gene was partitioned into replicate, sample, visit, study participant, and residual components.
Results: Variation originating from technical processing was <5% of total combined variation and was primarily preanalytical. Interindividual biological sample variation was larger than technical variation. For 12 of 19 tests, the distribution of measured values was gaussian (Shapiro–Wilk test).
Conclusion: For control or diseased population groups with variation rates as low as those observed in this control group, 17 individuals per group would be required to detect a 1 SD change with 80% power with a 2-sided α = 0.05 statistical test for mean differences.
| Introduction |
|---|
Analytical performance characteristics that must be optimized for such technologies have been detailed by the Food and Drug Administration and the CDC (8)(9)(10)(11) and include analytical sensitivity and assay limits, imprecision, analytical specificity (selectivity), interference, and QC. Standardized reverse transcription PCR (StaRT-PCR™) was developed with the goal of optimizing TA measurement with respect to each of these criteria. Success in this effort was documented in the recent MicroArray Quality Control Consortium (MAQC) project, sponsored by the Food and Drug Administration (12). Specifically, StaRT-PCR assays had an optimal lower detection threshold (<10 molecules/assay) and signal-to-analyte response (100%), high precision (mean CV across all genes was 6%, and for 6000 molecules it was 3.2%), and linear dynamic range (>6 orders of magnitude, the full range of TA in the MAQC samples) (12). The combination of high signal-to-analyte response and high reproducibility routinely enabled the detection of differences as small as 20%. Specificity of StaRT-PCR is ensured through quality-controlled design and production of reagents, and cross-contamination is minimized by use of Good Laboratory Practices procedures (13).
The performance characteristics achieved in the MAQC study were due to the presence of internal standards (ISs) within a Standardized Mixture of Internal Standards (SMIS™) in every measurement. Use of the SMIS controls for all known sources of variation, including intersample variation in loading due to pipetting, interfering substances such as PCR inhibitors, and intergene variation in amplification efficiency. The presence of an IS in each measurement controls for analytical false positives and false negatives. Recent reports have described the successful use of StaRT-PCR to measure the TA values of several promising biomarkers in samples of blood (14) or other tissues (15)(16)(17)(18).
Development of a drug or clinical diagnostic test requires that variation in the technical process and biological variation within test groups both be sufficiently limited to ensure a cost-to-benefit advantage for the proposed test. In this study, to explore multiple potential sources of variation, triplicate TA measurements were conducted on triplicate blood samples obtained at each of 2 visits from 15 study participants, and each detected source of variance was then quantified.
| Materials and Methods |
|---|
collection of peripheral blood samples
The sample collection protocol can be found in the Study Operations Manual (see online Data Supplement B). Venipuncture to obtain blood samples was performed on each study participant on 2 separate visits (Radiant Research). Blood was collected directly into PreAnalytiX PAXgene™ Blood RNA Tubes (Qiagen) according to the manufacturer's recommendations. On each visit, 12 tubes of blood (2.5 mL each) were sequentially collected into serially numbered tubes, mixed thoroughly, incubated for 2 h at room temperature, and then stored at −80 °C. PAXgene blood tubes were then shipped on dry ice to Asuragen via express courier service. Before collection in the PAXgene tubes for RNA extraction, 2 additional tubes of blood were collected at each visit and sent for serum chemistry and hematology analysis.
isolation and quality assessment of total rna
Total RNA isolations were carried out by Asuragen with the PAXgene Blood RNA reagent set (Qiagen) protocol with a modification to the DNase I treatment. In brief, instead of the on-column DNase I treatment, the RNA was eluted with nuclease-free water from the PAXgene Blood RNA System spin columns and subjected to a DNase I treatment in solution. RNA was subsequently purified by organic extraction. After individual tubes of blood were extracted, the RNA was combined into 3 groups of 4 tubes; tubes 1 through 4 became sample A, tubes 5 through 8 became sample B, and tubes 9 through 12 became sample C. This procedure enabled analysis of triplicate samples (each replicate sample was processed separately for collection, RNA extraction, and reverse transcription) from each individual for each visit. Any residual DNA contamination was determined by PCR with a TBP TaqMan® assay. The quality of the RNA was spectrophotometrically assessed by determining the A260:A280 ratio and by determining the RIN (RNA integrity number) on a 2100 Bioanalyzer (Agilent). Total RNA samples were checked for residual nuclease contamination and RNA stability by incubating an aliquot of each sample overnight at 37 °C and comparing the Agilent 2100 Bioanalyzer RNA electropherogram with that obtained from the freshly analyzed sample. After extraction and quality assessment, all RNA samples were shipped frozen on dry ice to Gene Express via express courier service.
reverse transcription
Total RNA was reverse transcribed with Moloney murine leukemia virus reverse transcriptase, deoxynucleoside triphosphates, buffer, and oligo(dT) primers according to the manufacturer's protocol (Invitrogen) and as described previously (15)(16).
START-PCR analysis
TA for each of 19 genes (see Table 1 in the online Data Supplement; gene names are given in footnote 2) was determined by StaRT-PCR (Gene Express) at the Standardized Expression Measurement Center™ under standard operating procedures (12)(13)(15)(16).
In each assay the TA of a gene was measured relative to its respective IS within the SMIS. Analysis included 3 steps: calibration, range finding, and high-throughput analysis. In the calibration step each cDNA sample was diluted so that the β-actin (ACTB) cDNA concentration (600 000 molecules of ACTB transcript/µL) was equivalent to the ACTB IS concentration in the SMIS (600 000 ACTB IS molecules/µL).
For each subsequent StaRT-PCR experiment a master mixture was prepared containing appropriate amounts of SMIS, calibrated cDNA sample (600 000 molecules of ACTB transcript/µL), and all PCR reagents except for primers; 18-µL aliquots of the mixture were transferred into 96-well plates containing 2 µL of gene-specific forward and reverse primers. PCRs were preheated (95 °C, 5 min) then amplified for 35 cycles (94 °C, 30 s; 58 °C, 45 s; 72 °C, 120 s). The Caliper AMS 90 automated, microfluidic capillary electrophoresis instrument (Caliper Life Sciences) with HT DNA 5000 SE30 LabChip was used to separate and quantify the PCR products. During electrophoresis, each IS and/or native template peak was stained with intercalator dye, and the HT S-GEM Suite software (Gene Express) was used to automatically quantify base-pair size and number of PCR product molecules.
In the range-finding step, an aliquot of each calibrated sample was mixed with an equal volume of E SMIS (600 000 molecules of ACTB IS and 600 molecules of target gene IS/µL) and PCR reagents. PCR amplification, detection, and analysis were performed as described above. The number of IS molecules before PCR was known for each reaction, and the HT S-GEM Suite software used this information to calculate the initial number of native template molecules in the PCR for each target gene by mathematics of ratios. Following these calculations, the software determined for each gene whether the initial native template molecule was in balance (ratio, >0.1 and <10) with E SMIS or determined which other SMIS was most appropriate in the subsequent high-throughput step. The software calculated appropriate SMIS for high-throughput analysis >95% of the time.
In the high-throughput step, PCRs were prepared with the SMIS that was calculated by HT S-GEM Suite software to be the most appropriate, and triplicate TA measurements were obtained for each gene. Results were reported as target gene cDNA molecules/10⁶ ACTB cDNA molecules.
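The ratio arithmetic behind the range-finding and high-throughput steps can be sketched as follows. This is a minimal illustration and not the HT S-GEM Suite implementation, whose internals are not described here: the function names are hypothetical, and it assumes that the native:IS product signal ratio (on a molar basis) equals the ratio of starting molecules.

```python
def native_molecules(native_signal, is_signal, is_molecules):
    """Estimate starting native template molecules from the native:IS ratio.

    Assumption: the molar signal ratio of the two PCR products equals the
    ratio of their starting molecule numbers.
    """
    if is_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    return native_signal / is_signal * is_molecules


def in_balance(native_signal, is_signal):
    """True when the native:IS ratio lies in the (0.1, 10) window from the text."""
    ratio = native_signal / is_signal
    return 0.1 < ratio < 10


def ta_per_million_actb(target_molecules, actb_molecules):
    """Express a target gene as molecules per 10^6 ACTB molecules."""
    return target_molecules / actb_molecules * 1e6


# Hypothetical peak signals, chosen only to illustrate the arithmetic.
target = native_molecules(native_signal=30, is_signal=60, is_molecules=600)
actb = native_molecules(native_signal=50, is_signal=50, is_molecules=600_000)
print(round(ta_per_million_actb(target, actb), 3))
print(in_balance(30, 60))
```

When a ratio falls outside the window, the text indicates that a different SMIS is selected for the high-throughput step; that selection logic is not modeled here.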
statistical analysis
Statistical analyses were performed by Innovative Analytics using SAS V8. Clinical assessments were summarized with descriptive statistics, including frequencies for categorical measures and means, SDs, and median, minimum, and maximum for continuous measures. We used a random-effects model to categorize total variation into between-individual and within-individual variance components. The within-individual variation was further categorized into intrasample replicate, intravisit sample, and intervisit sample terms, all 2-way and 3-way interactions, and residual error. We performed these analyses using SAS Proc Varcomp with method = REML. Percentage of total variation for each component was estimated by calculating the variance of each component as a percentage of the total variance. Correlations among intrasample replicates, intravisit sample replicates, and intervisit sample replicates, respectively, were evaluated with the Pearson correlation coefficient. Bland–Altman analysis for intervisit sample variation was also performed to evaluate whether significant bias (visit 1 − visit 2) existed and whether it was related to the level of expression. Technical process variation included preanalytical and analytical components. The only definable variation component attributable to replicate measurements of a single sample was analytical. Likewise, the definable variation components attributable to interindividual comparisons were biological. In contrast, both technical and biological components contributed to intravisit and intervisit sample variation.
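The idea of partitioning total variance into components can be illustrated with a simplified sketch. It is not the model used in the study: it uses ANOVA (method-of-moments) estimators rather than REML, it covers only a balanced participant/sample/replicate nesting without the visit term or interactions, and the function names and toy data are hypothetical. For balanced data with nonnegative estimates, ANOVA and REML estimates typically agree.

```python
from statistics import mean


def variance_components(data):
    """Method-of-moments variance components for a balanced nested design.

    data[i][j][k] is replicate k of sample j from participant i.
    Negative estimates are truncated at zero.
    """
    a = len(data)          # participants
    b = len(data[0])       # samples per participant
    n = len(data[0][0])    # replicates per sample
    sample_means = [[mean(reps) for reps in person] for person in data]
    person_means = [mean(row) for row in sample_means]
    grand = mean(person_means)

    # Mean squares for each level of the nesting.
    ms_person = b * n * sum((m - grand) ** 2 for m in person_means) / (a - 1)
    ms_sample = n * sum(
        (sm - pm) ** 2
        for row, pm in zip(sample_means, person_means)
        for sm in row
    ) / (a * (b - 1))
    ms_rep = sum(
        (y - sm) ** 2
        for person, row in zip(data, sample_means)
        for reps, sm in zip(person, row)
        for y in reps
    ) / (a * b * (n - 1))

    return {
        "participant": max((ms_person - ms_sample) / (b * n), 0.0),
        "sample": max((ms_sample - ms_rep) / n, 0.0),
        "replicate": ms_rep,
    }


def percent_of_total(components):
    """Express each variance component as a percentage of the total variance."""
    total = sum(components.values())
    return {name: 100.0 * v / total for name, v in components.items()}


# Toy values (not study data): 3 participants x 2 samples x 2 replicates.
toy = [
    [[10.0, 10.2], [10.6, 10.4]],
    [[20.1, 19.9], [19.5, 19.7]],
    [[30.0, 30.2], [29.6, 29.8]],
]
print(percent_of_total(variance_components(toy)))
```

In the toy data the between-participant differences dominate, which mirrors the qualitative pattern of a large biological component and a small technical component.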
Distributions of TA values across individuals were characterized by use of descriptive statistics and histograms, primarily to assess normality.
| Results |
|---|
After RNA was isolated from the 12 tubes of blood collected at each visit, sets of 4 tubes were combined, creating 3 RNA samples, A, B, and C. Having 3 samples enabled analysis of variation in TA measured relative to variation in RNA quantity and quality in each sample or variation in effects of reverse transcription of each sample. Each volunteer provided a total of 6 RNA samples, including the 3 samples obtained at each of 2 visits, except for study participant 23, for whom 1 sample (sample C, visit 1) was lost. RNA yield per PAXgene tube was 4.5–14.4 µg [mean, 8.2 µg (per sample range, 17.9–57.8 µg; mean, 32.8 µg)].
The indicators of sample RNA quality were the A260:A280 ratios, with values of 1.81–2.03, and the 28S:18S ribosomal RNA ratios, which were 0.5–2.2. Most of the triplicate samples from any individual at any visit displayed similar 28S:18S ratios. In a few instances, however, this parameter was quite variable even among samples taken from the same individual during the same visit. For example, the 3 samples from study participant 10 taken at visit 2 had 28S:18S ratios of 1.5, 0.7, and 0.7. After an overnight incubation (37 °C) of an aliquot of each sample, the 28S:18S ratios remained nearly the same (1.4, 0.6, and 0.7, respectively), indicating that samples were stable and free of significant nuclease contamination.
We observed large biological variation in expression among the 19 genes measured in each individual, and large biological variation in expression of each gene among the 15 participants in the study. The TA measured for most genes in most samples was above the lower limit of detection of 10 molecules/10⁶ ACTB molecules and spanned a broad range from 10 to >10⁵. Because of increasing stochastic effects on sampling, a value of <10 molecules/10⁶ ACTB molecules was established a priori as nondetectable. Genes for which some or all of the 267 measurements were nondetectable due to values <10 molecules/10⁶ ACTB molecules included MMP16, TPBG, LGALS9, CDC20, RAG1, and IFNG. As reported below, variations in detection and in measured TA were primarily related to interindividual biological variation, not analytical variation.
Total variation in TA observation for each gene was partitioned into between-individual and within-individual variation. Within-individual sample variation was further categorized into intrasample replicate variation, intravisit sample variation, intervisit sample variation, and residual biological variation (Fig. 1). The between-individual variation and the within-individual residual variation accounted about equally for almost all of the TA variation for each of the genes. The technical process variation was <5% and was primarily due to the preanalytical components (intravisit or intervisit sample variation). The analytical component, intrasample replicate assay variation, was negligible.
When intrasample triplicate measurements were evaluated, 95% of CVs were ≤32%, with a median CV of 4.9% and mean of 8.9%, indicating very good reproducibility of the analytical measurement (Fig. 2). There was higher variation for genes with low expression, such as RAG1. This effect was probably due to increased stochastic effects on random sampling at low transcript numbers. Consistent with this conclusion, comparison of StaRT-PCR with TaqMan demonstrated that increased replicate variation in association with lower gene expression was platform independent (12).
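Why replicate CVs rise at low transcript numbers can be illustrated with a short sketch. It assumes, as a simplification not stated in the text, that the number of template molecules drawn into a reaction follows a Poisson distribution, for which the sampling CV is 1/√N; the function names and example values are hypothetical.

```python
import math
from statistics import mean, stdev


def replicate_cv(values):
    """CV of a set of replicate measurements (sample SD divided by mean)."""
    return stdev(values) / mean(values)


def poisson_sampling_cv(mean_molecules):
    """CV expected from Poisson sampling alone: sqrt(N) / N = 1 / sqrt(N)."""
    return 1.0 / math.sqrt(mean_molecules)


# Hypothetical triplicate for one gene in one sample.
print(round(replicate_cv([95, 100, 105]), 3))

# Sampling noise alone grows quickly as the number of molecules drops.
for molecules in (10, 100, 1000, 10000):
    print(molecules, round(100 * poisson_sampling_cv(molecules), 1), "%")
```

Under this assumption, a reaction containing about 10 template molecules carries roughly 32% CV from sampling alone, whatever the measurement platform.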
With respect to intravisit sample comparison, the CV was <30% for 75% of the sample sets (median, 18%; mean, 22%; Fig. 3). Among all genes the sample-to-sample variance component was 2%–3% of total variation. Thus sample-to-sample variation was greater than assay replicate variation (which was <1% of total variation).
Evaluation of the intervisit sample variation of TA measurement of each gene showed that 75% of the CVs were <33% (median, 19%; mean, 22%; Fig. 4). Among all genes the visit-to-visit variance component was 2%–3% of total variation. For CDC20, FANCG, IFNG, and TPBG the visit-to-visit variance component was somewhat higher, 5%–10%.
Descriptive statistics for the TA distribution for the mean of samples A, B, and C for each gene across participants and visits are provided in Table 1. Gaussian distribution was detected by the Shapiro–Wilk test for 12 of the 19 genes (63%). Genes with several nondetectable observations (CDC20, LGALS9, MMP16, and TPBG) were more skewed. Lack of gaussian distribution for these genes was due to a combination of biological interindividual variation and platform-independent stochastic effects. FCGR3A and PMS2L3 also appeared nongaussian, primarily because of biological interindividual variation in expression. TA distributions for most genes appeared fairly gaussian for these small numbers of individuals/observations.
The CV of the mean indicates the sensitivity of the assay for detecting differences in group mean TA measurements. Table 2 details the sensitivity of the group mean for detecting shifts with different individual group sizes. Although the SD of an observation appears high, the CV for the mean is very reasonable, 25%, with 20 observations. The CV would still be <30% with only 15 individuals, which indicates fairly good power to detect shifts in mean for different groups.
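The relation between the CV of a single observation and the CV of a group mean can be sketched as follows, assuming independent observations so that the SD of the mean is SD/√n. The per-observation CV used below is back-calculated from the 25% figure at n = 20 purely for illustration; it is not a value reported in the study, and the function name is hypothetical.

```python
import math


def cv_of_mean(cv_single, n):
    """CV of the mean of n independent observations: CV / sqrt(n)."""
    return cv_single / math.sqrt(n)


# Illustrative per-observation CV implied by a 25% CV of the mean at n = 20.
cv_single = 0.25 * math.sqrt(20)

for n in (10, 15, 20, 30):
    print(n, round(100 * cv_of_mean(cv_single, n), 1), "%")
```

Under this assumption the CV of the mean at n = 15 stays below 30%, in line with the statement above.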
A change of 1 SD in group mean response is often taken as a clinically relevant change. Traditional sample size calculations show that differentiating group means in a study would require 17 participants per group to detect a 1 SD change with 80% power with a 2-sided α = 0.05 statistical test for mean differences, assuming that the disease groups will exhibit low variation similar to what has been observed in healthy volunteers.
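The figure of 17 per group can be reproduced with a standard approximation, sketched below. The text does not state which formula was used; this sketch assumes a two-sample comparison of means, using the normal-approximation formula n = 2(z₁₋α/₂ + z_power)²/d² plus a z²/4 term that is commonly added to approximate the t-test. The function name and parameters are hypothetical.

```python
import math
from statistics import NormalDist


def n_per_group(effect_sd=1.0, alpha=0.05, power=0.80, t_correction=True):
    """Approximate per-group sample size for a 2-sided two-sample test of means.

    effect_sd is the difference in means expressed in SD units.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n = 2 * (z_alpha + z_power) ** 2 / effect_sd ** 2
    if t_correction:
        # Small-sample adjustment approximating the t distribution.
        n += z_alpha ** 2 / 4
    return math.ceil(n)


print(n_per_group())                    # with the small-sample adjustment
print(n_per_group(t_correction=False))  # plain normal approximation
```

With the adjustment the sketch returns 17 per group for a 1 SD difference at 80% power and α = 0.05; the plain normal approximation returns 16.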
| Discussion |
|---|
α = 0.05 statistical test. Low analytical variation often increases the power of statistical tests, which translates into relatively low frequency of clinical false negatives. In our study analytical variation among intrasample replicates was low, with mean CV of 8.9%. Variation was higher among replicates for genes with low expression (e.g., RAG1, LGALS9). Based on the MAQC study, the increased replicate variation at low expression levels was a consequence of natural phenomena, specifically platform-independent stochastic random sampling error (12). If the cause of increased analytical variation at low signal levels is known to be stochastic sampling error, increasing the number of replicate measurements will decrease mean CVs, illustrating an important reason for numerical quantification. MMP16 concentrations were below the limit of detection in all samples; because MMP16 is not expressed in peripheral blood cells, this measurement was included to test the QC of the methods used (19)(20).
Some samples exhibited low 28S:18S ribosomal RNA ratios, indicating RNA degradation. Variation in the 28S:18S ratio in 1 sample compared with the others in a 3-sample set was not associated with higher CVs. For instance, samples B and C (visit 2) from study participant 23 produced neither a 28S nor 18S band that was quantifiable, indicating a high degree of degradation. Nevertheless, the results for all 19 genes in the sample were comparable to those in the other 2 samples, and the CV was no greater than the average among all sets. Although there clearly is a limit to how degraded a sample may be before it affects TA measurement with StaRT-PCR reagents, it is well below the threshold associated with complete loss of both ribosomal bands. Thus, a QC step that remains to be established is a marker for the minimum quality that an RNA sample must have for reliable measurement.
Intrasample replicate, intravisit sample, and intervisit sample variation were identifiable preanalytical and analytical components of variation. Of these, the intravisit sample variation component was greater than either the intrasample replicate or intervisit sample component. The intravisit sample component of variation likely was associated with variation in RNA extraction and/or reverse transcription. Although measurement of 3 different cDNA samples derived from the same RNA sample contributed more variation than replicate measurement of the same cDNA sample, all of these technical process variation components accounted for <5% of the total combined variation.
Validation of any clinical assessment requires a 2-step process. The 1st step is to establish the frame of reference by characterization in the normal population, including distribution of the magnitude of the measured values and the expected variability in acquisition. The quality-controlled, standardized data reported here will effectively contribute to defining reference intervals of TA measurement values in peripheral blood. The clinical utility of the data collected is enhanced by the ability to compare and combine data across institutions and experiments. Additional TA data collected in future experiments with the same cDNA samples or samples from additional healthy, diseased, and/or treated individuals will be directly comparable to the data presented here and will serve to build the knowledge base regarding measurements of this response and an understanding of how to interpret them. The 2nd step is to assess the utility of the measure for either detecting change within an individual or for detecting group differences such as healthy vs diseased persons. Such detection is possible only if the group or change difference is greater than the "noise" in the assessment. Clinical assessments with low technical variability and manageable biological variability have a greater potential to enable accurate detection of treatment effect or disease diagnosis. In conclusion, by using methods that minimize analytical variation, we obtained results indicating that clinical studies with high power to detect effects may be achieved with a relatively small number of peripheral blood samples in each group.
| Acknowledgments |
|---|
Financial disclosures: Dr. Willey has significant equity interest in Gene Express, Inc. and serves as a consultant to Gene Express, Inc.
| Footnotes |
|---|
2 Human genes: ARHGDIB, Rho GDP dissociation inhibitor (GDI) β; BCL2, B-cell CLL/lymphoma 2 (transcript variant α); CDC20, CDC20 cell division cycle 20 homolog (S. cerevisiae); CNN2, calponin 2; FANCG, Fanconi anemia, complementation group G; FCGR3A, Fc fragment of IgG, low-affinity IIIa, receptor (CD16a); GMFG, glia maturation factor, γ; IFNG, interferon, γ; IL1B, interleukin 1, β; IL8, interleukin 8; LGALS9, lectin, galactoside-binding, soluble, 9 (galectin 9, transcript variant long); MMP16, matrix metalloproteinase 16 (membrane-inserted, transcript variants 1 and 2); PMS2L3, postmeiotic segregation increased 2-like 3; PTGS2, prostaglandin-endoperoxide synthase 2 (prostaglandin G/H synthase and cyclooxygenase); RAG1, recombination activating gene 1; SLC31A2, solute carrier family 31 (copper transporters), member 2; TIMP2, TIMP metallopeptidase inhibitor 2; TNF, tumor necrosis factor (TNF superfamily, member 2); TPBG, trophoblast glycoprotein.