Molecular Diagnostics and Genetics
1 National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC.
2 Curriculum in Toxicology, University of North Carolina at Chapel Hill, Chapel Hill, NC.
3 Department of Environmental and Molecular Toxicology, North Carolina State University, Raleigh, NC.
aAddress correspondence to this author at: National Center for Computational Toxicology (D343-03), Office of Research and Development, US Environmental Protection Agency, Research Triangle Park, NC 27711. Fax 919-541-1194; e-mail dix.david{at}epa.gov.
| Abstract |
|---|
Methods: Whole blood was obtained from adult male and female volunteers (n = 42) and stored at various temperatures for various lengths of time. RNA was isolated and RNA quality analyzed. Affymetrix GeneChips (n = 23) were used to characterize gene expression profiles (GEPs) and to determine the effects on GEP of storage conditions, extraction techniques, types of GeneChip, or donor sex. Hierarchical clustering and principal component analysis were used to assess interindividual differences. Regression analysis was used to assess the relative impact of the studied variables.
Results: Storage of blood samples for >1 week at 4 °C diminished subsequent RNA quality. Interindividual GEP differences were seen, but larger effects were observed related to RNA extraction technique, GeneChip, and donor sex. The relative importance of the variables was as follows: storage < genechip < extraction technique < donor sex.
Conclusion: Sample storage and extraction methods and interindividual differences, particularly donor sex, affect GEP of human whole blood.
| Introduction |
|---|
Investigation of the effects of biological and technological variables is essential for confident use of whole blood GEP in research and clinical studies and has been the aim of several studies (Table 1). Whitney et al. (5) used cDNA arrays to analyze blood from healthy volunteers and found that interindividual sample variation was associated with donor sex and age, the time of day the sample was taken, and the proportions of blood cell subsets. Interindividual variation has been demonstrated, but significant variation in repeated sampling of the same individuals was not reported (6)(7). Other studies (8)(9)(10)(11)(12)(13)(14)(15)(16) examined how GEP of blood samples is affected by storage conditions and by different RNA extraction and amplification techniques, and reported that both factors can affect GEP.
In the current study, sources of technical and biological variation were characterized with respect to microarray-based GEP of human whole blood. Specifically, the effects of storage of whole blood on overall RNA quality and GEPs were examined by comparison with freshly prepared RNA. In addition, 2 RNA isolation techniques and 2 Affymetrix GeneChips were compared, as was biological variation based on sex.
| Materials and Methods |
|---|
sample collection, transportation, and storage
Blood was collected in PAXgene Vacutainer tubes (PreAnalytix/Qiagen; 2.5 mL blood per tube) and in 1.5-mL Eppendorf tubes containing 0.6 mL ZR buffer (Zymo Research; 0.2 mL blood per tube), and was processed according to the manufacturers' instructions. Two tubes of each type were collected from each study participant. The PAXgene tubes were stored at ambient temperature and were transported within 4 h of collection (also at ambient temperature) from the clinical office to the analytical laboratory, a journey time of 30 min. ZR tubes were stored on ice and, within 4 h of collection, transported (also on ice) from the clinical office to the analytical laboratory. On reaching the laboratory, the tubes were either processed immediately to extract RNA or stored at −20 °C, 4 °C, or room temperature.
rna isolation and quality analysis
RNA was isolated with the PAXgene Blood RNA Isolation Kit or the ZR Whole-Blood Total RNA Kit according to the manufacturers' instructions. The RNA from the PAXgene tubes was extracted on the day of collection, after 1–2 days of storage at room temperature, after 1–40 days of storage at 4 °C, or after 98–194 days of storage at −20 °C. The RNA from the ZR tubes was extracted either immediately or after 1–4 days of storage at 4 °C. All extracted RNA samples were quantified by absorbance readings at 260 and 280 nm performed with a GeneQuant spectrophotometer (Pharmacia Biotech) and then stored at −80 °C.
PCR analysis of each RNA sample was conducted to ensure absence of contaminating genomic DNA. The PCR contained 12.5 µL 2× PCR Master Mix (Promega), 1 µL RNA template (30 ng), and 1 µL human β-actin primers (10 µmol/L; Promega) in a final 25-µL reaction volume. Cycling was as follows: 2 min at 94 °C as an initial denaturation step; 40 cycles of 94 °C for 30 s, 65 °C for 1 min, and 68 °C for 2 min; and then a final extension step at 68 °C for 7 min. PCR products were run on a 2% agarose gel to identify those samples containing genomic DNA. All samples were treated with DNase (DNA-free reagent set; Ambion) for 30 min at 37 °C followed by a clean-up step, according to the manufacturer's instructions. After DNase treatment, RNA was recovered by ethanol precipitation: 1/10th volume of sodium acetate (pH 5.2) and 2 volumes of 100% ethanol were added to the sample, and the mixture was incubated for 1 h at −20 °C. RNA was then recovered by centrifugation (20 817g for 30 min at 4 °C). The RNA pellet was washed with 75% ethanol and resuspended in 20 µL RNase-free water.
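The reagent proportions described above can be sketched as simple arithmetic. This is a minimal illustration, not the authors' protocol sheet: the function names are hypothetical, the precipitation volumes are assumed to be relative to the starting sample volume, and the balance of the 25-µL PCR is assumed to be water, which the text does not state.

```python
def ethanol_precipitation_volumes(sample_ul):
    """Reagent volumes (µL) for ethanol precipitation of one RNA sample.

    Assumption: both reagents are measured relative to the starting
    sample volume (1/10th volume sodium acetate, 2 volumes ethanol).
    """
    sodium_acetate_ul = sample_ul / 10   # 1/10th volume, pH 5.2
    ethanol_ul = sample_ul * 2           # 2 volumes of 100% ethanol
    return {
        "sodium_acetate_ul": sodium_acetate_ul,
        "ethanol_ul": ethanol_ul,
        "total_ul": sample_ul + sodium_acetate_ul + ethanol_ul,
    }


def pcr_balance_volume(final_ul=25.0, master_mix_ul=12.5,
                       template_ul=1.0, primers_ul=1.0):
    """Volume (µL) left to reach the final reaction volume.

    Assumption: the remainder is made up with water.
    """
    return final_ul - (master_mix_ul + template_ul + primers_ul)


# Hypothetical 50-µL sample after DNase treatment.
print(ethanol_precipitation_volumes(50.0))
print(pcr_balance_volume())
```

With the reaction components listed in the text, the remaining volume per 25-µL reaction works out to 10.5 µL.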
We analyzed 1 µL of each RNA sample with an RNA 6000 Nano LabChip reagent set (Agilent Technologies) on a 2100 Bioanalyzer (Agilent Technologies) according to the manufacturer's instructions. RNA integrity was determined by 28S:18S ribosomal RNA ratio and RNA integrity number (RIN; Agilent 2100 RIN Beta Version Software). Mean (SE) values for RIN and 28S:18S rRNA ratio were calculated, and a mixed-effects linear model was used to look for differences among all storage temperature by day groups. The model was a one-way ANOVA to which was added a random predictor for each study participant. This adjusted the covariance matrix for having 2 observations from each study participant. Pairwise t-tests were calculated between each temperature by day group and the fresh sample group. An additional analysis was run with data for all time points pooled within each storage temperature group.
probe preparation and hybridization
Probe preparation and GeneChip hybridization were conducted by Expression Analysis. Each RNA sample selected for microarray hybridization (see Table 1 in the Data Supplement that accompanies the online version of this article at http://www.clinchem.org/content/vol53/issue6 for details of specific samples used in microarray hybridizations) was amplified with the MessageAmp aRNA reagent set (Ambion), according to the manufacturer's instructions. The amplified RNA was used to synthesize cDNA. In vitro transcription was performed to produce biotin-labeled cRNA according to the manufacturer's instructions. Biotin-labeled cRNA was purified with an RNeasy reagent set (Qiagen). Labeled cRNA was fragmented and hybridized overnight to Affymetrix Human Genome Focus (HG-Focus; 8400 genes) or Human Genome U133 Plus 2.0 (HG-U133 Plus 2.0; 39 000 genes) GeneChips (Affymetrix) in an Affymetrix Fluidics Station 400. After being washed, the arrays were stained with phycoerythrin-conjugated streptavidin, amplified by biotinylated antistreptavidin, and then scanned in a GCS3000 (Affymetrix). QC metrics for probe preparation and hybridization are included in Tables 2 and 3 in the online Data Supplement.
microarray data analysis
Affymetrix Microarray Suite 5.0 (MAS 5.0) was used for calculation of the signal and determination of the "present" call. These data will be deposited in the National Center for Biotechnology Information's Gene Expression Omnibus database (http://www.ncbi.nlm.nih.gov/projects/geo/). The data were transferred to GeneSpring 6.1 (Silicon Genetics) for further analysis. By use of normalization options in GeneSpring, the raw data were globally normalized to the median of each array, and each gene was normalized to its median. For comparison between HG-Focus and HG-U133 Plus 2.0 GeneChips, only the data for genes common to both chips were transferred to GeneSpring.
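The two median normalization steps can be illustrated with a short sketch. This is not GeneSpring's implementation; the function names and the toy signal values are hypothetical, signals are assumed to be positive, and the per-gene median is assumed to be taken across arrays.

```python
from statistics import median


def normalize_per_array(arrays):
    """Divide every signal on an array by that array's median signal."""
    normalized = []
    for signals in arrays:
        array_median = median(signals)
        normalized.append([value / array_median for value in signals])
    return normalized


def normalize_per_gene(arrays):
    """Divide each gene's values by that gene's median across all arrays."""
    n_genes = len(arrays[0])
    gene_medians = [median(a[g] for a in arrays) for g in range(n_genes)]
    return [[a[g] / gene_medians[g] for g in range(n_genes)] for a in arrays]


# Three toy arrays (rows) measuring the same three genes (columns).
raw = [
    [100.0, 200.0, 400.0],
    [50.0, 100.0, 200.0],
    [200.0, 400.0, 800.0],
]
per_array = normalize_per_array(raw)
per_gene = normalize_per_gene(per_array)
print(per_array)
print(per_gene)
```

In this toy case the arrays differ only by an overall scale factor, so per-array normalization makes them identical and the per-gene step then returns 1.0 everywhere.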
effect of extraction technique, storage, donor sex, and genechip on gep
Five sample RNAs from male donors were isolated with each of the 2 extraction techniques and used to determine the effect of extraction technique on GEPs. Samples from 5 female donors were used to determine the effect of long-term frozen storage. To determine the effect of donor sex, expression profiles of the RNA freshly isolated from PAXgene samples from the 5 male and 5 female individuals were compared. To determine whether there was any genechip effect, RNA from 3 blood samples that had been isolated within 24 h of collection was hybridized to both HG-focus and HG-U133 Plus 2.0 genechips. Group comparisons were performed in GeneSpring, using two-way ANOVA to test for effects of storage, extraction method, or genechip while adjusting for any donor effect. A one-way ANOVA was used to test for effects of donor sex. The ANOVA options selected in GeneSpring were parametric test, not assuming equal variances, and a false discovery rate of 0.05. Genes with Affymetrix present calls in at least 5 of 10 arrays (for storage, extraction, and donor sex analyses) or 3 of 6 arrays (for chip analyses), and with P <0.01 for the effect of interest were selected as differentially expressed genes.
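The gene selection rule above (a minimum number of present calls plus a P-value cutoff) can be sketched as a simple filter. The gene names, calls, and P-values below are invented for illustration, and the function is hypothetical; it assumes the MAS 5.0 detection calls are encoded as "P" (present), "M" (marginal), and "A" (absent).

```python
def select_genes(detection_calls, p_values, min_present, alpha=0.01):
    """Return genes with enough present calls and P below the cutoff.

    detection_calls: gene -> list of per-array calls ("P", "M", or "A")
    p_values:        gene -> P-value for the effect of interest
    """
    selected = []
    for gene, calls in detection_calls.items():
        n_present = sum(1 for call in calls if call == "P")
        if n_present >= min_present and p_values[gene] < alpha:
            selected.append(gene)
    return selected


# Hypothetical calls across 10 arrays.
calls = {
    "gene_a": ["P"] * 7 + ["A"] * 3,   # present in 7 of 10, P = 0.002
    "gene_b": ["P"] * 4 + ["A"] * 6,   # present in only 4 of 10
    "gene_c": ["P"] * 10,              # present everywhere, but P = 0.2
}
p_values = {"gene_a": 0.002, "gene_b": 0.001, "gene_c": 0.2}

print(select_genes(calls, p_values, min_present=5))  # ['gene_a']
```

For the chip comparison the same filter would be called with `min_present=3` on 6 arrays.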
We performed an additional ANOVA analysis using SAS software (SAS Institute). MAS5 values were log (base 2) transformed, and a global normalization was used, normalizing each of the 23 unique data files to its median. A total of 4299 genes were identified as being present in at least 5 of 10 female chips, 5 of 10 male chips, or 3 of 6 of the HG-focus and HG-U133Plus2.0 genechips. For each gene, an ANOVA was performed with SAS Proc GLM, for which normalized log intensity was predicted by sex, donor (nested within sex), genechip, extraction, and storage. The mean variance due to each factor was examined by comparing the mean square values. This analysis was repeated with quantile rather than median normalization.
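The difference between median and quantile normalization can be made concrete with a minimal sketch of the quantile approach. This is a generic textbook version, not the SAS code used in the study: the function names are hypothetical, and ties are broken by position rather than averaged, which is a simplification.

```python
from math import log2


def log2_transform(signals):
    """Log (base 2) transform a list of positive signal values."""
    return [log2(value) for value in signals]


def quantile_normalize(arrays):
    """Force every array to share one distribution of values.

    The k-th smallest value on each array is replaced by the mean of the
    k-th smallest values across all arrays. Ties are broken by position
    (a simplification of the usual tie handling).
    """
    n_genes = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [
        sum(s[rank] for s in sorted_arrays) / len(arrays)
        for rank in range(n_genes)
    ]
    normalized = []
    for signals in arrays:
        order = sorted(range(n_genes), key=lambda g: signals[g])
        out = [0.0] * n_genes
        for rank, gene_index in enumerate(order):
            out[gene_index] = reference[rank]
        normalized.append(out)
    return normalized


# Two toy arrays with very different value distributions.
arrays = [[2.0, 4.0, 6.0], [10.0, 30.0, 20.0]]
print(quantile_normalize(arrays))  # [[6.0, 12.0, 18.0], [6.0, 18.0, 12.0]]
```

After this step each array contains the same set of values, and only the ranking of genes within an array is preserved.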
The "Find the similar samples" function in GeneSpring calculates pairwise correlation coefficients between a target sample and each member of a list of other samples. This function was used to calculate correlation coefficients between all pairs of arrays. Means of these coefficients were calculated to find mean inter- and intragroup correlations. Principal component analyses were performed in GeneSpring on those genes that were present in at least 5 of 10 arrays (PAXgene vs ZR, fresh vs stored, male vs female) or at least 3 of 6 arrays (HG-Focus vs HG-U133 Plus 2.0).
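The inter- and intragroup means can be sketched as follows. The text does not say which correlation measure GeneSpring applies, so Pearson correlation is an assumption here, and the function names and toy expression profiles are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_intragroup(group):
    """Mean correlation over all pairs of arrays within one group."""
    pairs = list(combinations(group, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


def mean_intergroup(group_1, group_2):
    """Mean correlation over all pairs with one array from each group."""
    total = sum(pearson(a, b) for a in group_1 for b in group_2)
    return total / (len(group_1) * len(group_2))


# Toy profiles: each inner list is one array's expression values.
group_a = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 9.0]]
group_b = [[4.0, 3.0, 2.5, 1.0], [5.0, 3.0, 2.0, 1.0]]
print(mean_intragroup(group_a), mean_intergroup(group_a, group_b))
```

A group effect shows up when the intergroup mean is lower than both intragroup means, which is the comparison reported in the correlation analysis.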
blood transcriptome
The blood transcriptome was characterized with data obtained from the HG-Focus genechips after hybridization with RNA from the 10 samples (5 male, 5 female) that had been isolated by use of PAXgene within 24 h of blood collection. To identify genes expressed across all samples, the FLAGS filter of GeneSpring was used.
| Results |
|---|
The 28S:18S ribosomal band ratios for 80 RNA samples (4 volunteers could not provide a 2nd tube of blood) were ≥2.0 for 40 (50%) of samples, 1.8–2.0 for 14 (17.5%), and <1.8 for 26 (32.5%; see Table 1 in the online Data Supplement). The mean 28S:18S rRNA ratios were 2.54 for the freshly isolated samples, 2.23 for the samples stored at 4 °C (all storage times), and 1.49 for the samples stored at −20 °C (all storage times).
The mean (range) total RNA yields from each ZR tube (0.2 mL blood) were 1.0 (0.3–3.4) µg (see Table 4 in the online Data Supplement). The 28S:18S ribosomal band ratios were ≥2.0 for 1 (4.5%) of the 22 samples, 1.8–2.0 for 1 (4.5%), and <1.8 for 20 (91%). The mean 28S:18S rRNA ratios were 1.52 for samples (n = 15) extracted up to a day after blood collection, and 1.47 for samples (n = 7) extracted 2–4 days after blood collection [1.29 if sample 10M(1) was treated as an outlier]. Statistical analysis comparing all samples showed no significant differences in quality (as determined by ribosomal ratio) between the freshly isolated RNA (0–1 days) and RNA from stored samples (2–4 days). However, if sample 10M(1), which appeared to be an outlier, was removed from the 2–4 day group, the mean 28S:18S ratio of this group was significantly lower than the mean of the 0–1 day group (P <0.05).
We used the Agilent Bioanalyzer to assess RNA integrity (17). This system uses an RIN scale on which an RIN of 1 is the most degraded and 10 is the most intact. The RIN was calculated for all PAXgene samples. The statistical analysis of RNA quality by temperature and storage days is summarized in Table 2. Results indicated that RNA in PAXgene tubes was stable for up to 1 day of storage at room temperature and up to 4 days at 4 °C, and moderately stable for up to 194 days at −20 °C.
effect of rna extraction technique, storage, donor sex, and genechip on expression profile
Group comparisons of 8500 genes identified 264 genes differentially expressed between RNAs extracted by the 2 different methods. A total of 67 genes were identified as being differentially expressed in the group of frozen samples compared with the group of their respective freshly extracted samples. Group comparisons identified 748 genes differentially expressed between sexes; 600 of these showed higher expression in samples from female donors than in samples from male donors, whereas 148 showed lower expression in samples from females vs males. To identify which genes with differential expression related to sample-donor sex were likely to have a sex-specific role, we selected transcripts with a mean expression difference of 6-fold or greater between the male and female sample groups, a total of 12 genes, of which 9 had higher expression and 3 had lower expression in samples from female donors than in samples from male donors (Table 3).2
Biotin-labeled amplification products from 3 samples (36F, 37F, and 38F) were each split and hybridized to both the HG-Focus chip and the HG-U133 Plus 2.0 chip. Group comparisons identified 435 genes differentially expressed between the chips.
An additional ANOVA analysis, using median normalization, showed the mean square due to storage (mean, 0.256; median, 0.084) was the smallest, and donor sex (mean, 1.378; median, 0.391) the largest. Genechip (mean, 0.751; median, 0.190), extraction (mean, 0.684; median, 0.191), and sample donor (mean, 0.331; median, 0.194) had similar median mean square values, but could be separated according to their mean squares as sample donor < extraction < genechip. We also performed ANOVA analysis after quantile normalization, for which both the partition of variance and the number of genes that were significantly different between groups indicated storage < sample donor < genechip < extraction < donor sex.
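The mean square compared here can be illustrated for a single factor and a single gene. This is a deliberately simplified sketch: the study fitted a multi-factor model (with donor nested within sex) in SAS Proc GLM, whereas the hypothetical function below computes only the between-group mean square of a one-way layout on invented values.

```python
def mean_square_between(groups):
    """Between-group mean square for one gene in a one-way layout.

    groups: list of lists, one list of normalized log intensities per
    level of the factor (for example, male and female arrays).
    """
    all_values = [value for group in groups for value in group]
    grand_mean = sum(all_values) / len(all_values)
    sum_of_squares = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    degrees_of_freedom = len(groups) - 1
    return sum_of_squares / degrees_of_freedom


# Hypothetical log2 intensities for one gene in two groups of arrays.
female = [1.0, 2.0, 3.0]
male = [4.0, 5.0, 6.0]
print(mean_square_between([female, male]))  # 13.5
```

Averaging such values over all genes, factor by factor, gives the kind of mean and median mean squares that are ranked in the paragraph above.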
correlation analysis
Inter- and intragroup correlation coefficients (R values) were calculated to determine the overall congruency among the samples in each of the tested groups. Correlation analysis showed variation due to storage < genechip < extraction technique < donor sex. Intergroup variability was greater than intragroup variability except for the frozen group, demonstrating more effects than interindividual differences. Mean values for sex were 0.74 between groups, and 0.838 (male) and 0.887 (female) within groups. For extraction, the mean was 0.816 between groups, and 0.838 (PAX) and 0.870 (ZR) within groups. For storage, the mean was 0.867 intergroup, and 0.887 (fresh) and 0.835 (frozen) intragroup. And finally, for genechip, the mean was 0.844 intergroup, and 0.903 (HG-Focus) and 0.916 (HG-U133 Plus 2.0) intragroup.
cluster analysis
In unsupervised principal component analyses, GEPs from the paired fresh and frozen samples showed partitioning in part according to individual sample donor, although the results were mixed (Fig. 1A). GEP partitioning for the paired PAXgene and ZR samples occurred sharply according to method rather than individual sample donor (Fig. 1B); male and female donor profiles also partitioned sharply from one another (Fig. 1C), as did profiles generated on the different genechips (Fig. 1D).
blood transcriptome
Results for gene expression were 2558 genes expressed in all 10 samples, 2629 genes expressed in all 5 samples from male donors, 3343 expressed in all 5 samples from female donors, and 4013 unique genes (from a total of 8500 unique probe sets = 8400 unique Entrez IDs) expressed in at least 1 of the 10 individual samples hybridized to the HG-Focus arrays.
| Discussion |
|---|
Even without incorporating a globin decrease protocol (19), a present call level in the 36%–48% range was obtained from both PAXgene and ZR samples after genechip hybridization, a result that is close to previously reported values (12). RNA amplification was needed for the limited amount of blood RNA obtained from both extraction methods, especially from the ZR method, for which the small volume of blood used (0.2 mL) generally yielded <1 µg of total RNA. Thus, 2 rounds of RNA amplification were required. Amplified RNA probes may undergo truncation during reverse transcription and T7-mediated transcription, resulting in short labeled products (20). This effect can be detected in the GAPDH 3':5' signal ratio number (21), and GAPDH ratios of the female samples in this study were near the upper end of this range after 2 rounds of amplification. Ratios for the male samples were higher, however, indicating that the 2 rounds of amplification produced truncated transcripts. The percentage of present calls was lower for the male samples than for the female samples, a result at least partially attributable to truncation of transcripts in the 2 rounds of amplification. The reasons underlying this sex-based difference remain to be determined.
In terms of numbers of genes for which expression was significantly different between groups, frozen storage produced the smallest effect, and donor sex produced the greatest effect. When median normalization was used, the partition of variance indicated that variance due to storage < extraction < genechip < donor sex. When quantile normalization was used rather than normalization of each chip to the median, both the partition of variance and the number of genes that were significantly different between groups indicated storage < genechip < extraction < donor sex. This result agreed with the correlation analysis and can be explained by the different distributions of values for the 3 pairs of arrays in which the same samples were hybridized to different chips. With quantile normalization, all 6 chips are forced to have the same distribution, so the high values in the HG-Focus chips are brought in line with the high values in the HG-U133 Plus 2.0 chips, decreasing the number of significant changes between the 2 while preserving the correlation. Not surprisingly, several of the genes differently expressed between sexes were sex chromosome–linked: XIST, EIF1AY, RPS4Y1, and DDX3Y. The present study confirmed previous findings that biological variation in gene expression exceeds variation produced by technical factors.
The results of this study demonstrate that microarray-based transcript profiling in clinical studies will require standardized protocols for collection, transportation, and storage of biospecimens; RNA extraction and amplification; microarray probe labeling and hybridization; and statistical analysis of results. Without this standardization, results will not be reproducible or comparable. Standardization will afford better understanding of how technical and biological factors affect blood GEP, and comparisons of standardized studies will generate the power to determine significant sources of variation of gene expression in blood samples. Only when robust standardized procedures are used and gene expression in human blood under "normal" conditions is well characterized will it be possible to identify robust genomic biomarkers of environmental exposure or disease progression.
| Acknowledgments |
|---|
Financial disclosures: None declared.
Acknowledgements: We thank Maryann Bassett, Debbie Levine, and Tracey Mantilla (US Environmental Protection Agency) for assistance in sample collection. We thank Dr. Don Graff (US Environmental Protection Agency) for critically reviewing this work before submission. The information in this document has been funded wholly by the US Environmental Protection Agency. It has been subjected to review by the National Health and Environmental Effects Research Laboratory and approved for publication. Approval does not signify that the contents reflect the views of the Agency, nor does mention of trade names or commercial products constitute endorsement or recommendation for use.
| Footnotes |
|---|
4 Current affiliation: Reproductive Toxicology, Pharmaceutical Research Institute, Bristol-Myers Squibb, New Brunswick, NJ.
5 Current affiliation: Department of Pharmacology and Toxicology, Brody School of Medicine, East Carolina University, Greenville, NC.
6 Current affiliation: Rosetta Inpharmatics LLC (a wholly owned subsidiary of Merck & Co., Inc.), Seattle, WA.
1 Nonstandard abbreviations: GEP, gene expression profile; RIN, RNA integrity number.
2 Human genes: SGKL, serum/glucocorticoid regulated kinase family, member 3; XIST, X (inactive)-specific transcript; CPNE3, copine III; ECGF1, endothelial cell growth factor 1 (platelet-derived); STX16, syntaxin 16; DICER1, Dicer1, Dcr-1 homolog (Drosophila); CASP8AP2, CASP8-associated protein 2; SAS10, disrupter of silencing 10; ADD3, adducin 3 (gamma); EIF1AY, eukaryotic translation initiation factor 1A, Y-linked; RPS4Y1, ribosomal protein S4, Y-linked 1; DDX3Y, DEAD (Asp-Glu-Ala-Asp) box polypeptide 3, Y-linked.
| References |
|---|