Overview
1 NIST-National Cancer Institute Cancer Biomarker Reference Laboratory, 3 NIST Mass Spectrometry Data Center, and 4 Analytical Chemistry Division, NIST, Gaithersburg, MD.
2 Cancer Biomarkers Research Group, Division of Cancer Prevention, National Cancer Institute, Bethesda, MD.
5 Center for Biomedical Proteomics and Center for Computational Medicine & Biology, University of Michigan, Ann Arbor, MI.
a Address correspondence to this author at: NIST-NCI Cancer Biomarker Reference Laboratory, National Institute of Standards and Technology, Gaithersburg, MD 20899-8311. E-mail peter.barker@nist.gov.
Abstract
NIST and the National Cancer Institute cosponsored a workshop on August 18–19, 2005, to examine needs for reference materials for early cancer detection. This meeting focused on standards, methods, assays, reagents, and technologies. Needs for plasma and serum proteomics, DNA methylation, and specimen reference collections were discussed, and recommendations from participants were solicited. This report summarizes the discussion and recommendations for proteomics reference materials.
Collection and analysis of comprehensive proteome data will be considerably more challenging than the Human Genome Project. Whereas the human genome encodes approximately 22 000 genes(1), the human proteome is far more complex(2)(3). Multiple protein isoforms arising from the same gene by differential splicing, multitudinous posttranslational modifications(4)(5), wider chemical diversity among proteins than among nucleic acids, and ex vivo degradation in specimens all present substantial technical and data analysis challenges unique to proteomics.
Without universal preanalytical processing and data standards for proteomics, comprehensive analysis is problematic(6). For early cancer detection, reports of novel protein biomarkers in plasma and serum have been difficult to reproduce(7). Consequently, proponents of plasma and serum proteomic biomarker discovery have argued for large-scale, international collaboration to optimize efficiency of global efforts(8)(9)(10)(11). To be effective, this endeavor requires standardization, as well as common reference materials.
Genetic heterogeneity of human populations and variability of human plasma and serum proteins represent additional measurement challenges. Superimposed on interindividual biological variation of primary protein sequences are expressional differences that may vary with sex, age, and physiological status (diet, smoking, exercise, hypoxia, medications, and menstrual cycle). Although reference intervals have been defined for many clinically important protein analytes, few new protein biomarkers have been validated(12). A comprehensive catalogue of all human blood proteins, or even normal and abnormal variability, remains a significant unmet challenge for the future.
Current and Future Physical Standards for Proteomics
The metrology community distinguishes primary from secondary standard reference materials(13). NIST (http://www.nist.gov), the United Kingdom's National Institute for Biological Standards and Control (http://www.nibsc.ac.uk/), and the Belgian Institute for Reference Materials and Measurements (http://www.irmm.jrc.be) produce primary reference materials. NIST-certified standard reference materials (SRMs; see footnote 1) are certified by 2 or more independent measurement methods to reduce measurement bias. Secondary reference materials are typically produced by commercial sources for widespread distribution and are metrologically linked to primary reference materials through a well-characterized value-transfer protocol. SRMs sold by NIST are universally available without restriction. Reference materials can be classified by composition, intended use, and developer/supplier (Table 1).
The international Human Proteome Organization (HUPO) Plasma Proteome Project(14)(15) is a major initiative using proteomic reference specimens that has generated extensive interlaboratory data for protein identifications in serum, EDTA-, heparin-, and citrate-anticoagulated human plasma(11).
The Association of Biomolecular Resource Facilities (ABRF) (http://www.abrf.org) has developed protein collections for the mass spectrometry community(16). The ABRF, an international society, is dedicated to advancing research biotechnology laboratories. Its Proteomics Standards Research Group supports proteomic standards (materials, data, and procedures) and is currently sponsoring a collaborative project whose goals include developing validated protein standards containing defined proteins, supplying standard mixtures as test sets for member laboratories, and making ABRF standards available to the proteomics community. In collaboration with NIST, ABRF is developing a 3-peptide SRM in which C-terminal arginines mimic tryptic peptides commonly amenable to liquid chromatography-mass spectrometry (LC-MS) analysis. This mixture is designed as a standard for matrix-assisted laser desorption/ionization mass spectrometry, electrospray mass spectrometry, amino acid analysis, HPLC, and capillary electrophoresis. NIST is currently finalizing the certificate of analysis for RM 8327 (Table 1).
In addition, individual laboratories have assembled databases for proteomic reference materials (Table 2). Data standards are under development at several sites and organizations. For tandem mass spectrometry (MS/MS) data sets, 2 complementary approaches have resulted from international collaboration, mzDATA(17) and mzXML(18). These are undergoing consolidation under the auspices of the HUPO Proteomics Standards Initiative led by R. Apweiler of the European Bioinformatics Institute (Hinxton, United Kingdom). Such data standards, although critical to technology platform comparisons, have been addressed elsewhere as guidelines for publication(19) and in a tutorial for protein inference from shotgun proteomics(20).
Recommendations for physical standards for plasma and serum proteomics were the focus of the NIST-National Cancer Institute workshop, and the efforts outlined above address specific applications. As the proteomics field approaches clinical applications on a global basis, however, proteomics reference standards with certified purity and metrological rigor are an increasing need that might be addressed by universal consensus plasma and serum proteomics reference materials to benchmark technologies developed for early cancer detection.
Which Reference Materials Are Needed?
Peptides
Standard mixtures of pure, well-characterized synthetic peptides would be useful for benchmarking MS techniques, quality control for chromatography and capillary electrophoresis, generating MS/MS spectral databases, and enriching analytical samples as internal controls. Standard composition depends on intended use. Peptide reference materials for evaluation and standardization of chromatographic separations should have favorable chromatographic retention characteristics, whereas those for MS should generate mass spectra of established intensity and complexity.
The reference peptide mixtures should have a range of mass and elution times on LC-MS and be free of oxidizable residues that compromise chemical stability. Peptides with posttranslational modifications would also be useful for certain applications. For example, standards for glycopeptide and phosphopeptide identification should contain appropriately defined components for analysis.
A particularly promising new approach uses proteotypic peptides to score proteins for identification and for relative or absolute quantification(21)(22). These peptides, identified from databases or from large empirical studies, including pooled sample studies (http://www.peptideatlas.com), have sequences uniquely matching a single protein and physicochemical properties favoring detection in MS/MS experiments(21). Synthesis and isotopic labeling of such peptides enable large-scale enrichment of samples to generate paired peptide ions and peptide ion fragments, from which the relative concentration can be calculated(23).
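The ratio calculation described above can be sketched as follows. This is a minimal illustration only, assuming that a known amount of the isotopically labeled ("heavy") peptide has been added to the sample and that the labeled and unlabeled ("light") forms are detected with equal efficiency. The function names and the numerical values are hypothetical and are not taken from the cited studies.

```python
def light_to_heavy_ratio(light_intensity: float, heavy_intensity: float) -> float:
    """Ratio of the endogenous (light) signal to the labeled (heavy) signal."""
    if heavy_intensity <= 0:
        raise ValueError("heavy_intensity must be positive")
    return light_intensity / heavy_intensity


def estimate_endogenous_amount(
    light_intensity: float, heavy_intensity: float, heavy_added_fmol: float
) -> float:
    """Scale the known heavy amount by the light/heavy ratio.

    Assumption (hypothetical): both forms respond identically in the
    mass spectrometer, so the intensity ratio equals the amount ratio.
    """
    return light_to_heavy_ratio(light_intensity, heavy_intensity) * heavy_added_fmol


if __name__ == "__main__":
    # Hypothetical peak intensities for one paired peptide ion.
    amount = estimate_endogenous_amount(4.0e5, 2.0e5, heavy_added_fmol=50.0)
    print(f"Estimated endogenous peptide: {amount:.1f} fmol")
```

In practice, ratios from several fragment ions of the same peptide would likely be combined, which is not shown here.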
Cost to end users is also important. Purity and cost must be balanced. In some cases, this will entail high-purity primary standards from NIST and secondary commercially available standards that might be more cost-effective but are still at a purity sufficient for general use.
Single Proteins
Single proteins, or protein sets, are needed as standards for highly accurate quantitative protein measurements, for which they will facilitate determination of the accuracy of diverse techniques and standardization of assays performed in different laboratories. Proteins might be chosen with clinical relevance in mind. Coordinated high-affinity, highly specific, and well-characterized antibodies against these same proteins would enhance such reference materials, because many current clinical test platforms are based on affinity methods.
Several major protein cancer biomarker tests in blood have been cleared by the FDA for clinical use. These include free and complex prostate-specific antigen for prostate cancer, CA15-3/CA27.29 and HER2/neu for breast cancer, CA125 for ovarian cancer, carcinoembryonic antigen for colorectal cancer, CA19-9 for pancreatic cancer, α-fetoprotein for testicular cancer, and α-fetoprotein-L3 for liver cancer. Many of the CA series are glycosylated variants of the MUC1 gene product and thus problematic for analytical chemical quantification. The WHO and the National Institute for Biological Standards and Control have developed standards for several of these cancer markers.
Mixtures of Defined Proteins
Standards consisting of mixtures of purified and well-characterized proteins allow laboratories to determine the analytical detection limit and specificity of their techniques, compare data across laboratories, and determine the analytical performance of new methods. For example, a standard containing a mixture of 35 proteins may be added to study samples to control for tryptic digestion and to detect protein or peptide abundance. These protein standards are also needed to monitor performance and technical variability of instrumentation and as MS mass calibrants for quality control and assurance protocols.
Several workshop participants recommended that varied ratios and concentrations of the protein standard mixtures be used for quantitative MS. As with peptide standards, protein mixtures containing representative phosphoproteins and glycoproteins are needed because of the importance of such posttranslational modifications in cancer-related biochemical pathways that might include biomarkers in future work.
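As an illustration of how a standard mixture measured at several concentrations could be used to estimate an analytical detection limit, the sketch below fits a straight-line calibration by least squares and applies the common convention of 3.3 times the standard deviation of blank signals divided by the calibration slope. That convention, the function names, and all numbers are assumptions made for this example; the workshop did not prescribe a particular calculation.

```python
from statistics import mean, stdev


def fit_line(concentrations, signals):
    """Ordinary least-squares fit of signal = slope * concentration + intercept."""
    mean_x, mean_y = mean(concentrations), mean(signals)
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concentrations, signals))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def detection_limit(blank_signals, slope, k=3.3):
    """Detection limit as k * SD(blank) / slope (k = 3.3 is an assumed convention)."""
    return k * stdev(blank_signals) / slope


if __name__ == "__main__":
    # Hypothetical dilution series of one protein in a standard mixture.
    conc = [1.0, 2.0, 4.0, 8.0]
    signal = [10.0, 20.0, 40.0, 80.0]
    blanks = [1.0, 2.0, 3.0]

    slope, intercept = fit_line(conc, signal)
    lod = detection_limit(blanks, slope)
    print(f"slope={slope:.2f}, intercept={intercept:.2f}, detection limit={lod:.2f}")
```

The same fit, repeated in different laboratories with a shared standard mixture, would give slopes and detection limits that can be compared directly.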
Although commercial sources supply both single proteins and protein mixtures, quality is variable, and frequently, purity and concentration values are not rigorously established. Some investigators have used protein purchased from commercial sources to create standard mixtures. Some workshop participants felt that the purity was not critical in, for example, generation of MS/MS spectra data sets to evaluate computational peptide/protein identification methods, whereas other participants felt that purity was critical for generating MS/MS data sets. Standards for differentiating between samples by LC-MS data will require highly purified proteins with highly reproducible retention times and accurate mass determination.
Anticipated uses of these categories of standards (peptides, proteins, and mixtures) include benchmarking technologically diverse proteomic analysis platforms, quantitative calibration, method development of plasma/serum processing and recovery experiments, and mass calibration for spectrometry measurements of peptides and tryptic peptide digests. However, the standard reference materials listed above are not intended as controls for detecting differences between clinically normal and abnormal specimens.
Complex Mixtures
Matrix-based reference materials such as serum and plasma address the need for assessment of temporal stability in instrumentation, optimization, and instrument performance and cross-platform comparisons to assess bias. A standardized, pooled serum or plasma preparation would address the majority of community needs for such a complex mixture. Although the volume of serum/plasma from one individual would be limited, plasma pools can supply aliquots linked to proteomic characterization measurements made on the pooled plasma. Homogeneous pooled samples also minimize individual differences in protein concentration. Because of potential interferences, however, pooled serum and plasma are not appropriate for research on autoantibodies. The concentrations of several known proteins in the pooled sera and plasma should be accurately determined and made available along with sample aliquots and documentation of technical details. Prior consensus documents on preparation of pools of human sera and plasma for various uses are available from the CLSI (http://www.nccls.org). Although it is important to define recommendations on standard methods of serum and plasma collection (standard operating procedures) and biological attributes of pool population members, the requirements for each application are so specific that consensus on this level of detail was beyond the scope of the general proteomic standards discussion at this workshop.
For proteomic research related to cancers, it may be advantageous to enrich serum or plasma pools with defined concentrations of 15 known cancer biomarkers, such as prostate-specific antigen, CA125, CA19-9, carcinoembryonic antigen, and α-fetoprotein. Consideration of the biological features of the added proteins is important so that unusually unstable proteins or those that might be especially susceptible to proteolysis can be avoided. A balance of catalytic and structural added proteins should be used, and issues such as stable complex formation might also be considered, depending on the assays envisioned for their quantification. Such types of standards are relevant for MS, as well as antibody assays. A series of serum or plasma standards that contain enriched biomarkers at clinically high, low, and intermediate concentrations would be useful, but development of such a series may not be practical. A pool (plasma or serum) that investigators could enrich with various concentrations of the proteins might give the best value for the production effort.
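When an investigator enriches a pool with a known amount of a biomarker, one simple check is the percent recovery of the added protein. The sketch below shows that arithmetic under the assumption that the same assay measures the pool before and after enrichment; the function name and the example concentrations are hypothetical.

```python
def percent_recovery(measured_unspiked: float, measured_spiked: float, added: float) -> float:
    """Percent of the added analyte that the assay recovers.

    All three values must be in the same concentration units.
    """
    if added <= 0:
        raise ValueError("added amount must be positive")
    return 100.0 * (measured_spiked - measured_unspiked) / added


if __name__ == "__main__":
    # Hypothetical low, intermediate, and high enrichment levels (ng/mL)
    # added to a pool whose baseline measurement is 2.0 ng/mL.
    baseline = 2.0
    for added, measured in [(1.0, 2.9), (10.0, 11.0), (100.0, 97.0)]:
        recovery = percent_recovery(baseline, measured, added)
        print(f"added {added:6.1f} ng/mL -> recovery {recovery:.0f}%")
```

Recoveries far from 100% at one level but not others may point to matrix effects or instability of the added protein, which is one reason the biological features of the added proteins matter.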
Standardized procedures for plasma and serum specimen collection, processing, aliquoting, and storage are also critically needed. NIST could establish favorable storage conditions and technology. Considering the importance of these processes, it is surprising that there has been no comprehensive survey of related experimental variables, despite the fact that a number of institutions and firms routinely collect large volumes of blood products.
The HUPO Plasma Proteome Project compared results obtained with serum and EDTA-, heparin-, and citrate-anticoagulated plasma specimens from the same donors. EDTA plasma was recommended as the preferred standard specimen because plasma avoids the additional ex vivo time required for clotting, which is considerably variable, and because EDTA plasma gave more consistent results with less proteolytic cleavage than heparin- or citrate-anticoagulated plasma(11)(15)(24). For particular analytes, other choices may be indicated. Also, because most archived specimens are sera rather than plasma, interest in serum reference materials will continue(25).
Recommendations
The need for plasma and serum proteomics standards is immediate. After establishment of background measurement methods, most NIST SRM projects take 12–18 months. By collective effort within the global proteomics community and with federal participation, high-grade proteomic standard materials should increase the likelihood of clinical benefits deriving from developments in proteomics.
Acknowledgments
Certain commercial equipment or materials are identified in this paper to specify adequately the experimental procedures. Such identification does not imply recommendation or endorsement by the NIST, nor does it imply that the materials or equipment identified are necessarily the best available for the purpose. This work was supported in part by NIST-National Cancer Institute Interagency Agreement Y1-CN-45016 (to P.E.B.). G.S.O. is supported by Grant MTTC 687 to the Michigan Proteomics Alliance for Cancer Research, NIH Grant U54DA021519 to the National Center for Integrative Biomedical Informatics, and National Cancer Institute/Scientific Applications International Corp. Contract 23XS110A for the Eastern Consortium for Mouse Models of Human Tumors.
Footnotes
1 Nonstandard abbreviations: SRM, standard reference material; RM, reference material; ABRF, Association of Biomolecular Resource Facilities; HUPO, Human Proteome Organization; LC-MS, liquid chromatography-mass spectrometry; MS, mass spectrometry; MS/MS, tandem mass spectrometry.