Oak Ridge Conference
1 Division of Cancer Prevention, National Cancer Institute, Rockville, MD 20852.
2 Cancer Biomarkers Research Group, Division of Cancer Prevention, National Cancer Institute, 6130 Executive Blvd., Rm. EPN 330F, Rockville, MD 20852.
3 Pediatrics and Communicable Diseases, University of Michigan, Ann Arbor, MI 48109.
4 Department of Microbiology and Molecular Cell Biology and Virginia Prostate Center, Eastern Virginia Medical School, Norfolk, VA 23501.
a Author for correspondence. Fax 301-402-0816; e-mail ss1a{at}nih.gov.
Abstract
Early detection is critical in cancer control and prevention. Biomarkers help in this process by providing valuable information about the status of a cell at any given point in time. As a cell transforms from nondiseased to neoplastic, distinct changes occur that could potentially be detected through the identification of the appropriate biomarkers. Biomarker research has benefited from advances in technology such as proteomics. We discuss here ongoing research in this field, focusing on proteomic technologies. The advances in two-dimensional electrophoresis and mass spectrometry are discussed in light of their contribution to biomarker research. Chip-based techniques, such as surface-enhanced laser desorption and ionization, and emerging methods, such as tissue and antibody arrays, are also discussed. The bioinformatic tools that have been and are being developed in parallel to proteomics are also addressed. This report brings into focus the efforts of the Early Detection Research Network at the National Cancer Institute in harnessing scientific expertise from leading institutions to identify and validate biomarkers for early detection and risk assessment.
The biologic dictates of an organism are largely governed through the structure and function of the products encoded by the genes, the most functional of which is the proteome. Originally defined to represent the analysis of the entire protein component of a cell or tissue (1), proteomics now encompasses the study of expressed proteins, including identification and elucidation of the structure-function relationship under healthy conditions and disease conditions, such as in cancer. In combination with genomics, proteomics can provide a holistic understanding of the biology underlying disease processes. Information at the level of the proteome is critical for understanding the function of specific cell types and their role in health and disease. Mammalian systems are much more complex than can be deciphered by their genes alone. Expression analysis directly at the protein level is necessary to unravel the critical changes that occur as part of disease pathogenesis. This is because proteins are often expressed at concentrations and in forms that cannot be predicted from mRNA analysis. Proteomics also provides an avenue to understand the interaction between the functional pathways of a cell and its environmental milieu, independent of any changes at the RNA level.
proteomics in cancer research
Cancer proteomics encompasses the identification and quantitative analysis of differentially expressed proteins relative to healthy tissue counterparts at different stages of disease, from preneoplasia to neoplasia. Proteomic technologies can also be used to identify markers for cancer diagnosis, to monitor disease progression, and to identify therapeutic targets. Proteomics is valuable in the discovery of biomarkers because the proteome reflects both the intrinsic genetic program of the cell and the impact of its immediate environment. Protein expression and function are subject to modulation through transcription as well as through posttranscriptional and translational events. More than one RNA can result from one gene through a process of differential splicing. Additionally, there are more than 200 posttranslational modifications that proteins could undergo that affect function, protein-protein and nucleic acid-protein interaction, stability, targeting, half-life, and so on (2), all contributing to a potentially large number of protein products from one gene. At the protein level, distinct changes occur during the transformation of a healthy cell into a neoplastic cell, ranging from altered expression, differential protein modification, and changes in specific activity to aberrant localization, all of which may affect cellular function. Identifying and understanding these changes are the underlying themes in cancer proteomics. The deliverables include identification of biomarkers that have utility both for early detection and for determining therapy.
Although proteomics traditionally dealt with quantitative analysis of protein expression, more recently, proteomics has been viewed to encompass the structural analysis of proteins (3). Quantitative proteomics strives to investigate the changes in protein expression in different states, such as in healthy and diseased tissue or at different stages of the disease. This enables the identification of state- and stage-specific proteins. Structural proteomics attempts to uncover the structure of proteins and to unravel and map protein-protein interactions.
proteomic technologies
The rapid development and integration of analytical instrumentation combining reproducibility and sensitivity have accelerated the field of proteomics. Two-dimensional gel electrophoresis (2-DE) has been the workhorse in quantitative proteomics. Typically, protein samples are denatured and separated on the basis of their charge through isoelectric focusing. Immobilized pH gradients have greatly enhanced reproducibility in resolving almost the complete spectrum of basic to acidic proteins and have allowed both analytical and preparative amounts of proteins to be resolved (4)(5). Narrow-range pH gradients are helping increase protein detection. A range of 1 pH unit was demonstrated to resolve 1000 protein spots (6). The proteins are further separated by migration in a polyacrylamide gel on the basis of their molecular weights. By use of silver staining techniques, 3000 proteins can be visualized on a single gel. Fluorescent dyes are being developed to overcome some of the drawbacks of silver staining in making the protein samples more amenable to mass spectrometry (7)(8). Stained gels can then be scanned at different resolutions with laser densitometers, and data can be analyzed with software such as PDQUEST (9) for spot detection and quantification. Ratio analysis is used to detect quantitative changes in proteins between two samples. 2-DE is currently being adapted to high-throughput platforms (10).
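The ratio analysis mentioned above can be illustrated with a minimal Python sketch. The spot identifiers, intensity values, and the twofold cutoff are hypothetical choices made for illustration; they are not taken from PDQUEST or from the studies cited, and real software also handles normalization and spot matching, which are omitted here.

```python
import math

def ratio_analysis(sample_a, sample_b, fold_cutoff=2.0):
    """Return spots whose intensity ratio (b / a) changes by at least fold_cutoff.

    sample_a, sample_b: dicts mapping a spot ID to its normalized intensity.
    Only spots detected (intensity > 0) in both samples are compared.
    """
    changed = {}
    threshold = math.log2(fold_cutoff)
    for spot, a in sample_a.items():
        b = sample_b.get(spot)
        if b is None or a <= 0 or b <= 0:
            continue  # spot missing or undetected in one sample
        # Working on the log scale treats increases and decreases symmetrically.
        if abs(math.log2(b / a)) >= threshold:
            changed[spot] = b / a
    return changed

# Hypothetical normalized spot intensities from two gels.
healthy = {"spot_001": 1200.0, "spot_002": 450.0, "spot_003": 800.0}
tumor = {"spot_001": 1150.0, "spot_002": 1800.0, "spot_003": 200.0}

print(ratio_analysis(healthy, tumor))
```

With these made-up values, the second spot is flagged as increased and the third as decreased, whereas the first is left out because its ratio is close to 1.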
Mass spectrometry (MS) has provided a powerful means for obtaining peptide mass fingerprints for proteins resolved in 2-DE gels. Protein databases and reference maps have been built to catalog proteins resolved by 2-DE from various cell types both in healthy and diseased states. Specialized software then allows an investigator to compare 2-D gel patterns with one another and with reference maps on the Internet, allowing both quantitative and qualitative differences in protein profiles to be detected for biomarker identification. Using 2-DE, Ostergaard et al. (11) analyzed the proteome of 150 bladder tumors and observed a decline in the expression of specific cytokeratins, psoriasin, galectin 7, and stratifin in tumors with a low degree of differentiation. Furthermore, 2-DE analysis of urine revealed the presence of psoriasin in patients with squamous cell carcinoma, which could be used as a biomarker for noninvasive follow-up (12). Through 2-DE analysis of colorectal carcinomas and healthy colonic epithelial cells, Jungblut et al. (13) discovered a protein that was expressed exclusively in the tumor tissues. This low-molecular-weight protein, calgranulin B, was expressed in dysplastic polyps from patients with colon carcinoma and ulcerative colitis (13). Although this protein seems to be highly specific for preneoplastic and neoplastic tissues, its role in carcinogenesis is unclear. Sarto et al. (14) constructed a map of healthy and renal cell carcinoma (RCC) proteins through 2-DE analysis of healthy and RCC kidney tissue, which led to the identification of ubiquinol cytochrome C reductase as a potential biomarker. In an early study of tumor progression and metastases in a rat model of mammary adenocarcinoma, using 2-DE, Welch et al. (15) found a protein, P50.9, that correlated inversely with the metastatic phenotype, suggesting quantitative and qualitative protein differences between tumor types.
A comprehensive analysis of the proteome in lung cancer is being undertaken (16). The objectives of the study include the following: (a) identification of proteins that distinguish between different subtypes of lung cancer, including small cell and non-small cell lung cancer, that could be used for molecular diagnosis (Fig. 1); (b) delineation of proteins that have prognostic significance and could predict outcomes among patients that are otherwise indistinguishable on the basis of current criteria; (c) identification of secreted proteins that are detectable in serum or biological fluids for screening and early diagnosis; (d) identification of antigenic proteins that induce a humoral response and could be used for screening through their detection as circulating antigens or through the detection of their corresponding autoantibodies; and (e) creating a database of protein expression in lung cancer (Fig. 2).
In other studies, profiles from various grades of breast tumors have been obtained, which exhibited increases in proliferating cell nuclear antigen in invasive carcinomas (17)(18)(19). Distinct differences were observed in the protein expression profiles of fibroadenomas and invasive carcinomas. Two polypeptide markers, TAO1 and TAO2 (napsin), were found preferentially expressed in 90% of primary lung adenocarcinomas through the use of 2-DE (20)(21)(22). Applying 2-DE, Alaiya et al. (23) attempted to proteomically classify ovarian tumors into benign, borderline, and malignant types. Malignant tumors exhibited increases in proliferating cell nuclear antigen, OP18, pHSP60, HSP90, and calreticulin and decreases in tropomyosin-1 and -2 when compared with benign tumors. The authors were able to discriminate between malignant and benign tumors, using a panel of nine proteins. With the use of 2-DE maps to distinguish between prostate carcinoma and benign prostatic hyperplasia, a pattern of protein expression similar to that above was observed (24). In addition, malignant tumors displayed increased expression of oncoprotein 18(v), elongation factor-2, glutathione S-transferase p, superoxide dismutase, and triosephosphate isomerase and decreased expression of cytokeratin 18. 2-DE maps of neuroblastoma indicated that the protein p19/nm23-H1 was expressed at increased concentrations compared with limited-stage disease (25). The expression of this protein was observed to be positively associated with metastases and large tumor mass.
MS
Unique ionization techniques, such as electrospray ionization and matrix-assisted laser-desorption ionization (MALDI), have facilitated the characterization of proteins by MS (26)(27)(28). These techniques have enabled the transfer of proteins into the gas phase, making them amenable to analysis in the mass spectrometer. Typically, sequence-specific proteases are used to break up the proteins into peptides that are coprecipitated with a light-absorbing matrix such as dihydroxybenzoic acid. The peptides are then subjected to short pulses of ultraviolet radiation under reduced pressure. Some of the peptides are ionized and accelerated in an electric field and subsequently turned back through an energy correction device (29). Peptide mass is derived through a time-of-flight (TOF) measurement of the elapsed time from acceleration through field-free drift or through a quadrupole detector. A peptide mass map is generated with the sensitivity to detect molecules at a few parts per million. The resulting spectrum gives the molecular masses of individual peptides, which are used to search databases for matching proteins. A minimum of three peptide molecular weights is necessary to minimize false-positive matches. The principle behind peptide mass mapping is the matching of experimentally generated peptide masses with those calculated for each entry in a sequence database. The alternative process of ionization, electrospray ionization, involves dispersion of the sample through a capillary device at high voltage (29). The charged peptides pass through a mass spectrometer under reduced pressure and are separated according to their mass-to-charge ratios through electric fields. After separation through 2-DE, digested peptide samples can be delivered to the mass spectrometer through a "nanoelectrospray" or directly from a liquid chromatography column (liquid chromatography-MS), allowing for real-time sequencing and identification of proteins.
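The peptide mass mapping principle described above can be sketched in a few lines of Python. This is an illustrative toy, not the search engine used in any of the cited studies: the two "database" sequences are invented, trypsin is assumed as the protease (cleavage after K or R unless followed by P, with no missed cleavages), the 50 ppm tolerance is an arbitrary assumption, and only the three-peptide minimum comes from the text.

```python
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N and C termini

def tryptic_peptides(sequence):
    """In silico digest: cut after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER

def count_matches(observed_masses, sequence, tolerance_ppm=50.0):
    """Count observed masses that fall within tolerance of a theoretical peptide."""
    theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
    return sum(
        1 for obs in observed_masses
        if any(abs(obs - theo) / theo * 1e6 <= tolerance_ppm for theo in theoretical)
    )

def search(observed_masses, database, min_matches=3):
    """Report database entries supported by at least min_matches peptide masses."""
    hits = {name: count_matches(observed_masses, seq) for name, seq in database.items()}
    return {name: n for name, n in hits.items() if n >= min_matches}

# Hypothetical sequences standing in for a protein sequence database.
database = {
    "protein_A": "MKTAYIAKQRGLSDEEK",
    "protein_B": "GASPVTCLKNDQEHFRYW",
}
# Simulate a measured peptide mass map by digesting protein_A.
observed = [peptide_mass(p) for p in tryptic_peptides(database["protein_A"])]
print(search(observed, database))
```

Because the simulated spectrum is derived from the first entry, only that entry passes the three-peptide minimum. A real search would also have to handle missed cleavages, modifications, and measurement noise.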
Recent developments have led to the MALDI quadrupole TOF instrument, which combines peptide mapping with a peptide sequencing approach (30)(31)(32). An important feature of MS-MS analysis is the ability to accurately identify posttranslational modifications, such as phosphorylation and glycosylation, through the measurement of mass shifts. Using MALDI-TOF-MS of 2-DE separated proteins, Sarto et al. (33) identified multimeric isoforms of manganese superoxide dismutase expressed exclusively in RCCs. Interestingly, they also observed expression of isomeric forms of glutathione peroxidase only in healthy human kidney samples. Modified expression of these proteins renders them potential markers in RCC because they also map to chromosome loci 5q21 and 6q21-6q27, which have been implicated in the oncogenesis of RCC (33). MS has also been helpful in the analysis of proteins from cancer tissues. Screening for the multiple forms of the molecular chaperone 14-3-3 protein in healthy breast epithelial cells and breast carcinomas yielded a potential marker for the noncancerous cells (34). The 14-3-3σ form was observed to be strongly down-regulated in primary breast carcinomas and breast cancer cell lines relative to healthy breast epithelial cells. This finding, in light of the evidence that the gene for 14-3-3σ was found silenced in breast cancer cells (35), implicates this protein as a tumor suppressor. Using a MALDI-MS system, Bergman et al. (9) detected increases in the expression of nuclear matrix, redox, and cytoskeletal proteins in breast carcinoma relative to benign tumors. Fibroadenoma exhibited an increase in the oncogene product DJ-1. Retinoic acid-binding protein, carbohydrate-binding protein, and certain lipoproteins were increased in ovarian carcinoma, whereas cathepsin D was increased in lung adenocarcinoma.
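The idea of recognizing posttranslational modifications through mass shifts, mentioned above, can be sketched as a simple lookup. The shift values are standard monoisotopic mass differences; the short list of modifications, the 0.02-Da tolerance, and the example peptide masses are assumptions chosen only for illustration.

```python
# Monoisotopic mass shifts (Da) for a few common modifications.
KNOWN_SHIFTS = {
    "phosphorylation (HPO3)": 79.96633,
    "acetylation": 42.01057,
    "hexose (glycosylation)": 162.05282,
    "HexNAc (glycosylation)": 203.07937,
}

def explain_shift(unmodified_mass, observed_mass, tolerance=0.02):
    """Return the modifications whose mass shift matches the observed difference."""
    delta = observed_mass - unmodified_mass
    return [
        name for name, shift in KNOWN_SHIFTS.items()
        if abs(delta - shift) <= tolerance
    ]

# Hypothetical peptide: predicted mass from the sequence vs. the measured mass.
print(explain_shift(1000.50000, 1080.46633))
```

In this toy case the measured mass exceeds the predicted mass by the weight of one phosphate group, so phosphorylation is the only candidate returned.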
Imaging MS is a new technology for direct mapping and imaging of biomolecules present in tissue sections. For this system, frozen tissue sections or individual cells are mounted on a metal plate, coated with ultraviolet-absorbing matrix, and placed in the mass spectrometer, and the specimen is processed as described previously for liquid samples (36). With the use of an optical scanning raster over the tissue specimen and measurement of the peak intensities over thousands of spots, MS images are generated at specific mass values (37). Stoeckli et al. (36) used imaging MS to examine protein expression in sections of human glioblastoma and found increased expression of several proteins in the proliferating area compared with healthy tissue. Liquid chromatography-MS and tandem MS (MS-MS) were used to identify thymosin β4, a 4964-Da protein found only in the outer proliferating zone of the tumor (36). Imaging MS shows potential for several applications, including biomarker discovery, biomarker tissue localization, understanding of the molecular complexities of tumor cells, and intraoperative assessment of surgical margins of tumors.
Surface-enhanced laser desorption-ionization (SELDI), originally described by Hutchens and Yip (38), overcomes many of the problems associated with sample preparation that are inherent in MALDI-MS. The underlying principle in SELDI is surface-enhanced affinity capture through the use of specific probe surfaces or chips. This protein biochip is the counterpart of the array technology in the genomic field and also forms the platform for Ciphergen's ProteinChip® array SELDI MS system (32). A 2-DE separation is not necessary for SELDI analysis because the chip binds protein molecules on the basis of its defined surfaces. Chips with broad binding properties, including immobilized metal affinity capture, and with biochemically characterized surfaces, such as antibodies and receptors, form the core of SELDI (32). This MS technology enables both biomarker discovery and protein profiling directly from the sample source without preprocessing. Sample volumes can be scaled down to as low as 0.5 µL, an advantage in cases in which sample volume is limiting. Once captured on the SELDI protein biochip array, proteins are detected through the ionization-desorption, TOF-MS process. A retentate (proteins retained on the chip) map is generated in which the individual proteins are displayed as separate peaks on the basis of their mass and charge (m/z). Wright et al. (39) and Adam and coworkers (40)(41) demonstrated the utility of the ProteinChip SELDI-MS in identifying known markers of prostate cancer and in discovering potential markers either over- or underexpressed in prostate cancer cells and body fluids. SELDI analyses of cell lysates prepared from pure populations from microdissected surgical tissue specimens (Fig. 3) revealed differentially expressed proteins in the cancer cell lysate when compared with healthy cell lysates and with benign prostatic hyperplasia (BPH) and prostate intraepithelial neoplasia cell lysates (39)(42). SELDI is a method that provides protein profiles or patterns in a short period of time from a small starting sample (39)(43), suggesting that molecular fingerprints may provide insights into changing protein expression from healthy to benign to premalignant to malignant lesions. This appears to be the case because distinct SELDI protein profiles for each cell and cancer type evaluated, including prostate, lung, and ovarian cancer, have been described recently (42)(43)(44)(45). After prefractionation, a SELDI profile of 30 dysregulated proteins was observed in seminal plasma from prostate cancer patients (40). One of the seminal plasma proteins detected by comparing the prostate cancer profiles with a BPH profile was identified as seminal basic protein, a proteolytic product of semenogelin I (32).
The ProteinChip SELDI platform can also be used as a high-throughput assay for a panel of markers in establishing protein fingerprints. By assessing a panel of urinary proteins, Vlahou et al. (46) were able to enhance the detection rate for low-grade bladder cancer to 75% relative to the 30% detected by traditional urine cytology. SELDI profiling coupled to a learning algorithm that compares combinations of the presence or absence of five seminal plasma protein species led to a sensitivity of 82% and specificity of 83% in differentiating prostate cancer from age-matched healthy men (41). Protein profiling of serum, in which the SELDI raw data are further analyzed by learning algorithms, has been able to achieve sensitivities and specificities of 92–100% (Fig. 4A) in discriminating prostate and ovarian cancer from age-matched healthy donors (41)(47). SELDI analysis of fractionated serum samples from breast cancer patients identified 15 protein peaks present in 75% of the cancer samples and not in 75% of the nondiseased samples tested. Three of the proteins were present in all stages of breast cancer and in none of the nondiseased samples (48).
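Sensitivity and specificity, the figures of merit quoted above, are simple ratios over a classifier's calls. The following Python sketch shows the arithmetic; the labels are invented for illustration (1 = cancer, 0 = nondiseased) and do not reproduce any data set from the cited studies.

```python
def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels (1 = disease, 0 = healthy)."""
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == 1 and predicted == 1:
            tp += 1  # diseased sample correctly called positive
        elif truth == 1:
            fn += 1  # diseased sample missed
        elif predicted == 0:
            tn += 1  # healthy sample correctly called negative
        else:
            fp += 1  # healthy sample wrongly called positive
    sensitivity = tp / (tp + fn)  # fraction of diseased samples detected
    specificity = tn / (tn + fp)  # fraction of healthy samples cleared
    return sensitivity, specificity

# Hypothetical results for four cancer samples and five healthy donors.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 0, 1]
print(sensitivity_specificity(truth, calls))
```

Here three of four cancer samples are detected and four of five healthy samples are cleared, which corresponds to a sensitivity of 75% and a specificity of 80%.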
The versatility of the ProteinChip® SELDI-MS system permits it to be used not only for discovery and protein profiling applications, but also as an immunoassay platform. For this application, antibody rather than a chemical matrix is bound to the chip array to capture the protein antigen. This format has been successfully used to develop both single and multiplex versions of the SELDI immunoassay for detecting and measuring prostate-specific antigen (PSA) and prostate-specific membrane antigen in body fluids (39)(49)(50). Of particular significance was the observation that serum prostate-specific membrane antigen concentrations correctly discriminated all prostate cancer patients from BPH patients in the 4–10 ng/L PSA range, where the specificity of PSA is only 25–30% (49).
SELDI seems to overcome the problems of hydrophobicity, separation of highly acidic and highly alkaline proteins and membrane proteins, sensitivity for detecting low-molecular-weight and low-abundance proteins, and conversion to a clinical assay, all of which are inherent problems with 2-DE.
protein and antibody microarrays
Arrays of peptides and proteins provide another biochip strategy for parallel protein analysis. Protein assays using ordered arrays have been explored through the development of multipin synthesis (51). Arrays of clones from phage-display libraries can be probed with antigen-coated filters for high-throughput antibody screening (52). Proteins covalently attached to glass slides through aldehyde-containing silane reagents have been used to detect protein-protein interactions, enzymatic targets, and protein-small molecule interactions (53). Other methods of generating protein microarrays involve printing proteins (i.e., purified proteins, recombinant proteins, and crude mixtures) or antibodies in an ordered array onto a coated microscope slide with a robotic arrayer. Protein solutions to be measured are labeled by covalent linkage of a fluorescent dye to the amino groups on the proteins (54). Protein arrays consisting of immobilized proteins from pure populations of microdissected cells have been used to identify and track cancer progression (55). Although protein arrays hold considerable promise for functional proteomics and expression profiling for monitoring a disease state, certain limitations need to be overcome. These include the development of high-throughput technologies to express and purify proteins, and the generation of large sets of well-characterized antibodies. Generating protein and antibody arrays is more costly and labor-intensive than generating DNA arrays. Nevertheless, the availability of large antibody arrays would enhance the discovery of differential biomarkers in nondiseased and cancer tissue. An excellent review of current promises and evolving technologies in protein-antibody arrays for medical applications can be found in the article by Cahill (56).
tissue microarrays
Tissue arrays have been developed for high-throughput molecular profiling of tumor specimens (57). Arrays are generated by robotically punching out small cylinders (0.6 mm × 3–4 mm high) of tissue from thousands of individual tumor specimens embedded in paraffin and arraying them in a paraffin block. Tissue from as many as 600 specimens can be represented in a single "master" paraffin block. By use of serial sections of the tissue array, tumors can be analyzed in parallel by immunohistochemistry, fluorescence in situ hybridization, and RNA-RNA in situ hybridization. Tissue arrays have applications in the simultaneous analysis of tumors from many different patients at different stages of disease. Disadvantages of this technique are that a single core may not be representative because of tumor heterogeneity and that antigen stability on long-term storage of the array is uncertain. Hoos et al. (58) demonstrated that using triplicate cores per tumor led to lower numbers of lost cases and lower nonconcordance with typical full sections relative to one or two cores per tumor. Camp et al. (59) found no antigenic loss after storage of an array for 3 months. Validation of tissue microarrays is currently ongoing in breast and prostate cancers and will undoubtedly help in protein expression profiling (57)(59)(60). A major advantage of this technology is that expression profiles can be correlated with outcomes from large cohorts in a matter of a few days.
bioinformatic issues
Bioinformatic tools are needed at all levels of proteomic analysis. The main databases serving as the targets for MS data searches are the expressed sequence tag and the protein sequence databases, which contain protein sequence information translated from DNA sequence data (29). It is thought that virtually any protein that can be detected on a 2-DE gel can be identified through the expressed sequence tag database, which contains over 2 million cDNA sequences (61). A modification of sequence-tag algorithms has been shown to locate peptides given the fact that the expressed sequence tags cover only a partial sequence of the protein (62).
Software packages and bioinformatic tools have been and are being developed to analyze 2-DE protein patterns. These software applications possess user-friendly interfaces that are incorporated with tools for linearization and merging of scanned images. The tools also help in segmentation and detection of protein spots on the images, matching, and editing (63). Additional features include pattern recognition capabilities and the ability to perform multivariate statistics. The handling and analysis of the type of data to be collected in proteomic investigations represent an emerging field (64). New techniques and new collaborations between computer scientists, biostatisticians, and biologists are called for. There is a need to develop and integrate database repositories for the various sources of data being collected, to develop tools for transforming raw primary data into forms suitable for public dissemination or formal data analysis, to obtain and develop user interfaces to store and retrieve and visualize data from databases, and to develop efficient and valid methods of data analysis. The sheer volume of data to be collected and processed will challenge the usual approaches to data handling and analysis.
Analyzing data of this dimension is a fairly new endeavor for statisticians, for which there is not an extensive technical statistical literature. The approach developed in the laboratory of S. H. is to store primary data in databases designed to record not only the amounts of each protein detected but also sample-identifying information and all pertinent data involved in the physical processing of the samples. Other information related to the proteins gathered from 2-DE gels or from other sources, such as MS profiles, is also stored. Data warehouse technology is used to improve efficiency and accuracy in accessing databases and to enhance the schema to be more flexible and comprehensive. Efforts to capture outside data conceptually involve five processes: (a) metamodeling, in which the metamodel contains information describing the model and the definition of the source data elements; (b) modeling, in which the model contains a simple data structure with very few assumptions about the relationships between data; (c) data retrieval, in which data are extracted and delivered to the data warehouse on the basis of a selected model; (d) data transformation, in which data are transformed on the basis of requirements; and (e) data transfer, in which processed data are transferred to the data warehouse. The data-retrieval, transformation, and transfer processes ideally should be automated because the volume of data is so great and incremental loading of new data to the data warehouse is required on a periodic basis. Automated techniques include data mapping from the operational databases to the data warehouse by use of algorithms for data conversions, filtering, reformatting, referential integrity checking, indexing, ensuring consistency, and data quality management. A data warehouse provides advanced query functionality plus data analysis extensions that help unearth the hidden relationships within the data.
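The five conceptual processes listed above can be sketched as a small pipeline. This is a hedged illustration only: the field names, the table name `spot_measurements`, the sample records, and the use of an in-memory SQLite database are all assumptions made for the example and do not describe the actual system in the laboratory of S. H.

```python
import sqlite3

# (a) Metamodel: describes each source data element (hypothetical fields).
METAMODEL = {
    "spot_id": {"type": str, "description": "2-DE spot identifier"},
    "sample_id": {"type": str, "description": "sample identifier"},
    "intensity": {"type": float, "description": "integrated spot intensity"},
}

# (b) Model: a flat record with no assumed relationships between fields.
MODEL_FIELDS = ("spot_id", "sample_id", "intensity")

def retrieve(source_rows):
    """(c) Retrieval: extract only the fields named in the model."""
    for row in source_rows:
        yield {field: row.get(field) for field in MODEL_FIELDS}

def transform(records):
    """(d) Transformation: drop incomplete records and convert field types."""
    for record in records:
        if any(record[f] in (None, "") for f in MODEL_FIELDS):
            continue  # filtering step: skip records with missing values
        yield tuple(METAMODEL[f]["type"](record[f]) for f in MODEL_FIELDS)

def transfer(rows, connection):
    """(e) Transfer: load rows into the warehouse; reruns update existing keys."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS spot_measurements ("
        "spot_id TEXT, sample_id TEXT, intensity REAL, "
        "PRIMARY KEY (spot_id, sample_id))"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO spot_measurements VALUES (?, ?, ?)", rows
    )
    connection.commit()

# Hypothetical operational records; the second one lacks an intensity value.
source = [
    {"spot_id": "S1", "sample_id": "T01", "intensity": "1520.5", "operator": "x"},
    {"spot_id": "S2", "sample_id": "T01", "intensity": ""},
    {"spot_id": "S1", "sample_id": "N01", "intensity": "310"},
]
warehouse = sqlite3.connect(":memory:")
transfer(transform(retrieve(source)), warehouse)
loaded = warehouse.execute(
    "SELECT spot_id, sample_id, intensity FROM spot_measurements ORDER BY sample_id"
).fetchall()
print(loaded)
```

Chaining generators keeps each stage independent, and the `INSERT OR REPLACE` statement is one simple way to support the periodic, incremental loading that the text calls for.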
There are several degrees of complexity in the investigation of data, from day-to-day interactions with the protein patterns of individual measurement systems to query and manipulation of data from multiple experiments and/or sources of information. Another level of user interrogation of data involves interaction with the data warehouse. Generally, users retrieve data and formulate queries to test hypotheses and establish conclusions. Formulating queries can be a difficult task, requiring extensive syntactic and semantic knowledge. Syntactic knowledge is needed to ensure that a query is well formed and references existing relationships and attributes. Semantic knowledge is needed to ensure that a query satisfies user intent. A user often has an incomplete understanding of the contents and structure of the data warehouse. Consequently, our work has to provide automated techniques for query formulation that substantially reduce the amount of knowledge required by data warehouse users.
2-DE-related protein databases are helping map proteins from specific cells and from different stages of tumor development (65)(66). Many of these are in the public domain, available through the World Wide Web on servers such as ExPASy and the world 2-DE page (67). Annotated protein databases such as SWISS-PROT and TrEMBL are fast becoming critical proteome resources (68). Such tools facilitate the analysis of posttranslational modifications and three-dimensional structure and physicochemical properties of identified proteins (2). These databases are becoming invaluable resources of protein maps from such tissues as breast and bladder transitional cell and squamous cell carcinomas (2).
Protein data derived from 2-DE analysis have been used to develop artificial learning models to help classify tumors into benign, borderline, and malignant (24). Statistical algorithms such as partial least squares and hierarchical clustering have been used to that effect. Schmid et al. (69) clustered lung tumor cell lines according to their sources of origin, as adenocarcinoma, small cell lung cancer, or mesothelioma through the use of correspondence analysis and hierarchical clustering algorithms (70).
The discrete wavelet transform (DWT) has been used on MS data for denoising and data compression (71). DWT is also useful for feature extraction in discrimination and classification problems for very high dimensionality data in molecular biology. In discriminant analysis, feature extraction seeks a transformation of the original data to a lower-dimension space. The most widely used technique is principal components analysis (PCA) (72). For a data set with very high dimensionality, the number of variables may be 100 times the sample size in the training set. Therefore, PCA and all transforms based on sample variance-covariance matrices are not useful because the number of entries in the sample variance-covariance matrices is too large to be estimated from so few samples. Like PCA, DWT also produces an orthogonal linear transformation of the original data to a reduced-dimension space, but it does not need the sample variance-covariance matrices in computation. This biostatistical computational approach was used to develop a learning algorithm to discriminate prostate cancer from noncancer on the basis of the ProteinChip SELDI serum-protein profiles. Each SELDI-TOF profile contains 28 672 data points for each serum sample. A training set of 248 samples, including 167 cancer cases and 81 controls, was used to develop the algorithm. A DWT with the Haar wavelet and thresholding (73) yielded 3931 Haar wavelet coefficients, of which 16 coefficients or protein masses were selected as the classifier that separated cancer samples from nondiseased samples (Fig. 4B). A test set consisting of serum from 15 age-matched healthy donors and 30 prostate cancer patients yielded a sensitivity of 99% and a specificity of 98% (41). Genetic clustering algorithms have also been applied successfully for analyzing SELDI profiling data (47).
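To make the Haar wavelet step concrete, the following Python sketch computes a full orthonormal Haar DWT and applies a hard threshold. It is a minimal illustration under stated assumptions, not the algorithm of reference (41) or (73): the toy eight-point signal and the threshold of 1.0 are invented, the input length is assumed to be a power of two (a 28 672-point profile would need padding or segmenting, which the text does not specify), and the later selection of 16 discriminating coefficients is not shown.

```python
import math

def haar_dwt(signal):
    """Full orthonormal Haar decomposition of a power-of-two-length signal.

    Returns [coarsest approximation, coarsest detail, ..., finest details].
    No covariance matrix is needed: each level uses only pairwise sums and
    differences of neighboring values.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("signal length must be a power of two")
    approx = list(signal)
    details = []
    scale = math.sqrt(2.0)
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Prepend so that coarser-level details end up before finer ones.
        details = [(a - b) / scale for a, b in pairs] + details
        approx = [(a + b) / scale for a, b in pairs]
    return approx + details

def hard_threshold(coefficients, threshold):
    """Zero every coefficient whose magnitude is below the threshold."""
    return [c if abs(c) >= threshold else 0.0 for c in coefficients]

# Toy "spectrum": flat baseline with one two-point peak.
signal = [4.0, 4.0, 4.0, 4.0, 9.0, 9.0, 4.0, 4.0]
coefficients = haar_dwt(signal)
kept = hard_threshold(coefficients, 1.0)
print(sum(1 for c in kept if c != 0.0), "of", len(kept), "coefficients retained")
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared signal values, so discarding small coefficients removes little of the signal energy while shrinking the number of candidate features.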
Novel software programs are available to identify proteins from the peptide mass fingerprinting output of MALDI-TOF by incorporating a parametric multilevel scoring algorithm (74). MALDI-TOF spectra and environmental data such as species, isoelectric point, molecular weight, chemical modification, and so forth are used by the program to perform an automated identification of the protein. The two-step process incorporates a peak detection algorithm and an identification algorithm that searches the protein sequence databases SWISS-PROT and TrEMBL for entries that match the input data.
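The matching step in peptide mass fingerprinting can be illustrated with a deliberately simple Python sketch that counts observed peaks falling within a mass tolerance of each candidate's theoretical peptide masses. This is not the parametric multilevel scoring algorithm of the cited software; the candidate names, the masses, the 50 ppm tolerance, and the function names are all hypothetical.

```python
def count_matches(observed, theoretical, tol_ppm=50.0):
    """Count observed peaks within tol_ppm of any theoretical peptide mass."""
    return sum(
        any(abs(obs - theo) / theo * 1e6 <= tol_ppm for theo in theoretical)
        for obs in observed
    )


def rank_candidates(observed, database, tol_ppm=50.0):
    """Rank candidate proteins by number of matched peaks, best first."""
    scores = {
        name: count_matches(observed, masses, tol_ppm)
        for name, masses in database.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up theoretical peptide masses (Da) for two hypothetical database entries
toy_database = {
    "protein_A": [842.51, 1045.56, 1479.80, 2211.10],
    "protein_B": [900.40, 1300.65, 1750.90],
}
# Made-up peak list, as a peak detection step might produce
observed_peaks = [842.50, 1045.58, 1479.79, 1999.00]

ranking = rank_candidates(observed_peaks, toy_database)
print(ranking)
```

A real scoring scheme would also weigh additional evidence such as species, isoelectric point, and molecular weight, as the paragraph above describes.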
Future approaches in proteomics
The cellular proteome is a dynamic profile and is subject to changes in response to various signals and as a part of disease progression. This occurs through an interplay of posttranslational modifications, translocation, protein-protein interactions, and protein-nucleic acid interactions (75). An important aspect of cancer proteomics is the ability to target and analyze subsets of proteins. Protein subsets include those with a specific modification, those with the ability to bind a specific sequence or site, multimeric complexes, and so forth. In a parallel development, there is a focus on subcellular proteomes to target cellular organelles. This is an attempt to reduce the complexity of the eukaryotic cell to ascertain meaningful information related to cancer biology. Specific compartments under scrutiny include mitochondria, lysosomes, peroxisomes, endoplasmic reticulum, Golgi apparatus, endocytic vesicles, and the nucleus (76).
Proteomics complements genomic-based approaches in the study of cancer. Proteins are the functional output of a cell and form an intrinsic part of its dynamic network. Their expression, activities, and location can be changed at any time through posttranslational modifications. Studies indicate that there are up to six different protein forms per gene in humans (77), and understanding their functional status in nondiseased tissue and at various stages of disease progression will provide insights into designing strategies for prevention, diagnosis, and therapeutics. Cancer cells exhibit a large number of posttranslational modifications, which have important consequences in determining the phenotypic fate of the cell. Proteomics is essential in detecting those changes in protein profiles that can lead to a more comprehensive understanding of the disease process.
Despite the promise of proteomic technologies, there are limitations that need to be overcome to increase sensitivity and enhance information capture. The identification of low-abundance proteins may often be hindered by highly abundant ones, which mask them. Important biomarker information could thus be lost because there is no PCR equivalent available to amplify proteins. Furthermore, given the complex nature of carcinogenesis and the heterogeneous nature of cellular interaction within the microenvironment of a tumor, analysis of the appropriate cell population is necessary to obtain meaningful proteomic output and screen out background noise. This can be achieved by obtaining protein profiles from specific cell populations to avoid potential contamination from sources not related to the onset or progression of the cancer. To identify biomarkers helpful in risk assessment, detection, or follow-up of malignancy, the use of the appropriate cellular tissue is important. Laser capture microdissection has helped in sampling specific cell populations directly from tissue sections without causing any mechanical disruption while maintaining cell viability (78)(79). The procedure involves the use of a thermosensitive polymer film layered over the targeted cells. The film is then melted by an infrared laser, forming a solid composite with the target, and lifted from the rest of the tissue without disruption to the cellular dynamics or tissue morphology. Specific cell populations have also been isolated using the ultraviolet laser microscope system with pressure catapulting (80). Tissue mass around target cells is cut away, leaving a homogeneous population of cells. Flow cytometry has evolved into a powerful technique for distinguishing, quantifying, and sorting different types of cells. Optical measurements are used to identify unique structural features of a cell that distinguish it from other cell types.
Such methods of sample processing are helping to analyze protein expression in premalignant lesions (81)(82)(83), which is important in capturing critical biomarker information for risk assessment and detection.
As new protein biomarkers are discovered through proteomic approaches, the need to validate them and ultimately use them in a clinical setting increases. This can be done only as a collaborative effort among research communities. The National Cancer Institute has taken a lead role in this regard by creating the Early Detection Research Network. This network brings together national and international experts from academia and industry to promote biomarker discovery and validation and to help translate biomarkers into clinical practice. Exciting information is expected to emerge from such a collaborative effort that can ultimately be applied to population screening for the early detection and diagnosis of cancer.
Footnotes
1 Nonstandard abbreviations: 2-DE, two-dimensional electrophoresis; MS, mass spectrometry; RCC, renal cell carcinoma; MALDI, matrix-assisted laser-desorption ionization; TOF, time of flight; SELDI, surface-enhanced laser desorption-ionization; BPH, benign prostatic hyperplasia; PSA, prostate-specific antigen; DWT, Discrete Wavelet Transform; and PCA, principal components analysis.
References
is down-regulated in human breast cancer cells. Cancer Res 2001;61:76-80.
locus leads to gene silencing in breast cancer. Proc Natl Acad Sci U S A 2000;97:6049-6054.