Laboratory Management |
European Molecular Genetics Quality Network, National Genetics Reference Laboratory, St. Mary's Hospital, Manchester, United Kingdom.
Address correspondence to this author at: European Molecular Genetics Quality Network (EMQN), c/o National Genetics Reference Laboratory, St. Mary's Hospital, Hathersage Road, Manchester M13 0JH, United Kingdom. Fax 44-161-276-6606; e-mail simon.patton{at}cmmc.nhs.uk.
| Abstract |
|---|
Methods: Current practice for DNA sequence analysis was established by use of an online questionnaire. Participating laboratories were provided with 4 DNA samples of validated genotype. Evaluation of the results included assessing the quality of sequence data, variant genotypes, and mutation nomenclature. To accommodate variations in mutation nomenclature, variants indicated by participants were scored for compliance with 3 acceptable marking schemes.
Results: A total of 346 genotypes were analyzed, among which 19 (5%) genotyping errors were made: 10 (53%) were false-negative and 9 (47%) were false-positive results. A further 27 (8%) errors were made in naming mutations. Results were analyzed for 3 indicators of data quality: PHRED quality scores, Quality Read Length, and Quality Read Overlap. Most laboratories produced results of acceptable diagnostic quality as judged by these indicators. The results were used to calculate a consensus benchmark for DNA sequencing against which individual laboratories could rank their performance.
Conclusions: We propose that the consensus benchmark can be used as a baseline against which the aggregate and individual laboratory standard of DNA sequencing may be tracked from year to year.
| Introduction |
|---|
The EMQN was formed in 1997 in response to concerns about the lack of provision of external quality assessment (EQA) schemes for diagnostic molecular genetics laboratories, to improve and harmonize EQA for molecular genetic testing in Europe. From 1999 to 2002, EMQN received funding from the European Commission under the Framework IV program for Standards, Measurement and Testing (contract no. SMT4-CT98-7515). Since then, it has been supported through subscription of the participants, and has been coordinated from the National Genetics Reference Laboratory at Manchester, United Kingdom (7). Recently, the European Commission's Joint Research Centre highlighted the need for improved quality assurance and harmonization of genetic testing services in the European Union (8). To date, most EQA schemes for molecular genetics have focused on gene-specific targets relevant to clinical applications such as inherited breast cancer (9), cystic fibrosis (10)(11), familial thrombophilia (12), and hepatitis B testing (13). Laboratory performance has been assessed on the basis of the ability to accurately detect genotypes. Some EQA schemes related to clinical testing have also sought to assess clinical interpretation by use of expert opinion and consensus guidelines such as those provided by EMQN (14) and the European Academy for Andrology (15). Although there is a strong demand to add gene-specific schemes, the number of gene targets relevant to clinical testing, >1000 reported to date (8), places a practical limit on the coverage of EQA in this format. In response, EQA scheme providers have begun to develop generic technical schemes that are widely applicable to clinical, core technology, and other testing laboratories (16)(17)(18)(19)(20). DNA sequencing is an example of a widely used analytical technology in many biological laboratories. To address this need, the EMQN has developed an EQA scheme for the evaluation of DNA sequence analysis.
The first pilot scheme ran in 2002 and involved 34 laboratories from 15 countries (21). Here we report the results of the second international survey of the quality of DNA sequencing among 64 laboratories from 21 countries. The EQA scheme assessed laboratory performance in DNA sequencing through the ability to detect variants and also assessed raw DNA sequence data, using standard statistical measures.
| Materials and Methods |
|---|
samples
The scheme organizer provided 450-bp PCR-amplified fragments of exon 10 of the cystic fibrosis transmembrane conductance regulator (CFTR) gene (GenBank accession no. M55115). The 4 test samples included all of the main types of sequence changes laboratories were likely to encounter. In addition, a stated wild-type control was provided. Samples were prepared by amplification using primer pair 1 (see below). To accommodate laboratories with sequencing strategies using universal primers, samples tagged with M13 tails (primer pair 2 or 3) were also provided. The primer sequences were as follows:
The amplification conditions were as follows: initial denaturation for 5 min at 95 °C, followed by 35 cycles of 94 °C for 1 min, 55 °C for 1 min, and 72 °C for 1 min, with a final 5-min extension at 72 °C. The reaction conditions were 20 ng of DNA, 1× CM102 Reddyload PCR mixture (Abgene), and 2 µM each of the forward and reverse primers in a total volume of 40 µL. Each sample was purified free of excess primers by use of a QuickStep™ Kit (Edge Biosystems).
validation
Before distribution, sample genotypes were validated by the scheme organizer and by an independent laboratory. All variant genotypes were concordant between the 2 centers.
distribution
EQA samples for distribution were aliquoted by use of a liquid-handling robot (Tecan Miniprep 75) into prelabeled 0.6-mL screw-cap tubes. Each sample tube contained 20 µL of DNA at a concentration of 35 ng/µL. The primer tubes contained 20 µL of either the forward or reverse primer at a concentration of 5 µM. Each participating laboratory received a package containing the scheme materials and instructions sent by next-day courier.
instructions
Participants were provided with a datasheet including a reference sequence, the protein translation, and location of primer sequences. Laboratories were asked to return a preprinted proforma with their genotyping results, along with color copies of the sequencing electropherograms and electronic copies of the sequence data files.
evaluation
Evaluation of the results included assessment of the genotypes and the quality of sequence data. To accommodate variations in the mutation nomenclature used, 3 marking schemes were devised (for an example, see Table 1). Correct genotypes were assigned a maximum of 2.00 marks. Incorrect data led to deductions from this score, with a completely wrong genotype scoring 0.00 marks. A score was awarded for 3 criteria (correct assignment of the nucleotide change, the systematic name, and the mutation effect) for each allele. The total score for each correct genotype was 3.00 marks. The maximum possible score for the scheme was 12.00 marks.
The quality of the sequence data provided by participants was also assessed, and laboratory performance was ranked. The following measures were used.
PHRED quality scores.
Each sequence data file was analyzed by use of a PHRED basecaller, which we used in conjunction with the InterPhace program (available from CodonCode Corporation) (22). PHRED assigns a quality score for each called base on a logarithmic scale. The PHRED quality score estimates the confidence with which each base has been called. A PHRED score of 20 equates to 99% confidence (1% probability of error). We measured the quality of the data by calculating the number of bases that equaled or exceeded 3 thresholds: PHRED 20, 30, and 40 (99%, 99.9%, and 99.99% confidence, respectively).
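The relationship between a PHRED score and base-calling confidence can be illustrated with a short sketch. It uses the standard definition Q = -10 log10(P), where P is the estimated probability that a base call is wrong; the function names and the example scores are hypothetical and are not taken from the scheme's analysis software.

```python
import math

def phred_to_error_probability(q):
    """Estimated probability that a base call with PHRED score q is wrong."""
    return 10 ** (-q / 10)

def error_probability_to_phred(p):
    """PHRED score corresponding to an estimated error probability p."""
    return -10 * math.log10(p)

def fraction_at_or_above(scores, threshold):
    """Fraction of called bases whose PHRED score equals or exceeds threshold."""
    if not scores:
        return 0.0
    return sum(1 for q in scores if q >= threshold) / len(scores)

# Hypothetical per-base scores for one short read (illustration only).
example_scores = [12, 25, 33, 41, 45, 38, 19, 50]

for threshold in (20, 30, 40):
    confidence = 1 - phred_to_error_probability(threshold)
    share = fraction_at_or_above(example_scores, threshold)
    print(f"PHRED {threshold}: confidence {confidence:.2%}, "
          f"{share:.1%} of bases meet the threshold")
```

In this sketch, the proportion of bases meeting each threshold is the per-read analogue of the percentages reported for each laboratory.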
Quality Read Length (QRL).
Sequence data files were assembled into contigs by use of Pregap4, a contig assembly program available as part of the Staden analysis package (23). Sequencing data at the beginning and end of each read were quality-clipped wherever a window of 30 bases dropped below a mean PHRED score of 20. The proportion of unclipped data was expressed as a percentage of the total possible read for the fragment, i.e., the number of bases from the end of the sequencing primer to the end of the fragment. The QRL is a measure of the proportion of each sequence read in a single orientation comprising good analytical quality data (minimum base-calling confidence of 99%).
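The quality-clipping step can be sketched as follows. This is a simplified illustration, not the algorithm implemented in Pregap4: it assumes that the retained region runs from the first to the last 30-base window whose mean PHRED score is at least 20, and all names and example values are hypothetical.

```python
def quality_clip(scores, window=30, min_mean=20.0):
    """Return (start, end) half-open indices of the retained region.

    Assumption: the region spans from the first to the last window whose
    mean PHRED score is >= min_mean. Returns (0, 0) if no window qualifies.
    """
    n = len(scores)
    if n < window:
        return (0, 0)
    # Prefix sums make each window mean an O(1) lookup.
    prefix = [0]
    for q in scores:
        prefix.append(prefix[-1] + q)
    good_starts = [
        i for i in range(n - window + 1)
        if (prefix[i + window] - prefix[i]) / window >= min_mean
    ]
    if not good_starts:
        return (0, 0)
    return (good_starts[0], good_starts[-1] + window)

def quality_read_length(scores, max_read_length, window=30, min_mean=20.0):
    """Unclipped bases as a percentage of the total possible read."""
    start, end = quality_clip(scores, window, min_mean)
    return 100.0 * (end - start) / max_read_length

# Hypothetical read: poor quality at both ends, good quality in the middle.
read_scores = [5] * 20 + [40] * 300 + [5] * 30
print(quality_clip(read_scores))
print(round(quality_read_length(read_scores, max_read_length=350), 2))
```

Here `max_read_length` stands for the number of bases from the end of the sequencing primer to the end of the fragment.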
Quality Read Overlap (QRO).
The QRL scores for each sample (from both forward and reverse sequence reads) were combined to yield a region of data, the QRO, where good diagnostic quality data were obtained in both orientations. The QRO is expressed as a percentage of the maximum possible QRO, which is the number of nucleotides between the ends of the forward and reverse sequencing primers on the target DNA. The QRO represents the data from each pair of reads that can be considered as "reportable quality".
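A minimal sketch of the QRO calculation is given below. It assumes that the quality-clipped forward and reverse reads have already been mapped to half-open (start, end) coordinates on the target fragment; the function name and the coordinates are hypothetical.

```python
def quality_read_overlap(forward, reverse, max_overlap_length):
    """Percentage of the maximum possible overlap covered by both reads.

    forward, reverse: (start, end) half-open intervals, in fragment
    coordinates, of the quality-clipped data in each orientation.
    max_overlap_length: nucleotides between the ends of the forward and
    reverse sequencing primers on the target DNA.
    """
    start = max(forward[0], reverse[0])
    end = min(forward[1], reverse[1])
    overlap = max(0, end - start)  # zero when the intervals do not intersect
    return 100.0 * overlap / max_overlap_length

# Hypothetical clipped intervals for one sample.
print(round(quality_read_overlap((25, 400), (60, 430), 410), 2))
```

Only positions with good-quality data in both orientations count toward the QRO, so a read that is poor in one direction lowers the score even if the other direction is excellent.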
| Results |
|---|
genotyping
All laboratories obtained results from the materials provided. Thirty-six (59%) of the participants scored the maximum 12.00 marks for genotyping when measured against the marking schemes (Table 1). Among the 28 (41%) laboratories that failed to obtain maximum marks, 19 genotyping errors (5% of the 346 genotypes analyzed) were made; 10 of the genotyping errors (53%) were false-negative results and 9 (47%) were false-positive results. In one case, a laboratory recorded a false-positive mutation that was not reported by any other laboratory. A further 27 errors were made in the naming of the mutations (Table 2). When a laboratory had miscalled or missed an allele, we examined its electronic sequence data to establish whether the change was identifiable. In all cases, we could see the expected change in the electronic data provided by the laboratory. A genotype score was calculated as the total score divided by the maximum possible score (Fig. 2; also see Fig. 1 in the online Data Supplement). Seventeen (28%) laboratories had a score that fell below the mean (0.94).
quality of sequence data
The output from 57 (93%) of the participating laboratories was analyzed. Data from 4 laboratories were excluded because the format used was incompatible with the analysis programs.
The majority of sequence data exceeded a PHRED 20 quality score (Fig. 3A; also see Fig. 2A in the online Data Supplement). The proportion of base-calls meeting this standard averaged over all centers was 87.49% (range, 67.45%–95.89%). This was mirrored by the data for the PHRED 30 (Fig. 3B; also see Fig. 2B in the online Data Supplement) and PHRED 40 (Fig. 3C; also see Fig. 2C in the online Data Supplement) analyses, although a lower proportion of data produced by the centers exceeded these thresholds [mean scores of 79.77% (range, 46.28%–92.38%) and 63.78% (range, 15.07%–83.72%), respectively].
The distribution of QRL and QRO scores expressed as a percentage (Figs. 4 and 5; also see Figs. 3 and 4 in the online Data Supplement) broadly mirrored the distributions of the PHRED scores (mean score, 86.09%; range, 57.48%–99.78%). However, there was a more pronounced drop-off at the lower end of the distribution. Data from this subset of laboratories were generally of lesser quality in both sequence orientations. Laboratories with PHRED 40 scores above the mean value tended to have good QRL and QRO scores (for example, data from laboratories 99 and 98). However, there were some exceptions to this correlation as some laboratories with poorer quality data as measured by the PHRED score thresholds had good QRL and QRO scores. For example, as can be seen in Fig. 3 (also see Fig. 2 in the online Data Supplement for Laboratory ID codes), the quality of the sequence data from laboratory 34 as measured by PHRED was below average. However, as can be seen in Fig. 5 (see also Fig. 4 in the online Data Supplement), the QRO from this same laboratory was above average, indicating that the data must be of even quality in both orientations. To resolve this apparent anomaly, we contacted the laboratory directly. The combination of the chemistry, polymer, and capillary length they use is optimized to allow them to rapidly run several different types of assays at the same time. Although their protocol is not optimized to give "perfect" sequence data, it does provide consistent quality data with long read lengths in both orientations. Overall, the trend was for laboratories that produce low PHRED quality scores to also have low scores for QRL and QRO.
To allow laboratories to gauge their performance, we established a consensus standard for the quality of the sequencing data by ranking the performance of each participant against the %QRO (Fig. 6). To ensure that the data will be comparable in successive EQA schemes and to allow laboratories to determine their performance relative to their peers, we normalized the ranking on a centile scale of 0–100 and established decile performance indicators (D1 to D10). Including an identical fragment in successive EQA schemes will allow tracking of the aggregate standard of DNA sequencing among the participants.
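One way such a normalized ranking could be computed is sketched below. The exact normalization used for Fig. 6 is not described here, so the sketch makes several assumptions: laboratories are ranked in ascending order of %QRO, the lowest rank maps to centile 0 and the highest to centile 100, D1 denotes the lowest decile, and ties are broken arbitrarily. The laboratory codes and scores are invented.

```python
def centile_ranks(qro_by_lab):
    """Map each laboratory to a centile (0-100) based on its %QRO rank.

    Assumption: ascending rank, linearly rescaled so that the lowest-ranked
    laboratory gets 0 and the highest-ranked gets 100.
    """
    labs = sorted(qro_by_lab, key=qro_by_lab.get)
    n = len(labs)
    if n == 1:
        return {labs[0]: 100.0}
    return {lab: 100.0 * i / (n - 1) for i, lab in enumerate(labs)}

def decile_label(centile):
    """Decile indicator D1 (lowest) to D10 (highest) for a centile value."""
    return f"D{min(int(centile // 10) + 1, 10)}"

# Invented %QRO values for three laboratories.
qro = {"A": 60.0, "B": 95.0, "C": 80.0}
for lab, centile in sorted(centile_ranks(qro).items()):
    print(lab, centile, decile_label(centile))
```

Because the scale is relative, a laboratory's centile reflects its standing among that year's participants; tracking absolute quality across years relies on the common fragment mentioned above.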
In Fig. 6, laboratories that made genotyping errors are highlighted by white (false-negative), black (false-positive), and black-and-white-shaded (false-positive and -negative) arrows. The spread of these laboratories across the ranked data set suggests that it is not the quality of data that leads to genotype errors. The most likely explanation is human error, failure of mutation scanning software, or a combination of both because many laboratories use a manual check after an initial pass through a software system. In all cases involving false negatives, the assessors were able to identify the change from the raw data.
| Discussion |
|---|
When new reports describe variants according to current recommendations, instead of using traditional descriptions, experts in the field experience problems "recognizing" these variants. However, nomenclature rules should be universal and thus cannot be made to apply for specific situations.
We noticed a common error in mutation naming whereby laboratories incorrectly prefixed the mutation with a "c." instead of a "g." when classifying according to a genomic sequence. We did not deduct marks for this but recommended that laboratories take greater care when naming any changes they find in a DNA sequence. Twenty-seven (59%) of the errors identified were errors in nomenclature, indicating that this is an evolving and poorly standardized area. The remaining 19 (41%) errors were false-positive or -negative observations.
There appears to be no correlation between the quality of sequence data and the likelihood of a laboratory making a genotyping mistake (Fig. 6). Identifying mutations is made difficult by the high information content of sequence data. The feature that sets DNA sequencing apart from other mutation scanning systems is the large number of data points to be analyzed. Electronic DNA sequencing data can be analyzed for mutations visually or by use of software. Visual inspection is generally accepted as being the "gold standard", but the mental effort needed means that only small amounts of data can be reliably analyzed. Many software packages have been designed for research applications, where false negatives are not as critical. The performance of automated systems, for example, SeqScape (26) and Mutation Surveyor (27), falls short of the needs of analytical laboratories. Laboratories should ensure that any software that they use to analyze sequence data is adequately validated for the intended end use of the data, especially in a clinical setting.
This EQA scheme measures quality and provides laboratories and core sequencing facilities with an indicator of performance. The lowest mean PHRED 20 score in this series (67.45%) indicated that for the 450-bp fragment sequenced, on average, 303 bp (67.45% × 450 bp) of the sequence had a base-calling error rate ≤1%. Therefore, by definition, 147 bp (32.55%) of that sequence had an error rate >1%. Given that more than 100 bp of data had been called at this level of error, it is highly likely that these data will contain base-calling errors. Assuming an error rate of 1% for the 147-bp poor-quality portion of the read, the likelihood of making at least one mistake in base-calling is 77.2% [probability calculated as $(1 - 0.99^{147}) \times 100$]. Sequence quality underlies base-calling accuracy, although base-calling errors need not necessarily lead to variant detection error. Normalizing the QRO ranking on a centile scale and using decile performance indicators allows laboratories to measure the improvement or deterioration of their performance relative to their peers in successive EQA schemes. The inclusion of one material common to successive schemes will allow aggregate performance to be followed (even if different sets of laboratories participate from scheme to scheme). These measures are suggested as a benchmark for laboratory performance in the analysis of DNA sequence.
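The arithmetic behind the 77.2% figure can be checked with a few lines of code. The sketch assumes, as the calculation in the text implicitly does, that base-calling errors at different positions are independent; the function and variable names are hypothetical.

```python
def prob_at_least_one_error(n_bases, per_base_error):
    """Probability of one or more wrong calls among n_bases independent calls."""
    return 1.0 - (1.0 - per_base_error) ** n_bases

fragment_bp = 450
good_fraction = 0.6745  # lowest mean PHRED 20 score in this series

good_bp = int(fragment_bp * good_fraction)  # 303 bp (303.525 truncated, as in the text)
poor_bp = fragment_bp - good_bp             # 147 bp below PHRED 20

p = prob_at_least_one_error(poor_bp, 0.01)
print(f"{good_bp} bp good, {poor_bp} bp poor, P(at least one error) = {p:.1%}")
```

Because bases below PHRED 20 have error rates above 1%, the 1% assumption makes this a lower bound on the probability for the poor-quality portion of the read.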
| Acknowledgments |
|---|
| Footnotes |
|---|
| References |
|---|
|
|
|---|