The consensus sequences determined by Phrap are viewed in another program called Consed. This program makes it easy to find bases with low Phred scores.
Consed shows base quality by letter case: upper-case letters for high-quality bases and lower-case letters for low-quality bases. Finer gradations are shown with a background color gradient from white to black, with white indicating the highest quality. Mismatches with the consensus are highlighted in red, and inserted bases are marked with an asterisk.
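To make the scoring concrete, here is a minimal Python sketch of how a Phred score relates to the chance that a base call is wrong, using the standard definition Q = -10 log10(P). The function names and the cutoff of 20 for switching between upper and lower case are assumptions chosen for illustration; they are not taken from Consed, which has its own settings.

```python
def phred_to_error_prob(q):
    """Convert a Phred score Q to the estimated probability that the
    base call is wrong, using the standard relation Q = -10 * log10(P)."""
    return 10 ** (-q / 10)


def case_by_quality(bases, quals, threshold=20):
    """Return the bases with upper case for scores at or above the
    threshold and lower case below it.

    The threshold of 20 is an illustrative assumption, not Consed's
    actual setting.
    """
    return "".join(
        b.upper() if q >= threshold else b.lower()
        for b, q in zip(bases, quals)
    )


def low_quality_positions(quals, threshold=20):
    """List the 0-based positions whose score falls below the threshold."""
    return [i for i, q in enumerate(quals) if q < threshold]
```

For example, a Phred score of 20 corresponds to an estimated 1 in 100 chance of an incorrect base call, and a score of 30 to 1 in 1,000.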
Consed also allows you to compare electropherogram traces from different lanes. Reviewing the trace files can hint at the cause of low Phred scores.
In addition to assembly quality, the source DNA is checked for possible contamination. Because each organism has a characteristic percentage of its genome made up of Gs and Cs (G+C content), contamination can be identified by plotting the distribution of G+C content across the sequences: DNA from a different organism tends to show up as a separate peak.
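The G+C check can be sketched in a few lines of Python. This is only an illustration of the idea, not the pipeline's actual code: the function names, the 5-percentage-point bin width, and the choice to ignore ambiguous bases such as N are all assumptions.

```python
from collections import Counter


def gc_percent(seq):
    """Return the percentage of G and C among the unambiguous bases
    (A, C, G, T) in a sequence. Ambiguous bases such as N are ignored."""
    seq = seq.upper()
    gc = sum(1 for base in seq if base in "GC")
    acgt = sum(1 for base in seq if base in "ACGT")
    return 100 * gc / acgt if acgt else 0.0


def gc_histogram(seqs, bin_width=5):
    """Count sequences per G+C bin. Bin k covers percentages from
    k * bin_width up to (but not including) (k + 1) * bin_width;
    100% is folded into the last bin."""
    last_bin = 100 // bin_width - 1
    counts = Counter()
    for seq in seqs:
        bin_index = min(int(gc_percent(seq) // bin_width), last_bin)
        counts[bin_index] += 1
    return counts
```

Plotting the counts from `gc_histogram` against the bin positions gives the kind of distribution described above. A single peak is what one would expect from a single organism, while a second, well-separated peak would be a reason to look more closely.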
Suspicious G+C plots are then verified by performing a BLAST search. This program, available through NCBI, is used to compare JGI sequences to the known sequences of other organisms. Hits to closely related organisms validate the source DNA, whereas hits to distantly related organisms may represent contamination.
Once the draft assembly has passed quality assessment, the sequence is submitted to NCBI’s GenBank for public use.