Interpreting Chromatograms

In general, the least reliable areas of a sequence read are the very beginning and the end. Below is an example of how a well-prepared, accurately quantitated DNA sample might sequence. See our Troubleshooting Your Data section for examples of what less-than-optimal data might look like.

Sequence data at the beginning (Image One):

This is an example of a chromatogram from the pGEM control that we include on every run. Note that the first ten or so peaks are somewhat jumbled, rounded and poorly resolved. There is also a small amount of unincorporated dye, though not enough to interfere significantly. Peak height is somewhat lower at the beginning as well. Notice that the two A’s at bp 19-20 are not well separated. All of this is normal, but it reinforces the suggestion to place your primer about 40-50 base pairs away from your sequence of interest so that the region you care about falls clear of these issues. Also, keep in mind that your primer will NOT be visible in your sequence data: the primer itself is not labeled, but the first nucleotide that extends from it is. So, theoretically, the first nucleotide in your sequence should be the first nucleotide after your primer. However, as the example shows, even with our ultrapure, accurately quantitated control the earliest data is not very reliable, so you shouldn’t expect to be able to analyze your sequence confidently from base 1.
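To make the primer placement suggestion concrete, here is a minimal Python sketch of the arithmetic. It assumes a forward primer taken from the same strand as the template string, 0-based indexing, and a 20-base primer. The function name place_forward_primer and the parameters gap and primer_len are hypothetical and chosen only for illustration; they are not part of any facility tool.

```python
def place_forward_primer(template, target_start, gap=50, primer_len=20):
    """Pick a forward primer so that `gap` read bases precede the target.

    template     : template sequence as a string (same strand as the primer)
    target_start : 0-based index of the first base of interest
    gap          : bases to leave between the primer's 3' end and the target
                   (the text suggests roughly 40-50)
    primer_len   : primer length; 20 is an assumed, typical value

    Returns (primer_start, primer_sequence, target_read_position), where
    target_read_position is the 1-based position in the read at which the
    first base of interest should appear.
    """
    primer_end = target_start - gap          # exclusive end of the primer slice
    primer_start = primer_end - primer_len
    if primer_start < 0:
        raise ValueError("not enough upstream sequence for this gap and primer length")

    primer = template[primer_start:primer_end]

    # The primer itself is not visible in the data, so read base 1 is the
    # template base immediately after the primer's 3' end.
    target_read_position = target_start - primer_end + 1
    return primer_start, primer, target_read_position


if __name__ == "__main__":
    demo_template = "ACGT" * 50              # made-up 200-base template
    start, primer, read_pos = place_forward_primer(demo_template, target_start=100)
    print(start, primer, read_pos)
```

With the default gap of 50, the first base of interest lands at read position 51, past the jumbled early peaks described above. This sketch only handles the position arithmetic; it does not check melting temperature, secondary structure or any other aspect of primer quality.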

Sequence data towards the middle (Image Two):

Notice how sharp and well defined the peaks are, with even spacing between them. Peak height is also significantly higher than in the earliest part of the read. There is little or no background interference at the peak baselines. Judged visually, this sequence would be considered 100% accurate.

Sequence data towards the end (Image Three):

Now you can see where the resolution is beginning to deteriorate. The peaks are no longer sharp but have become broad and rounded, especially where the same nucleotide occurs more than once in a row. The peak height is also reduced. Although this region was still basecalled with 100% accuracy (this control, in fact, was 99% accurate out to base 920), it can be difficult to interpret manually, and designing a primer from a region like this is definitely not advisable. Sequence from the end of a read like this might be useful when looking for sequence motifs or trying to determine exon-intron boundaries, but probably not for anything much more specific than that.
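If your basecaller reports a per-base quality value alongside each call, you can apply the same idea of setting aside the unreliable beginning and end in a script instead of by eye. The Python sketch below is one simple way to do that, not a description of how our data is processed. It assumes Phred-style scores are available as a list of numbers. The function name trim_ends, the window size of 10 and the cutoff of 20 are hypothetical choices for illustration.

```python
def trim_ends(quals, window=10, min_mean=20):
    """Return (start, end) so that quals[start:end] is the region to keep.

    quals    : list of per-base quality values (assumed Phred-style)
    window   : number of bases averaged at a time (assumed value)
    min_mean : minimum average quality for a window to count as reliable
               (assumed value)

    The read is scanned from the left for the first window whose mean meets
    the cutoff, then from the right for the last such window. Returns (0, 0)
    if no window qualifies.
    """
    n = len(quals)
    if n < window:
        return (0, 0)

    # Scan from the beginning of the read.
    start = None
    for i in range(n - window + 1):
        if sum(quals[i:i + window]) / window >= min_mean:
            start = i
            break
    if start is None:
        return (0, 0)

    # Scan backwards from the end of the read.
    end = n
    for j in range(n, window - 1, -1):
        if sum(quals[j - window:j]) / window >= min_mean:
            end = j
            break
    return (start, end)


if __name__ == "__main__":
    # Made-up scores: a poor start, a good middle and a deteriorating end.
    demo_quals = [5] * 15 + [40] * 100 + [8] * 30
    print(trim_ends(demo_quals))
```

Because each window is averaged, a few weak bases can remain at either edge of the kept region, so it is still worth looking at the chromatogram itself before relying on bases near the trim points.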