Common Sequencing Problems
Suggestions for generating high-quality DNA sequencing data:
- DNA quality is very important. Make sure the isolated DNA is free of contaminants such as buffer, salt, protein, and polymer. Submit purified DNA in water for sequencing.
- DNA quantity is very important. Make sure a sufficient amount of DNA is submitted for sequencing. O.D. readings tend to overestimate the concentration, so use agarose gel electrophoresis to estimate DNA concentration.
- For PCR products, please avoid polyA, polyT, and di- and tri-nucleotide repeats in the region to be sequenced. There should be a minimum of 30 - 50 bp between the 3' end of the sequencing primer and the point where an accurate read is needed. Also try to avoid submitting PCR products shorter than 250 bp, as they can easily saturate the instrument's detection limit and cause mobility shifts.
- Garbage in, garbage out! We use capillary electrophoresis for sequencing product separation and analysis. Any contamination present in the DNA or primer ends up in the final sequencing product, and when the product is loaded, so is the contamination. Excess contamination can suppress loading of the sequencing product, causing low signal as well as a host of other ill effects. So please make every effort to submit high-quality DNA in a sufficient amount for sequencing.
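The PCR product guidelines above could be checked before submission with something like the following Python sketch. It is only an illustration: the function name and the two cutoffs for what counts as a "run" or a "repeat" are assumptions made for this example, and only the 250 bp and 30 bp values come from the guidelines.

```python
import re

MIN_PRODUCT_BP = 250   # from the guideline above
MIN_GAP_BP = 30        # lower end of the 30 - 50 bp guideline above

# Assumed cutoffs (illustrative only, not facility rules).
HOMOPOLYMER_RUN = 8    # 8 or more consecutive A's or T's counts as a run
REPEAT_UNITS = 5       # a 2- or 3-base unit repeated 5 or more times


def check_pcr_product(seq, primer_3prime_pos, region_start):
    """Return a list of warnings for a PCR product given as a plain string.

    primer_3prime_pos: 0-based index of the primer's last (3') base.
    region_start: 0-based index of the first base that must read accurately.
    """
    seq = seq.upper()
    warnings = []

    if len(seq) < MIN_PRODUCT_BP:
        warnings.append(f"product is {len(seq)} bp, shorter than {MIN_PRODUCT_BP} bp")

    # Number of bases strictly between the primer's 3' base and the region.
    gap = region_start - primer_3prime_pos - 1
    if gap < MIN_GAP_BP:
        warnings.append(f"only {gap} bp between primer 3' end and region of interest")

    # polyA / polyT runs
    run_pattern = r"A{%d,}|T{%d,}" % (HOMOPOLYMER_RUN, HOMOPOLYMER_RUN)
    for m in re.finditer(run_pattern, seq):
        warnings.append(f"poly{m.group()[0]} run of {len(m.group())} bp at {m.start()}")

    # Di- and tri-nucleotide repeats: a 2-3 base unit followed by itself
    # at least REPEAT_UNITS - 1 more times. Units such as "AA" are skipped
    # because they are homopolymers, which are handled above.
    repeat_pattern = r"([ACGT]{2,3}?)\1{%d,}" % (REPEAT_UNITS - 1)
    for m in re.finditer(repeat_pattern, seq):
        unit = m.group(1)
        if len(set(unit)) > 1:
            warnings.append(f"({unit})n repeat of {len(m.group())} bp at {m.start()}")

    return warnings
```

An empty list means none of these particular checks fired; it says nothing about purity or concentration, which still have to be verified at the bench.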
Common DNA sequencing problems:
The sequencing reaction didn't work as shown in the following example:
Pay attention to the signal-to-noise ratio (S/N) in the top left corner; a low S/N value indicates that the sequencing reaction didn't work. There are several possible causes: the template doesn't have the primer binding site; the primer doesn't bind to the template; the concentration of either the template or the primer is far below optimal; or there is excess contamination in the template or primer, such as excess EDTA interfering with the sequencing reaction.
The sequencing reaction gave low signal as shown in the following example:
The S/N value is slightly higher than for a failed reaction, but because of the high background noise, many base calls cannot be made with certainty and the read length is short. The problem may be due to low DNA or primer concentration, or to excess contamination in the template or primer.
The sequencing signal gradually dropped off as shown in the example below:
At the beginning the signal is strong, but it gradually tails off, giving a short read length. This can be caused by an excess of template DNA or primer, by excess contamination in the template or primer, or by some kind of secondary structure in the template.
Excess dye terminator:
As shown below, the large red and blue peaks are excess dye terminators from the sequencing reagents that were not completely removed during purification of the sequencing product, which happens when the ethanol wash is poor. They also indicate that excess contamination in the template or primer compromised the efficiency of the sequencing reaction, leaving a large amount of unused sequencing reagent.
PolyT, polyA and di-nucleotide repeats. The example below shows a typical sequencing pattern after a polyT, polyA, or di-nucleotide repeat:
This affects PCR product sequencing more than plasmid sequencing. Keep in mind that after a repeat structure (polyA, polyT, di- or tri-nucleotide repeats) the sequencing signal will be unreadable. If you are sequencing a template and run into a long stretch of T's or A's, after which the sequence becomes unreadable, we have developed two primers that can solve the problem. Our T19V primer, TTT TTT TTT TTT TTT TTT TV (V = G, C, or A), can clean up the mess caused by a T run. (We actually have all three primers, T19A, T19C and T19G; you can request a specific one or the combination of all three.) Our A19B primer, AAA AAA AAA AAA AAA AAA AB (B = G, T, or C), can resolve the noisy data caused by an A run. (As you might have guessed, we have all three primers: A19C, A19G and A19T.)
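To make the naming concrete, the degenerate T19V and A19B primers described above can be expanded into their specific sequences. The following is a minimal Python sketch; the function names are made up for this example, and V and B follow the definitions given above (V = G, C, A; B = G, T, C).

```python
# Degenerate codes used by the two primers described above.
DEGENERATE = {"V": "ACG", "B": "CGT"}


def expand_primer(primer):
    """Expand a primer containing V or B into its specific sequences."""
    primer = primer.replace(" ", "").upper()
    variants = [""]
    for base in primer:
        choices = DEGENERATE.get(base, base)
        variants = [v + c for v in variants for c in choices]
    return variants


def primer_name(seq):
    """Name a run-plus-anchor primer, e.g. 19 T's followed by A -> 'T19A'."""
    return f"{seq[0]}{len(seq) - 1}{seq[-1]}"


for degenerate in ("TTT TTT TTT TTT TTT TTT TV", "AAA AAA AAA AAA AAA AAA AB"):
    for seq in expand_primer(degenerate):
        print(primer_name(seq), seq)
```

Each degenerate primer expands to three 20-base sequences: 19 identical bases followed by a single anchoring base that differs from the run.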
Templates that contain secondary structures, such as GC-rich regions, inverted repeats, direct repeats and hairpins, can cause the sequencing reaction to fail at the structure and be unable to read through it, as shown in the example below. If the secondary structure is not very strong, there are two remedies that sometimes work. The first is the dGTP kit, which causes GC compression but can sometimes read through certain secondary structures such as weak hairpins. The second is SequenceRx Enhancer Solution A, a buffer that helps to sequence through some secondary structures; there is no guarantee, and if it does read through, it also causes compression in the region you are interested in. If you want to try these, please use the traditional format and write down your request.
Another example of secondary structure causing a sudden drop in signal. The plasmid was sequenced on both strands. Panels 1 and 2 were sequenced with the reverse primer (panel 1 with the dGTP kit, a special kit that is supposed to read through difficult regions; panel 2 with the normal kit). Panels 3 and 4 were sequenced with the forward primer (panel 3 with the normal kit, panel 4 with the dGTP kit). Because of the secondary structure (in this case, an inverted repeat region with the sequence TTTGCGGCCGAATTCGGCCGCAAA), the sequencing signal dropped suddenly at the site of the secondary structure with both the forward and reverse primers (panels 2 and 3). When the dGTP kit was used, the reaction sequenced through, but the repeat region is unreadable (in this case, since the repeat region is short, it is barely readable by comparing all 4 reads).
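The inverted repeat quoted above reads the same on both strands, which is why it can fold back on itself and stall the reaction from either direction. A minimal Python sketch of that check follows; the function names are illustrative, and it tests only for a perfect inverted repeat with no attempt to predict how stable the structure is.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq):
    """True if the sequence equals its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


site = "TTTGCGGCCGAATTCGGCCGCAAA"
print(reverse_complement(site))
print(is_inverted_repeat(site))  # prints True
```

Because the site is identical to its reverse complement, the forward and reverse reads run into the same self-complementary stretch, consistent with the signal drop seen with both primers.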
In this case, even though the S/N is good, the sequence is unreadable, as shown in the example below. There are a few specific causes: the template DNA is a mixture; the template DNA has two or more binding sites for the primer; or the primer is a mixture.
Loss of resolution:
When the samples are very pure and a large amount is used in the sequencing reaction, they can generate so much sequencing signal that the system is overloaded and the detection limit of the instrument is exceeded. This causes unreadable base calls, as shown in the two examples below; notice that in both cases the S/N is well above 1500.