Troubleshooting Sanger sequencing data

USER BULLETIN
Publication Number MAN0014435
Revision A.0
• Introduction
• Before you begin
• Review your data
• Recommended raw signal ranges and signal-to-noise ratio for minor variant detection
• Common sources of Sanger Sequencing noise
• Signal saturation
• Low signal intensity
• Dye blobs
• G/C compression
• G/C dye terminator degradation
• Mixed sequence content overview
• Sequence analysis tools
• Related documentation
Introduction
Effective minor variant detection with Minor Variant Finder Software requires
high-quality data with minimal noise.
Standard Sanger sequencing data is generally of high quality. For minor variant
detection, however, the finer details of the data traces, specifically the sources of
baseline noise, become more important.
This document provides guidance for reviewing your data and troubleshooting
tips for improving sequencing data quality, if needed.
Instructions and examples in this guide were generated with Applied Biosystems™
Sequence Scanner Software, available as a free download following online
registration. See “Before you begin”.
For Research Use Only. Not for use in diagnostic procedures.
Before you begin
Download software that allows you to visualize and evaluate your
electropherograms. We recommend one of the following:
• Sequence Scanner Software.
A free download of Sequence Scanner Software can be obtained at:
http://resource.thermofisher.com/pages/WE28396/, following registration at Thermo
Fisher Scientific.
• Minor Variant Finder Software.
Minor Variant Finder Software is available for purchase at:
www.thermofisher.com/mvf.
Review your data
To review the quality of your data or to troubleshoot sequencing issues:
1. Import the .ab1 files into Sequence Scanner Software.
2. Open files in Trace Manager by selecting View > Thumbnails.
3. Set Y Scaling to Individual to look for major quality issues (see Figure 1).
Alternatively, set Y Scaling to Uniform to compare an entire data set for relative
signal strength.
4. Examine traces for potential quality issues and raw signal variability among
samples.
Figure 1 Thumbnail view of imported .ab1 files
1 Major signal saturation (most peaks are above the recommended signal range)
2 Signal is within the recommended signal range and shows a normal reptation peak.
Reptation peaks are large peaks observed at the end of long runs.
3 Minor signal saturation (a few peaks are above the recommended signal range)
4 Within the recommended signal range
5 Low signal intensity
6 Dye blob
7 View Details icon
5. Click the View Details icon at the top right corner, then confirm that the
instrument, polymer, and dye set match the experimental set-up.
Note: If you have not used the correct mobility file, you can re-basecall the data
with the correct mobility file using Sequencing Analysis Software. Sequencing
Analysis Software is available for purchase at:
http://www.thermofisher.com/order/catalog/product/4474950.
The mobility file name is formatted as KB_instrument_polymer_dye set.mob.
Figure 2 Mobility file format
6. Double-click a thumbnail or trace file name to open the electropherogram.
7. Review the data in the following tabs:
• Analyzed – Review for issues such as dye blobs, primer dimers, mixed
sequence content, peak compressions, and G/C degradation.
• Raw – Review for issues such as signal saturation or low signal, dye blobs,
and primer dimers.
• Analyzed + Raw – Review for pull-up peaks. Pull-up peaks can be caused by
off-scale peaks visible in the raw data and can cause spurious secondary
peaks in the analyzed data.
Note: This tab is useful in determining the impact pull-up peaks have on
the analyzed data.
• Annotation – Review parameters set during data collection or while setting
up the run. Information is provided on: trace identification, data analysis,
instrument and data collection software and the run configuration used.
Note: Use this tab if poor peak spacing or mobility issues are suspected to
identify parameters set during data collection or setting up the run.
• EPT – Review for abnormal fluctuations in power, temperature, or voltage.
Figure 3 Parts of the screen
1 Icon to return to thumbnail view
2 Analyzed tab
3 Raw tab
4 Analyzed + Raw tab
5 Annotation tab
6 EPT tab
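The mobility file naming convention given in step 5 (KB_instrument_polymer_dye set.mob) can be split into its fields and compared against the instrument, polymer, and dye set used in the experiment. The following is a minimal sketch and is not part of any Applied Biosystems software: the function name `parse_mobility_name` and the example file name are hypothetical, and the code assumes that the instrument and polymer fields contain no underscores.

```python
from typing import NamedTuple


class MobilityInfo(NamedTuple):
    instrument: str
    polymer: str
    dye_set: str


def parse_mobility_name(file_name: str) -> MobilityInfo:
    """Split a name of the form KB_instrument_polymer_dye set.mob into fields."""
    if not file_name.endswith(".mob"):
        raise ValueError(f"not a mobility file name: {file_name!r}")
    stem = file_name[: -len(".mob")]
    # maxsplit=3 keeps any underscores inside the dye set field intact.
    parts = stem.split("_", 3)
    if len(parts) != 4 or parts[0] != "KB":
        raise ValueError(f"unexpected mobility file name: {file_name!r}")
    return MobilityInfo(instrument=parts[1], polymer=parts[2], dye_set=parts[3])


# Hypothetical file name, used only to illustrate the format.
info = parse_mobility_name("KB_3500_POP7_BDTv3.mob")
print(info.instrument, info.polymer, info.dye_set)
```

Comparing the parsed fields with the run's actual set-up is one way to spot a mismatched mobility file before re-basecalling.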
Recommended raw signal ranges and signal-to-noise ratio for minor variant detection

Recommended raw signal range in relative fluorescent units (RFU):

Instrument     Lower limit   Upper limit                Upper limit
                             (annotation tab average)   (individual peaks)
3130/3130xl    1,000         3,000                      7,900
3500/3500xl    1,000         10,000                     30,000
3730/3730xl    1,000         10,000                     26,000

Recommended signal‑to‑noise ratio: >150
Recommended Trace score [1]: >40
Recommended peak under peak (PUP) value [1]: >20

[1] Trace score and PUP values are metrics used in Minor Variant Finder Software.
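The recommended ranges lend themselves to a simple lookup when screening many traces. The sketch below is illustrative only: the function name `check_signal`, the short instrument keys, and the wording of the messages are assumptions, and the average, peak, and signal-to-noise values are expected to be read by the user from the Annotation and Raw tabs rather than extracted by this code.

```python
# Limits copied from the recommended raw signal ranges (RFU):
# (lower limit, upper limit for the annotation tab average,
#  upper limit for individual peaks).
# The keys are shorthand; each also covers the corresponding xl model.
SIGNAL_LIMITS = {
    "3130": (1000, 3000, 7900),
    "3500": (1000, 10000, 30000),
    "3730": (1000, 10000, 26000),
}
MIN_SIGNAL_TO_NOISE = 150  # recommended ratio is >150


def check_signal(instrument, average_rfu, max_peak_rfu, signal_to_noise):
    """Return a list of findings; an empty list means all values are in range."""
    lower, upper_average, upper_peak = SIGNAL_LIMITS[instrument]
    problems = []
    if average_rfu < lower:
        problems.append("average signal below lower limit")
    if average_rfu > upper_average:
        problems.append("average signal above upper limit")
    if max_peak_rfu > upper_peak:
        problems.append("individual peak above upper limit")
    if signal_to_noise <= MIN_SIGNAL_TO_NOISE:
        problems.append("signal-to-noise ratio not above 150")
    return problems


# Hypothetical readings for two traces.
print(check_signal("3500", 3350, 9000, 1300))
print(check_signal("3130", 2000, 8200, 75))
```

Keeping the limits in one table makes it easy to adjust them if the recommendations for a given instrument change.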
Figure 4 is an example electropherogram within the recommended raw signal range
and signal-to-noise ratio. High-quality data, within the ideal raw signal range, allows
for minor variant detection above baseline (background) noise. A 5% variant in a trace
with an average raw signal of 3,350 RFU should produce a variant peak (at ~170 RFU)
that is distinguishable from the system noise in high-quality sequencing data.
Figure 4 Visual minor variant detection in an electropherogram of high‑quality Sanger
sequencing data
In this example, the average 4‑color raw signal across the entire electropherogram is
~3,350 RFU. The average raw signal-to-noise value is ~1,300. The arrow indicates the potential
variant in the raw data in the upper trace and the corresponding basecalled data in the lower
trace.
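The ~170 RFU figure quoted for a 5% variant follows from simple proportion. The sketch below reproduces that arithmetic; it assumes that the minor peak height scales linearly with the variant fraction, which is a simplification, and the function name `expected_minor_peak_rfu` is hypothetical.

```python
def expected_minor_peak_rfu(average_rfu, variant_fraction):
    """Estimate the raw height of a minor variant peak.

    Assumes the peak height scales linearly with the variant fraction,
    which is a simplifying assumption made for illustration.
    """
    if not 0 <= variant_fraction <= 1:
        raise ValueError("variant_fraction must be between 0 and 1")
    return average_rfu * variant_fraction


# 3,350 RFU is the Figure 4 average; 1,000 RFU is the recommended lower limit.
for average in (3350, 1000):
    peak = expected_minor_peak_rfu(average, 0.05)
    print(f"average {average} RFU -> 5% variant at about {peak:.1f} RFU")
```

For the 3,350 RFU trace this gives 167.5 RFU, which the bulletin rounds to ~170 RFU. At the 1,000 RFU lower limit the same 5% variant would sit at only about 50 RFU, which illustrates why weaker traces leave less margin above baseline noise.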
Common sources of Sanger Sequencing noise
For each common source of noise, the entries below describe how to recognize the source and list the relevant figure number(s).

Signal saturation
  How to recognize the source: The raw signal exceeds the recommended maximum RFU.
  Note: Excessive raw signal causes pull-up peaks in the analyzed data, which can incorrectly be identified as mixed bases.
  Figure number(s): Figure 1 (Thumbnail #1), Figure 5, Figure 6, and Figure 7

Low signal intensity
  How to recognize the source: The raw signal is below the recommended minimum RFU.
  Figure number(s): Figure 1 (Thumbnail #5), Figure 8, and Figure 9

Dye blobs
  How to recognize the source: Large broad peak normally seen at 85–90 bp or 125–130 bp.
  Figure number(s): Figure 1 (Thumbnail #6) and Figure 10

G/C compression
  How to recognize the source: Subtle G or C peak shoulders or unresolvable GC-rich regions.
  Figure number(s): Figure 11 and Figure 12

G/C degradation
  How to recognize the source: Decreased signal, increased baseline noise, and minor n+1 secondary peaks.
  Figure number(s): Figure 13

Mixed sequence content:

Primer impurity
  How to recognize the source: Secondary peaks throughout the trace.
  Figure number(s): Figure 14

Contamination by a second sequence
  How to recognize the source: Mixed sequence content throughout the length of the trace.
  Figure number(s): Figure 15

Off‑target amplification
  How to recognize the source: Mixed sequence content after the primer region.
  Figure number(s): Figure 16

Homopolymers
  How to recognize the source: Long stretch of one base type leads to mixed sequence or excessive baseline noise (typically observed in stretches >9 bases).
  Figure number(s): Figure 17

Heterozygous insertions or deletions
  How to recognize the source: Mixed sequence content starting at a specific point.
  Figure number(s): Figure 18

Primer dimers
  How to recognize the source: The presence of a secondary sequence in the 5′ end, roughly the length of the two PCR primers added together.
  Figure number(s): Figure 19
Signal saturation
High sample signal causes saturation of the CCD camera. Signal saturation causes
pull-up spectral peaks that cannot be corrected by spectral calibration. These pull-up
spectral peaks are mobility corrected in the Analyzed sequence and can be incorrectly
identified as minor peaks (see “Example of the impact of minor signal saturation on
minor variant detection”). Extreme signal saturation will appear as mixed
sequence content (see Figure 7).
Any degree of signal saturation can impact minor variant detection.
Note: The 3500 Data collection software flags .ab1 files with off-scale peaks. You must
manually check for off-scale peaks from data generated with the 3130 or 3730 Genetic
Analyzer platforms.
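For data from the 3130 or 3730 platforms, the manual check amounts to looking for raw data points above the recommended upper limit for individual peaks. The sketch below shows one way to do this on a single raw dye channel. It assumes that the raw intensities have already been exported into a Python list (loading the .ab1 file is not shown), and the function name `find_offscale_runs` and the example values are hypothetical.

```python
def find_offscale_runs(raw_channel, limit_rfu, min_points=1):
    """Return (start, end) index pairs where the signal stays above limit_rfu.

    Runs of several consecutive points are typical of the flattened peak
    tops seen when a peak saturates the camera.
    """
    runs = []
    start = None
    for i, value in enumerate(raw_channel):
        if value > limit_rfu:
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_points:
                runs.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the trace.
    if start is not None and len(raw_channel) - start >= min_points:
        runs.append((start, len(raw_channel) - 1))
    return runs


# Hypothetical raw values for one dye channel on a 3130 instrument,
# where the recommended upper limit for individual peaks is 7,900 RFU.
trace = [120, 900, 5200, 8100, 8190, 8190, 8150, 4000, 300, 150, 7950, 200]
print(find_offscale_runs(trace, limit_rfu=7900))
```

Any run reported here marks a region where pull-up peaks may appear in the other dye channels, so the corresponding positions in the Analyzed view deserve a closer look.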
Examples of signal saturation
The following figure shows examples of signal saturation. The red line indicates the
maximum raw signal recommended.
Figure 5 Signal saturation – Raw data view
1 Severe signal saturation on a 3130 Genetic Analyzer
2 Minor signal saturation on a 3130 Genetic Analyzer
3 Minor signal saturation on a 3500 Genetic Analyzer
Example of the impact of minor signal saturation on minor variant detection
Figure 6 is an example of signal saturation that causes spectral pull-up peaks that can
be incorrectly identified as a minor variant.
Figure 6 Pull‑up peaks in raw vs. analyzed data from a sample with minor signal
saturation on the 3130 instrument
The black arrows in the top panel highlight two G off‑scale peaks with flattened tops that have
saturated the camera and caused the pull‑up peaks. The green arrows point to the pull‑up
peaks in the raw data (top) and in the analyzed data (bottom) that could be mistaken for true
minor variants depending on their location in the electropherogram and the basecalled position
after mobility correction.
Example of extreme signal saturation
Figure 7 shows what appears as mixed sequence caused by extreme signal saturation.
Figure 7 Extreme signal saturation
Mixed sequence in the Analyzed view due to extreme signal saturation. A quick review of the raw
data can help diagnose a scenario such as this; the raw data for this sample is shown in the
top panel in Figure 5.
Signal saturation: possible causes and recommended actions

Possible cause: Too much template was used in the sequencing reaction, resulting in too much sequencing product.
Recommended action:
• If the sample has been on the instrument <24 hours, reduce the injection time in the run module, then re‑inject the sample.
• If the sample is purified with the BigDye XTerminator™ Purification Kit and has been on the instrument <24 hours, carefully remove 10 µL of sample off the BigDye XTerminator™ beads in the plate, then add 10 µL of 0.1 mM EDTA to dilute the sample. Re‑inject the sample using a standard run module (non‑BigDye XTerminator™ module).
• Repeat the sequencing reaction using less template.

Possible cause: Water was used as the injection solution.
Recommended action:
• Use Hi‑Di™ Formamide or a 0.1 mM EDTA injection solution for samples.
Note: Using water as an injection solution causes highly variable quantities of DNA to be injected, because there is no competition for the charged DNA/salts.
Low signal intensity
Low signal intensity can be caused by many factors including thermal cycler
malfunction (in the case of an entire plate failure) and insufficient sequencing
template quantity/quality. Raw signal <500 RFU makes detection of minor variants
more difficult.
Examples of low signal intensity
The examples below show moderately and severely low signal traces.
Figure 8 Moderately low signal intensity in a sample with an average raw signal‑to‑noise
ratio of ~75
Figure 9 Severely low signal intensity due to hardware failure or a failed reaction
More severe signal issues are often related to poor injection, failed reaction, or a blocked or
broken capillary.
Low signal: possible causes and recommended actions
Note: When sequencing signal is weak, increasing the injection time (re-injecting
sample) or increasing primer and/or template in the cycle sequencing reactions can
improve signal strength if DNA quality, PCR purification, and sequencing reaction
purification steps have been performed properly.
Possible cause: Poor template quality.
Recommended action:
• Check DNA quality. If necessary, clean up the templates.
• Check the sequencing reaction for the DNA template control in order to check sequencing reaction quality.

Possible cause: Insufficient primer or template in the cycle sequencing reaction.
Recommended action:
• Check DNA quantity. Use the amounts recommended per PCR and sequencing kits.
• Check the DNA template control to determine sequencing reaction quality.

Possible cause: The amount of BigDye™ Reaction Mix in the reactions was insufficient; the sequencing chemistry was too dilute.
Recommended action:
• Follow recommended procedures to prepare sequencing reactions with BigDye™ Reaction Mixes.

Possible cause: Sample contains salts from insufficient purification of templates, PCR products, or sequencing reactions with ethanol precipitation. Salts in the sample interfere with proper electrokinetic injection.
Recommended action:
• Review DNA quality, PCR purification, and sequencing reaction purification steps.

Possible cause: Sample volume is too low.
Recommended action:
• Resuspend samples using sufficient volumes (10 µL).

Possible cause: Instrument run buffer is old.
Recommended action:
• Replace the buffer according to the procedures in your instrument user guide.

Possible cause: Injection failed.
Recommended action:
• Verify correct run module was used.
• Verify correct volume in well.
• Verify capillaries are not broken.
• Verify that data quality for capillaries is consistent and not trending downward. A decrease in data quality from a specific capillary can indicate blockage.

Possible cause: Sample evaporated because water was used as the injection solution.
Recommended action:
• Use Hi‑Di™ Formamide to resuspend sample.
• Use new plate septa or check plate septa for wear.
• Add more resuspension solution to the samples before loading them.
• Inject soon after plate is placed on the instrument.

Possible cause: Autosampler alignment is off and the tips did not enter the sample.
Recommended action:
• Verify the correct run module was used.
• If you are using samples purified with the BigDye XTerminator™ Purification Kit and your autosampler was recently calibrated, run the BDX Update utility. Select: Start > All Programs > AppliedBiosystems > BDX Updater. (The utility is installed with the BigDye XTerminator™ run modules.)
• Contact Technical Support to arrange a service engineer visit if the alignment problem is not solved by the actions above.

Possible cause: Thermal cycler failure.
Recommended action:
• Contact Technical Support.
Dye blobs
Dye blobs are caused by unincorporated dye terminators remaining in solution after
purification of the cycle sequencing reactions. Unincorporated dye terminators from
the BigDye™ Terminator v3.1 Cycle Sequencing Kit and BigDye™ Direct Cycle
Sequencing Kit are most commonly seen to co-migrate with the ~ 85–90 bp labeled
fragments. In more severe instances, these blobs can also be detected at ~ 60–65 bp
and within 125–140 bp regions. Dye blobs are typically seen as broad “C” or “T”
peaks, but can also show up as “G” blobs. Dye blobs are more common when first
testing new sequencing purification methods.
Example of dye blobs
Figure 10 shows severe dye blobs in the 60–65 bp, 85–100 bp, and 125–140 bp regions.
Although the sequence quality appears high, the blobs obscure nearly 40 bp of the 100
bases displayed. This would make the sequence unsuitable for variant detection.
Figure 10 Severe dye blobs in the 60–65 bp and 125–140 bp regions
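The "nearly 40 bp" figure can be checked by adding up the three regions named for Figure 10, and the same regions can be used to flag variant positions that fall inside a dye blob. This is a small sketch under the assumption that the region boundaries are inclusive; the names `DYE_BLOB_WINDOWS`, `obscured_bases`, and `in_dye_blob_window` are hypothetical.

```python
# Dye blob regions reported for Figure 10, in bp (treated as inclusive).
DYE_BLOB_WINDOWS = [(60, 65), (85, 100), (125, 140)]


def obscured_bases(windows):
    """Count the base positions covered by non-overlapping inclusive windows."""
    return sum(end - start + 1 for start, end in windows)


def in_dye_blob_window(position, windows=DYE_BLOB_WINDOWS):
    """Return True if a base position falls inside any dye blob window."""
    return any(start <= position <= end for start, end in windows)


print(obscured_bases(DYE_BLOB_WINDOWS))
print(in_dye_blob_window(88), in_dye_blob_window(110))
```

With inclusive boundaries the three regions cover 6 + 16 + 16 = 38 positions, consistent with the "nearly 40 bp" stated for Figure 10.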
Dye blobs: possible causes and recommended actions

Possible cause: Sample bypassed the purification material when using spin columns/spin plates for sequencing clean‑up.
Recommended action:
• Ensure transfer of the sample to the center of the purification material. Sample dispensed along the walls of the clean‑up column may bypass the purification material. Use a single channel pipette and/or position the tip directly above the spin column/plate while dispensing at low speed.

Possible cause: Ethanol concentration is too high during ethanol precipitation. This leads to unincorporated dye terminators and salts precipitating with the sequencing product.
Recommended action:
• Repeat procedure with correct ethanol concentration.

Possible cause: Incorrect ratio of BigDye XTerminator™ reagents.
Recommended action:
• Vortex the BigDye XTerminator™ Solution bulk container at maximum speed for at least 10 seconds before dispensing. If you pre‑mix the SAM/BDX solution, ensure that the solution is well mixed before each sample well dispense step to maintain the appropriate ratio of reagents.

Possible cause: Insufficient mixing during the vortexing step when using the BigDye XTerminator™ Purification Kit.
Recommended action:
• Verify that the plate is firmly attached to the vortexer.
• Follow the protocol for vortexing.
G/C compression
G/C compression is often a result of too much sequencing template or incomplete
denaturation of GC-rich regions of sequencing template, leading to subtle G or C peak
shoulders or un-resolvable regions of GC bases.
Example of G/C compression
The following examples of G/C compression are often a result of too much sequencing
template, a potential by-product of too much input DNA, and can show up near the
260–270 bp region of the electropherogram when using BigDye™ Direct.
Figure 11 G/C compression due to an excess of sequencing template
A BigDye™ Direct sequencing sample with the raw data trace on top and the basecalled/analyzed
data trace on the bottom. Subtle G/C peak shoulders are observed with poor resolution of a
triplet of G peaks.
Figure 12 G/C compression due to GC‑rich templates
Compressions encountered using the dGTP BigDye™ Terminator Kit, an alternative non-standard
kit for GC‑rich templates.
G/C compression: possible causes and recommended actions

Possible cause: Too much input gDNA during PCR, leading to excessive sequencing template when using the BigDye™ Direct Cycle Sequencing Kit.
Recommended action:
• If using the BigDye™ Direct Cycle Sequencing Kit, reduce the amount of input DNA; ensure ≤20 ng is used.

Possible cause: GC‑rich regions, especially when sequencing with dGTP sequencing chemistry. Possibly caused by incomplete denaturation of the synthesized DNA.
Recommended action:
• No corrective action is known at this time. Using a different sequencing primer located closer to the GC-rich region may help to resolve G/C compression.
• Reinject the sample. In most cases, compressions are eliminated after the first injection.
G/C dye terminator degradation
Sequencing reactions covered by septa or MicroAmp™ Clear Adhesive Film and
resuspended in Hi-Di™ Formamide are generally stable for up to 12–24 hours at room
temperature when protected from light, heat, acidic conditions, bleach, and air.
However, prolonged exposure to environmental conditions, such as prolonged
storage at room temperature, leads to degradation of the dye terminators, especially
the G and C dyes. G and C dye degradation can lead to decreased signal, increased
baseline noise, and minor n+1 secondary peaks that can impact the ability to detect
minor sequencing variants.
Example of G/C dye terminator degradation
Figure 13 shows examples of G and C dye terminator degradation.
Figure 13 G and C terminator degradation
The top panel shows G dye terminator degradation, while the bottom panel shows C dye
terminator degradation.
G/C degradation: possible causes and recommended actions

Possible cause: The dye labels attached to the dd‑terminators are degraded. Initial degradation results in shoulders on C and/or G peaks that can be mistaken as secondary peaks. With further degradation, the C and/or G peaks appear small or rough or disappear completely.
Recommended action:
• Protect the fluorescently‑labeled DNA from light, heat, acidic conditions, and oxygen.
• If no C peaks are visible, repeat the sequencing reactions with fresh reagents.

Possible cause: Sequencing reactions were exposed to light, heat, acidic conditions, bleach, and/or oxygen before they were loaded onto the instrument.
Recommended action:
• Use tube septa or a heat seal to prevent exposure to air and evaporation of samples.
• Do not leave samples on the instrument for more than 24 hours.

Possible cause: Water was used as the injection solution.
Note: Resuspending samples in water leads to breakdown of C and/or G‑labeled fragments.
Recommended action:
• Resuspend the samples in Hi‑Di™ Formamide or 0.1 mM EDTA.

Possible cause: The Hi‑Di™ Formamide is degraded.
Recommended action:
• Resuspend the samples using a newer lot of Hi‑Di™ Formamide.
• Aliquot large lots of Hi‑Di™ Formamide into smaller tubes to minimize freeze/thaw cycles and store at –20°C.
Mixed sequence content overview
Contaminating mixed sequence content, in which a secondary sequence contaminates
the primary sequence, has many causes, including:
• “Primer impurity”
• “Contamination”
• “Off-target amplification”
• “Homopolymers”
• “Heterozygous insertions or deletions”
• “Primer dimers”
Note: Mixed sequence can also be caused by signal saturation and low peak intensity.
See “Signal saturation” and “Low signal intensity” for troubleshooting information.
Primer impurity
Figure 14 shows mixed sequence caused by the intentional introduction of a 10% (n-1)
forward primer impurity.
Figure 14 Example of mixed sequences due to a 10% primer impurity
Two sequencing primers were pooled 1:9. Standard sequencing primers (90%) were mixed with
primers that had the 3′ base removed (10%). The secondary peaks exhibit a minor n-1 peak at
roughly the same percentage as the primer impurity introduced.
Primer impurity: possible causes and recommended action

Possible cause:
• Insufficient PCR primer purification.
• Insufficient sequencing primer purification.
Recommended action:
• Repeat using HPLC‑purified primers.
Contamination
Secondary sequence contamination results in mixed sequence content as shown in
Figure 15.
Figure 15 Secondary sequence contamination caused by well‑to‑well contamination of
one sample into another
Contamination: possible causes and recommended actions

Possible cause: Carryover from contaminated septa.
Recommended action:
• Replace septa, then change the buffer/water/waste.

Possible cause: Cross‑contamination of the specimen, amplicons, or primers.
Recommended action:
• Review DNA quality.
• Replace pipette tips before aspirating primers.
• Carefully remove adhesive seal post‑PCR to avoid the amplicon contaminating adjacent wells.
• Replace pipette tips each time amplicons are dispensed for sequencing reactions.

Possible cause: Incomplete PCR product clean-up.
Recommended action:
• Remove PCR primers completely before using PCR products as sequencing templates.

Possible cause: Secondary amplification product in the PCR product used as a sequencing template.
Recommended action:
• Use gel purification to isolate the desired product or design new PCR primers to obtain a single product. We recommend the Primer Designer™ Tool at http://www.thermofisher.com/primerdesigner.
Off-target amplification
The entire PCR-specific primer region can be clearly seen when using the BigDye™
Direct Cycle Sequencing Kit with the BigDye™ Direct forward M13 primer. Off-target
amplification can clearly be seen after the gene-specific priming region.
Figure 16 Example of off‑target or secondary amplification product
Off-target amplification after the primer.
Off-target amplification: possible causes and recommended actions

Possible cause: PCR product with secondary amplification (>1 sequencing product signal) used as a sequencing template.
Recommended action:
• Use gel purification to isolate the desired product or design new PCR primers to obtain a single product. We recommend the Primer Designer™ Tool at http://www.thermofisher.com/primerdesigner.
• Increase the PCR specificity (increase annealing temperature, lower MgCl2, or change primer designs).

Possible cause: More than one priming site (either upstream or downstream) is present on the sequencing template.
Recommended action:
• Re‑examine the sequence for primer site homology.
Homopolymers
Homopolymer stretches longer than 8–9 consecutive bases can lead to n+/-1 peaks
(one too many or few of the base type) and cause mixed sequence content and
increased baseline noise after the homopolymer.
Excessive 2–3 base repeats may also result in similar noise patterns.
Figure 17 Example of a mixed sequence after a homopolymer
A string of “A” bases leads to mixed sequence content.
Homopolymers: possible cause and recommended actions

Possible cause: A partially extended primer and template dissociate and then re‑anneal improperly before extension continues.
Recommended action:
• Use anchored sequencing primers during PCR amplification.
• Design PCR primers to exclude homopolymer regions greater than 8–9 bp long if possible.
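When designing amplicons, a reference sequence can be screened in advance for homopolymer stretches at or above the length where problems are typically observed. The following sketch uses a regular expression with a back-reference; the function name `find_homopolymers`, the default threshold of 9 bases (chosen from the 8–9 base guidance), and the example sequence are assumptions made for illustration.

```python
import re


def find_homopolymers(sequence, min_length=9):
    """Return (start, base, length) for each run of one base >= min_length.

    start is a 0-based index into the sequence.
    """
    # ([ACGT]) captures one base; \1{n,} requires at least n further copies.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_length - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(sequence.upper())
    ]


# Hypothetical reference sequence containing a run of ten A bases.
reference = "ACGT" + "A" * 10 + "CGTTTTTG"
print(find_homopolymers(reference))
```

Runs shorter than the threshold, such as the five T bases in the example, are not reported; lowering `min_length` makes the screen more conservative.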
Heterozygous insertions or deletions
A heterozygous insertion or deletion is a scenario in which one allele contains a
specific insertion or a deletion that the other allele lacks, resulting in mixed sequence
content following the point of the particular insertion or deletion. This is a true
biological event and can usually be confirmed through alignment of both forward and
reverse trace files against a reference sequence.
Figure 18 Example of mixed sequence content following a heterozygous insertion or
deletion. Mixed sequence content is seen in both forward and reverse traces.
Heterozygous insertions or deletions: possible cause and recommended actions

Possible cause: Heterozygous insertion or deletion mutation.
Recommended action:
• Assemble both forward and reverse sequence data to determine sequence of insertion or deletion.
• Re‑design PCR target to exclude the region where the insertion or deletion occurs if additional minor variants are suspected to be present.
Primer dimers
Primer dimers are an unwanted by-product of poorly designed PCR primers. The
forward and reverse PCR primers hybridize to each other because of complementary
3′ bases that enable elongation to occur for the length of each primer.
Figure 19 shows the presence of two secondary sequences roughly the length of the
two PCR primers added together. This is typical of a primer dimer event.
Figure 19 Example of mixed sequence due to primer dimers
The top panel shows the Raw data and the bottom panel shows the Analyzed data. The primer
dimer in the Analyzed data is a relatively clean example. However, some amount of excess
mixed sequence noise can sometimes be seen even beyond the PCR‑specific primer region.
Primer dimers: possible cause and recommended actions

Possible cause: Primer dimers are formed when PCR primers anneal and amplify to create short PCR products incorporating only the two primer sequences. They can then act as templates, albeit very short, in the subsequent sequencing reaction.
Recommended action:
• Redesign the PCR primers to eliminate the sequences that allow primer‑dimer formation.
• Use a “hot start” PCR enzyme to inhibit primer‑dimer formation.
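When redesigning primers, a quick screen for complementary 3′ ends can flag pairs that are prone to primer-dimer formation. The sketch below checks whether the last k bases of the forward primer can pair with the last k bases of the reverse primer. It is a simplified illustration, not a substitute for a primer design tool: the function names, the overlap thresholds, and the primer sequences are all hypothetical, and partial or internal complementarity is not considered.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def three_prime_overlap(forward, reverse, min_overlap=3, max_overlap=8):
    """Return the longest k for which the two 3' ends are fully complementary.

    Both primers are written 5'->3'. The 3'-terminal k bases pair in
    antiparallel orientation when forward[-k:] equals the reverse
    complement of reverse[-k:]. Returns 0 if no such k is found.
    """
    forward, reverse = forward.upper(), reverse.upper()
    longest = 0
    upper = min(max_overlap, len(forward), len(reverse))
    for k in range(min_overlap, upper + 1):
        if forward[-k:] == reverse_complement(reverse[-k:]):
            longest = k
    return longest


# Hypothetical primers that share a self-complementary GAATTC 3' end.
forward_primer = "GGATCCTTAGCAGCGAATTC"
reverse_primer = "CCTGAAGTTCAGGAATTC"
print(three_prime_overlap(forward_primer, reverse_primer))
print(len(forward_primer) + len(reverse_primer))
```

The second value printed is the combined primer length, which is the approximate size of the secondary sequence described for Figure 19.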
Sequence analysis tools
Minor Variant Finder Software
The Minor Variant Finder Software is a simple, easy-to-use desktop software designed
for the accurate detection and reporting of minor variants (<25% of a major peak) or
50:50 mixtures, as found at germline heterozygous positions, by Sanger sequencing.
By comparing test specimen and control traces, the software generates a
noise-minimized electropherogram for confirmation of minor variants in forward and
reverse sequences. The software can detect variants (SNPs or SNVs) with a Limit of
Detection (LOD) of 5% with high-quality data in amplicons of lengths 150 to 500 bp.
LOD is defined as the lowest level at which sensitivity is ≥95% and specificity is ≥99%
within the overlapping region of forward and reverse test and control .ab1 files.
Note: LOD was determined using 5% mixtures that were experimentally created with
physical mixtures of molecules, and is not based on peak height ratios in
electropherograms.
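The LOD criteria combine two standard metrics: sensitivity (true positives divided by all real variants) and specificity (true negatives divided by all non-variant positions). The sketch below shows how the two thresholds would be evaluated from a set of counts. The function names and the counts are hypothetical and are not taken from the validation data behind the stated LOD.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of real variants that were detected."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-variant positions correctly reported as non-variant."""
    return true_negatives / (true_negatives + false_positives)


def meets_lod_criteria(tp, fn, tn, fp, min_sensitivity=0.95, min_specificity=0.99):
    """Check the >=95% sensitivity and >=99% specificity thresholds."""
    return (
        sensitivity(tp, fn) >= min_sensitivity
        and specificity(tn, fp) >= min_specificity
    )


# Hypothetical counts for variants spiked in at a single mixture level.
print(sensitivity(97, 3), specificity(9950, 50))
print(meets_lod_criteria(tp=97, fn=3, tn=9950, fp=50))
print(meets_lod_criteria(tp=90, fn=10, tn=9950, fp=50))
```

Under this definition, the LOD is the lowest mixture level for which both thresholds are still met.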
The software also includes an optional NGS confirmation function.
The Minor Variant Finder Software runs in a web browser window, but does not
require connection to the internet in order to run. Data is secure on your desktop
computer.
Sequence Scanner Software
Sequence Scanner 2 is free software for viewing electropherograms. It provides an
easy way to perform a high-level sequencing data quality check or general data
review that includes summary tables and electropherograms as well as a general .ab1
file raw/analyzed data view.
To obtain the software, go to: http://resource.thermofisher.com/pages/WE28396/.
Next-generation confirmation (NGC) module
The Applied Biosystems™ Analysis Module Next-Generation Confirmation (NGC) is
CE Sanger sequencing software hosted on the Thermo Fisher Cloud environment. The
software allows you to examine variants from a CE electropherogram to confirm the
variants detected by Next Generation Sequencing (NGS) platforms. The software
analyzes CE sequencer-generated .ab1 files and performs SNP detection and analysis,
SNP discovery and validation, and sequence confirmation, all on the cloud. NGC
software can automatically retrieve reference sequences from genomic databases,
report variants in genomic coordinates, and report genomic annotations for SNPs. The
software analyzes NGS variant .vcf files and analyzes NGS variants and Sanger
variants in the same alignment view. The software can also generate a Venn diagram,
allowing you to visually compare and confirm variants generated from NGS. In
addition, the NGC software generates and exports variants in standard variant call
format (VCF).
Variant Reporter™ Software
This software performs comparative sequencing, also known as direct sequencing,
medical sequencing, PCR sequencing, and resequencing with DNA sequencing files.
The software is designed for reference-based and non-reference-based analysis such
as mutation detection and analysis, SNP discovery and validation, and sequence
confirmation. The robust algorithms will call SNPs, mutations, insertions, deletions,
and heterozygous insertions or deletions for data generated using the Applied
Biosystems™ genetic analyzers.
To obtain the software, go to:
https://www.thermofisher.com/order/catalog/product/4475006.
Related documentation
Document: DNA Sequencing by Capillary Electrophoresis Chemistry Guide
Publication number: 4305080
Description: This chemistry guide is designed to familiarize you with Applied Biosystems™ genetic analyzers for automated DNA sequencing by capillary electrophoresis, to provide useful tips for ensuring that you obtain high‑quality data, and to help troubleshoot common problems.

Document: Generating high‑quality data using the BigDye™ Direct Cycle Sequencing Kit
Publication number: MAN0014436
Description: Effective minor variant detection with Minor Variant Finder Software requires high‑quality sequencing data with minimal noise. This document provides a demonstrated protocol for generating high‑quality data for use in minor variant detection using:
• BigDye™ Direct Cycle Sequencing Kit
• Applied Biosystems™ genetic analyzers with POP-7™ polymer

Document: Generating high‑quality data using the BigDye™ Terminator v3.1 Cycle Sequencing Kit
Publication number: MAN0014628
Description: Effective minor variant detection with Minor Variant Finder Software requires high‑quality sequencing data with minimal noise. This document provides a protocol for generating high‑quality data for use in minor variant detection using:
• BigDye™ Terminator v3.1 Cycle Sequencing Kit
• Applied Biosystems™ genetic analyzers with POP-6™ or POP-7™ polymer
The information in this guide is subject to change without notice.
DISCLAIMER
TO THE EXTENT ALLOWED BY LAW, LIFE TECHNOLOGIES AND/OR ITS AFFILIATE(S) WILL NOT BE LIABLE FOR SPECIAL, INCIDENTAL, INDIRECT, PUNITIVE,
MULTIPLE, OR CONSEQUENTIAL DAMAGES IN CONNECTION WITH OR ARISING FROM THIS DOCUMENT, INCLUDING YOUR USE OF IT.
Important Licensing Information
These products may be covered by one or more Limited Use Label Licenses. By use of these products, you accept the terms and conditions of all applicable Limited
Use Label Licenses.
Corporate entity
Life Technologies | Carlsbad, CA 92008 USA | Toll Free in USA 1.800.955.6288
TRADEMARKS
All trademarks are the property of Thermo Fisher Scientific and its subsidiaries unless otherwise specified.
©2016 Thermo Fisher Scientific Inc. All rights reserved.
For support visit thermofisher.com/support or email [email protected]
thermofisher.com
15 January 2016