Local Run Manager TruSight Tumor 15 Analysis Module Workflow Guide (1000000006976 v00)

For Research Use Only. Not for use in diagnostic procedures.
Overview
Set Parameters
Analysis Methods
View Analysis Results
Analysis Report
Analysis Output Files
Technical Assistance
ILLUMINA PROPRIETARY
Document # 1000000006976 v00
January 2016
This document and its contents are proprietary to Illumina, Inc. and its affiliates ("Illumina"), and are intended solely for the
contractual use of its customer in connection with the use of the product(s) described herein and for no other purpose. This
document and its contents shall not be used or distributed for any other purpose and/or otherwise communicated, disclosed,
or reproduced in any way whatsoever without the prior written consent of Illumina. Illumina does not convey any license
under its patent, trademark, copyright, or common-law rights nor similar rights of any third parties by this document.
The instructions in this document must be strictly and explicitly followed by qualified and properly trained personnel in order
to ensure the proper and safe use of the product(s) described herein. All of the contents of this document must be fully read
and understood prior to using such product(s).
FAILURE TO COMPLETELY READ AND EXPLICITLY FOLLOW ALL OF THE INSTRUCTIONS CONTAINED HEREIN
MAY RESULT IN DAMAGE TO THE PRODUCT(S), INJURY TO PERSONS, INCLUDING TO USERS OR OTHERS, AND
DAMAGE TO OTHER PROPERTY.
ILLUMINA DOES NOT ASSUME ANY LIABILITY ARISING OUT OF THE IMPROPER USE OF THE PRODUCT(S)
DESCRIBED HEREIN (INCLUDING PARTS THEREOF OR SOFTWARE).
© 2016 Illumina, Inc. All rights reserved.
Illumina, 24sure, BaseSpace, BeadArray, BlueFish, BlueFuse, BlueGnome, cBot, CSPro, CytoChip, DesignStudio,
Epicentre, ForenSeq, Genetic Energy, GenomeStudio, GoldenGate, HiScan, HiSeq, HiSeq X, Infinium, iScan, iSelect,
MiSeq, MiSeqDx, MiSeq FGx, NeoPrep, NextBio, Nextera, NextSeq, Powered by Illumina, SureMDA, TruGenome,
TruSeq, TruSight, Understand Your Genome, UYG, VeraCode, verifi, VeriSeq, the pumpkin orange color, and the
streaming bases design are trademarks of Illumina, Inc. and/or its affiliate(s) in the U.S. and/or other countries. All other
names, logos, and other trademarks are the property of their respective owners.
Overview
The Local Run Manager TruSight Tumor 15 analysis module aligns reads against the
reference specified in the manifest files using the banded Smith-Waterman algorithm.
After alignment, the somatic variant caller performs variant analysis. The results report
somatic variants of a set of reference panel genes associated with cancer. This workflow
is designed specifically for TruSight Tumor 15 libraries.
Input Requirements
In addition to sequencing data files generated during the sequencing run, such as base
call files, the TruSight Tumor 15 analysis module requires the following files.
} Manifest files (2)—The TruSight Tumor 15 analysis module requires 2 assay-specific
manifest files: a manifest file for Mix A and a manifest file for Mix B. Manifest files
are included as part of the analysis module.
} Reference genome—The TruSight Tumor 15 analysis module requires the hg19
reference genome for coordinates and chromosome mapping, which is included
with the Local Run Manager software installation.
About This Guide
This guide provides instructions for setting up run parameters for sequencing and
analysis parameters for the TruSight Tumor 15 analysis module. For information about
the Local Run Manager dashboard and system settings, see the Local Run Manager
Software Guide (document # 1000000002702).
Set Parameters
1 Click Create Run, and select TruSight Tumor 15.
2 Enter a run name that identifies the run from sequencing through analysis.
  Use alphanumeric characters, spaces, underscores, or dashes.
3 [Optional] Enter a run description to help identify the run.
  Use alphanumeric characters.
Specify Samples for the Run
Specify samples for the run using the following options:
} Enter samples manually—Use the blank table on the Create Run screen.
} Import samples—Navigate to an external file in a comma-separated values (*.csv)
format. A template is available for download on the Create Run screen.
After you have populated the samples table, you can export the sample information to
an external file, and use the file as a reference when preparing libraries or import the file
for another run.
Enter Samples Manually
1 Click Add Row to adjust the samples table to an appropriate number of rows.
2 Enter a unique sample ID in the Sample ID field.
  Use alphanumeric characters, dashes, or underscores.
3 [Optional] Enter a sample description in the Sample Description field.
  Use alphanumeric characters, dashes, underscores, or spaces.
4 Enter index adapters for Mix A as follows.
  a Expand the Mix A Index 1 drop-down list and select an Index 1 adapter.
  b Expand the Mix A Index 2 drop-down list and select an Index 2 adapter.
    Right-click in a table cell to use the Fill Down command.
  c [Optional] Enter a mix description for Mix A.
5 Enter index adapters for Mix B as follows.
  a Expand the Mix B Index 1 drop-down list and select an Index 1 adapter.
  b Expand the Mix B Index 2 drop-down list and select an Index 2 adapter.
    Right-click in a table cell to use the Fill Down command.
  c [Optional] Enter a mix description for Mix B.
NOTE
The Report Definition field is populated automatically, by default.
6 [Optional] Click Export Samples to export sample information to an external file.
7 When finished, click Save Run.
Import Samples
1 Click Template. The template file contains the correct column headings for import.
2 Enter the sample information in each column for the samples in the run, and then
  save the file.
3 Click Import Samples and browse to the location of the sample information file.
4 When finished, click Save Run.
Analysis Methods
The TruSight Tumor 15 analysis module performs the following analysis steps and then
writes analysis output files to the Analysis folder.
} Demultiplexes index reads
} Generates FASTQ files
} Aligns to a reference
} Identifies variants
Demultiplexing
Demultiplexing compares each Index Read sequence to the index sequences specified for
the run. No quality values are considered in this step.
Index reads are identified using the following steps:
} Samples are numbered starting from 1 based on the order they are listed for the run.
} Sample number 0 is reserved for clusters that were not assigned to a sample.
} Clusters are assigned to a sample when the index sequence matches exactly or when
there is up to a single mismatch per Index Read.
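The matching rule above can be sketched in a few lines of Python. This is a minimal illustration, not the module's implementation: the function names, the dictionary layout, and the decision to leave a cluster unassigned when more than one sample matches are assumptions made for the example.

```python
from typing import Dict, Tuple


def hamming(a: str, b: str) -> int:
    """Count mismatching positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index1: str, index2: str,
                  samples: Dict[int, Tuple[str, str]],
                  max_mismatches: int = 1) -> int:
    """Return the sample number for a cluster, or 0 if it cannot be assigned.

    `samples` maps a sample number (starting from 1) to its expected
    (Index 1, Index 2) sequences. Quality values are not considered.
    """
    matches = [
        number
        for number, (expected1, expected2) in samples.items()
        if hamming(index1, expected1) <= max_mismatches
        and hamming(index2, expected2) <= max_mismatches
    ]
    # Assumption: a cluster that matches more than one sample stays unassigned.
    return matches[0] if len(matches) == 1 else 0
```

Allowing one mismatch per Index Read only works when the index sequences for the run differ from each other at enough positions, which is why the sketch treats an ambiguous match as sample 0.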
FASTQ File Generation
After demultiplexing, the software generates intermediate analysis files in the FASTQ
format, which is a text format used to represent sequences. FASTQ files contain reads for
each sample and the associated quality scores. Any controls used for the run and
clusters that did not pass filter are excluded.
Each FASTQ file contains reads for only 1 sample, and the name of that sample is
included in the FASTQ file name. In the TruSight Tumor 15 workflow, 2 FASTQ files are
generated per sample, 1 from the Mix A library, and 1 from the Mix B library. FASTQ
files are the primary input for alignment.
Read Stitching
The TruSight Tumor 15 analysis module performs read stitching by default.
When enabled, paired-end reads that overlap are stitched to form a single read in the
FASTQ file. At each overlap position, the consensus stitched read has the base call and
quality score of the read with higher Q-score.
For each paired read, a minimum of 10 bases must overlap between Read 1 and Read 2
to be a candidate for read stitching. The minimum threshold of 10 bases minimizes the
number of reads that are stitched incorrectly due to a chance match. Candidates for read
stitching are scored as follows:
} For each possible overlap of 10 base pairs or more, a mismatch score is calculated.
Perfectly matched overlaps have a MismatchRate of 0, resulting in a score of 1.
} If the best overlap has a score of ≥ 0.9 and the score is ≥ 0.1 higher than any other
candidate, then the reads are stitched together at this overlap.
} Paired-end reads that cannot be stitched are converted to 2 single reads in the
FASTQ file.
Although the stitched reads are aligned as a single sequence, the stitched read is split
into individual alignments in the BAM file.
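The scoring rules above can be illustrated with a short Python sketch. It assumes that the score of an overlap is 1 minus its MismatchRate (consistent with a perfect overlap scoring 1), that Read 2 has already been reverse complemented, and that quality scores are given as lists of integers. All names are hypothetical and the sketch is not the module's implementation.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

MIN_OVERLAP = 10               # minimum overlap stated in the text
MIN_SCORE = Fraction(9, 10)    # best overlap must score >= 0.9
MIN_MARGIN = Fraction(1, 10)   # and exceed every other candidate by >= 0.1


def overlap_scores(read1: str, read2_rc: str) -> List[Tuple[int, Fraction]]:
    """Score each overlap of MIN_OVERLAP bases or more as 1 - MismatchRate."""
    scores = []
    for n in range(MIN_OVERLAP, min(len(read1), len(read2_rc)) + 1):
        mismatches = sum(a != b for a, b in zip(read1[-n:], read2_rc[:n]))
        scores.append((n, 1 - Fraction(mismatches, n)))
    return scores


def choose_overlap(read1: str, read2_rc: str) -> Optional[int]:
    """Return the overlap length to stitch at, or None if no candidate qualifies."""
    ranked = sorted(overlap_scores(read1, read2_rc),
                    key=lambda item: item[1], reverse=True)
    if not ranked:
        return None
    best_len, best = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else Fraction(0)
    if best >= MIN_SCORE and best - runner_up >= MIN_MARGIN:
        return best_len
    return None


def stitch(read1: str, qual1: List[int],
           read2_rc: str, qual2_rc: List[int]) -> Optional[Tuple[str, List[int]]]:
    """Stitch two reads; in the overlap keep the base with the higher Q-score."""
    n = choose_overlap(read1, read2_rc)
    if n is None:
        return None  # the pair would be kept as 2 single reads
    bases = list(read1[:-n])
    quals = list(qual1[:-n])
    for b1, q1, b2, q2 in zip(read1[-n:], qual1[-n:], read2_rc[:n], qual2_rc[:n]):
        base, qual = (b1, q1) if q1 >= q2 else (b2, q2)
        bases.append(base)
        quals.append(qual)
    bases.extend(read2_rc[n:])
    quals.extend(qual2_rc[n:])
    return "".join(bases), quals
```

Exact fractions are used instead of floating-point numbers so that the 0.9 and 0.1 thresholds compare exactly at the boundaries.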
Alignment
During the alignment step, the banded Smith-Waterman algorithm aligns clusters from
each sample against amplicon sequences specified in the manifest file.
The banded Smith-Waterman algorithm performs local sequence alignments to
determine similar regions between 2 sequences. Instead of comparing the total sequence,
the Smith-Waterman algorithm compares segments of all possible lengths. Local
alignments are useful for dissimilar sequences that are suspected to contain regions of
similarity within the larger sequence. This process allows alignment across small
amplicon targets, often less than 10 bp.
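As a rough illustration of the idea, the following Python sketch computes a Smith-Waterman local alignment score while filling only the cells within a fixed band around the main diagonal. The scoring values, the band width, and the choice to center the band on the main diagonal are assumptions for the example; the guide does not state the parameters the analysis module uses.

```python
def banded_smith_waterman(query: str, target: str, band: int = 5,
                          match: int = 2, mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local alignment score found inside the band.

    Only cells with |i - j| <= band are computed. Cells outside the band keep
    the value 0, which in local alignment means "start a new alignment here".
    """
    best = 0
    prev = [0] * (len(target) + 1)
    for i in range(1, len(query) + 1):
        curr = [0] * (len(target) + 1)
        lo = max(1, i - band)
        hi = min(len(target), i + band)
        for j in range(lo, hi + 1):
            step = match if query[i - 1] == target[j - 1] else mismatch
            diagonal = prev[j - 1] + step   # align the two bases
            up = prev[j] + gap              # gap in the target
            left = curr[j - 1] + gap        # gap in the query
            curr[j] = max(0, diagonal, up, left)
            best = max(best, curr[j])
        prev = curr
    return best
```

Restricting the computation to a band reduces the work from every cell of the matrix to a strip of width 2 x band + 1, at the cost of missing alignments that lie far from the diagonal. For example, `banded_smith_waterman("ACGT", "TTTTTTTTACGT", band=2)` scores lower than the same call with `band=10`.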
Each paired-end read is evaluated in terms of its alignment to the relevant probe
sequences for that read.
} Read 1 is evaluated against the reverse complement of the Downstream Locus-Specific Oligos (DLSO).
} Read 2 is evaluated against the Upstream Locus-Specific Oligos (ULSO).
} If the start of a read matches a probe sequence with no more than 1 mismatch, the
full length of the read is aligned against the amplicon target for that sequence.
Alignments that include more than 3 indels are filtered from alignment results. Filtered
alignments are written in alignment files as unaligned and are not used in variant
calling.
Variant Calling
Developed by Illumina, the somatic variant caller identifies variants present at low
frequency in the DNA sample.
The somatic variant caller identifies SNPs in 3 steps:
} Considers each position in the reference genome separately
} Counts bases at the given position for aligned reads that overlap the position
} Computes a variant score that measures the quality of the call using a Poisson model.
Variants are first called for each library separately. Then, variants from each library are
compared and combined into a single output file. If a variant meets the following
criteria, the variant is marked as PASS in the variant call (VCF) file:
} The variant is present in both libraries
} The variant has a cumulative depth of 1000 or an average depth of 500x per library
} The variant has a variant frequency of ≥ 2.6% as reported in the merged VCF file
A locus for a mutation or reference is classified as a no call under the following
conditions:
} The variant frequency is near the signal noise level between 1% and 2.6%
} The variant quality is < Q30
} The depth is < 500
} Significant strand bias is detected
} The indel occurs in a homopolymer region
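The PASS and no-call conditions listed above can be expressed as a small Python sketch. The class and function names are hypothetical, the thresholds are taken from the lists above, and treating the stated values as inclusive lower bounds is an assumption. Strand bias and homopolymer detection are passed in as flags because the guide does not give their thresholds here.

```python
from dataclasses import dataclass
from typing import List

MIN_VARIANT_FREQ = 2.6       # percent, PASS threshold in the merged VCF file
NOISE_FLOOR = 1.0            # percent, lower edge of the noise range
MIN_CUMULATIVE_DEPTH = 1000  # summed over Mix A and Mix B
MIN_DEPTH = 500
MIN_QUALITY = 30


@dataclass
class LibraryCall:
    """Result for one library (Mix A or Mix B) at a single position."""
    variant_present: bool
    depth: int


def is_pass(mix_a: LibraryCall, mix_b: LibraryCall, merged_vf_percent: float) -> bool:
    """Apply the three PASS criteria to a merged variant."""
    in_both = mix_a.variant_present and mix_b.variant_present
    # A cumulative depth of 1000 over 2 libraries equals an average of 500 per library.
    deep_enough = mix_a.depth + mix_b.depth >= MIN_CUMULATIVE_DEPTH
    return in_both and deep_enough and merged_vf_percent >= MIN_VARIANT_FREQ


def no_call_reasons(vf_percent: float, quality: float, depth: int,
                    strand_bias: bool, homopolymer_indel: bool) -> List[str]:
    """List every no-call condition that applies to a locus."""
    reasons = []
    if NOISE_FLOOR <= vf_percent < MIN_VARIANT_FREQ:
        reasons.append("variant frequency near noise level")
    if quality < MIN_QUALITY:
        reasons.append("variant quality below Q30")
    if depth < MIN_DEPTH:
        reasons.append("depth below 500")
    if strand_bias:
        reasons.append("strand bias")
    if homopolymer_indel:
        reasons.append("indel in homopolymer region")
    return reasons
```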
View Analysis Results
1 From the Local Run Manager dashboard, click the run name.
2 From the Run Overview tab, review the sequencing run metrics.
3 [Optional] Click the Copy to Clipboard icon for access to the output run folder.
4 Click the Sequencing Information tab to review run parameters and consumables
  information.
5 Click the Samples and Results tab.
6 If analysis was repeated, expand the Select Analysis drop-down list and select the
  appropriate analysis.
7 [Optional] Click the Copy to Clipboard icon for access to the Analysis folder.
Analysis Report
Analysis results are summarized on the Samples and Results tab. The report is also
available in a PDF and *.txt file format for each sample and as an aggregate report in the
Analysis folder.
Sample Information
Table 1 Sample Information Table
Column Heading        Description
Sample ID             The sample ID provided when the run was created.
Sample Description    The sample description, if provided.
Run ID                The name of the run folder.
Variant Results
Table 2 Variant Results Table
Column Heading        Description
Gene                  The gene where the SNV, insertion, or deletion is detected.
Amino Acid Change     Human Genome Variation Society (HGVS) protein notation.
Variant Type          Consequence on protein function.
Nucleotide Change     HGVS nucleotide notation.
Variant Frequency     Fraction of reads in which the variant was detected.
Transcript            Ensembl canonical transcript.
No Calls
Table 3 No Calls Table
Column Heading    Description
Gene              The gene where the no call is located.
Chromosome        The chromosome where the no call is located.
Coordinate        The coordinate where the no call is located.
Failed Filter     The reason for the no call.
                  Low Variant Frequency—The variant frequency is below a
                  cutoff. Identical to LowVariantFreq in the VCF File Filter
                  entry.
                  Low Coverage—The depth of coverage is below a cutoff.
                  Identical to LowDP in the VCF File Filter entry.
                  Low Genotype Quality—The genotyping quality is below a
                  cutoff. Identical to LowGQ in the VCF File Filter entry.
                  Indel Reference Repeat—For an indel, the number of
                  adjacent repeats (1-base or 2-base) in the reference is greater
                  than 8. Identical to R8 in the VCF File Filter entry.
                  Strand Bias—The strand bias is more than the given
                  threshold. Identical to SB in the VCF File Filter entry.
Analysis Output Files
The following analysis output files are generated for the TruSight Tumor 15 analysis
module and provide analysis results for alignment and variant calling. Analysis output
files are located in the Analysis folder.
File Name                       Description
Demultiplexing (*.demux)        Intermediate files containing demultiplexing results.
FASTQ (*.fastq.gz)              Intermediate files containing quality scored base calls.
                                FASTQ files are the primary input for the alignment step.
Alignment files in the          Contains aligned reads for a given sample.
  BAM format (*.bam)
Per-library variant call        Contains information about variants found at specific
  files in the VCF format       positions in a reference genome.
  (*.vcf)
Variant call files in the       Contains the genotype for each position, whether called
  genome VCF format             as a variant or called as a reference.
  (*.genome.vcf)
Merged variant call files in    Contains selected specific coordinates from the gVCF
  the VCF format (*.vcf)        files for Mix A and Mix B for a final merged VCF file for
                                the sample.
RunMetricsReport.txt            The Run Metrics Report shows run metrics and
                                suggested values to determine if run quality results are
                                within an acceptable range. For Read 1 and Read 2,
                                shows the average percentage of bases ≥ Q30, which is a
                                quality score (Q-score) measurement. A Q-score is a
                                prediction of the probability of a wrong base call.
SampleMetricsReport.txt         Provides calculations from the gVCF file for each sample
                                in Mix A and Mix B.
Filtered gVCF File              Provides a report for a subset of variants listed in the
  Report.vcf                    TruSight Tumor 15 Report Definition File.
AmpliconCoverage_M1.tsv         Contains information about coverage per amplicon per
                                sample for each manifest provided. M# represents the
                                manifest number.
*.ant files                     Information from the *.gVCF and filtered *.vcf files is
                                also provided in the *.ant file annotation format.
Demultiplexing File Format
The process of demultiplexing reads the index sequence attached to each cluster to
determine from which sample the cluster originated. The mapping between clusters and
sample numbers is written to 1 demultiplexing (*.demux) file for each tile of the flow
cell.
Demultiplexing files are binary files written to the L001 folder in
Data\Intensities\BaseCalls\L001. The file naming format is s_1_X.demux, where X is
the tile number.
Demultiplexing files start with a header:
} Version (4-byte integer), currently 1
} Cluster count (4-byte integer)
The remainder of the file consists of sample numbers for each cluster from the tile.
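The layout described above can be read with Python's `struct` module. The sketch below is illustrative only: the guide does not state the byte order, the signedness of the header integers, or how many bytes each sample number occupies, so little-endian signed integers are assumed and the width per cluster is inferred from the file size. The function names are hypothetical.

```python
import struct
from typing import List, Tuple

# Assumption: the two header fields are little-endian signed 4-byte integers.
HEADER = struct.Struct("<ii")


def parse_demux_header(data: bytes) -> Tuple[int, int, bytes]:
    """Split *.demux content into version, cluster count, and the remaining bytes."""
    if len(data) < HEADER.size:
        raise ValueError("file is shorter than the 8-byte header")
    version, cluster_count = HEADER.unpack_from(data)
    return version, cluster_count, data[HEADER.size:]


def sample_numbers(body: bytes, cluster_count: int) -> List[int]:
    """Decode one sample number per cluster, inferring the width from the body size."""
    if cluster_count == 0:
        return []
    width, remainder = divmod(len(body), cluster_count)
    if width == 0 or remainder:
        raise ValueError("body length is not a whole number of bytes per cluster")
    return [int.from_bytes(body[i:i + width], "little")
            for i in range(0, len(body), width)]
```

A typical use would read a tile file in binary mode, for example `open(path, "rb").read()`, and pass the bytes to `parse_demux_header`. A sample number of 0 marks a cluster that was not assigned to a sample.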
When the demultiplexing step is complete, a demultiplexing summary file named
DemultiplexSummaryF1L1.txt is written to the Analysis folder.
} In the file name, F1 represents the flow cell number.
} In the file name, L1 represents the lane number.
} The file reports demultiplexing results in a table with 1 row per tile and 1 column
per sample, including sample 0.
} The file also lists the most commonly occurring sequences in index reads.
FASTQ File Format
FASTQ is a text-based file format that contains base calls and quality values per read.
Each record contains 4 lines:
} The identifier
} The sequence
} A plus sign (+)
} The quality scores in an ASCII encoded format
The identifier is formatted as:
@Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber
Example:
@SIM:1:FCX:1:15:6329:1045 1:N:0:2
TCGCACTCAACGCCCTGCATATGACAAGACAGAATC
+
<>;##=><9=AAAAAAAAAA9#:<#<;<<<????#=
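The identifier format above can be split into its fields with a few lines of Python. This is a sketch with hypothetical names. The quality decoding assumes the common Phred+33 ASCII encoding, which the guide does not state explicitly.

```python
from typing import List, NamedTuple


class FastqIdentifier(NamedTuple):
    instrument: str
    run_id: str
    flow_cell_id: str
    lane: int
    tile: int
    x: int
    y: int
    read_num: int
    filter_flag: str
    sample_number: int


def parse_identifier(line: str) -> FastqIdentifier:
    """Parse '@Instrument:RunID:FlowCellID:Lane:Tile:X:Y ReadNum:FilterFlag:0:SampleNumber'."""
    if not line.startswith("@"):
        raise ValueError("identifier line must start with '@'")
    location, read_info = line[1:].strip().split(" ", 1)
    instrument, run_id, flow_cell_id, lane, tile, x, y = location.split(":")
    read_num, filter_flag, _zero, sample_number = read_info.split(":")
    return FastqIdentifier(instrument, run_id, flow_cell_id, int(lane), int(tile),
                           int(x), int(y), int(read_num), filter_flag,
                           int(sample_number))


def decode_qualities(quality_line: str, offset: int = 33) -> List[int]:
    """Convert ASCII-encoded quality characters to integer Q-scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_line]
```

Applied to the example record, `parse_identifier` reports lane 1, tile 15, Read 1, and sample number 2, and the sequence and quality lines both contain 36 characters.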
FASTQ File Names
FASTQ files are named with the sample name and the sample number. The sample
number is a numeric assignment based on the order that the sample is listed for the run.
For example:
Data\Intensities\BaseCalls\samplename_S1_L001_R1_001.fastq.gz
} samplename—The sample name listed for the sample. If a sample name is not
provided, the file name includes the sample ID.
} S1—The sample number based on the order that samples are listed for the run
starting with 1. In this example, S1 indicates that this sample is the first sample
listed for the run.
NOTE
Reads that cannot be assigned to any sample are written to a FASTQ file for sample
number 0, and excluded from downstream analysis.
} L001—The lane number.
} R1—The read. In this example, R1 means Read 1. For a paired-end run, a file from
Read 2 includes R2 in the file name.
} 001—The last segment is always 001.
FASTQ files are compressed in the GNU zip format, as indicated by *.gz in the file name.
FASTQ files can be uncompressed using tools such as gzip (command-line) or 7-zip
(GUI).
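The naming convention can be checked programmatically. The following Python sketch uses a regular expression written for this example (the pattern and function name are not part of the software) to split a FASTQ file name into its components.

```python
import re
from pathlib import PureWindowsPath
from typing import Dict, Union

# Pattern for names such as samplename_S1_L001_R1_001.fastq.gz
FASTQ_NAME = re.compile(
    r"^(?P<sample>.+)_S(?P<sample_number>\d+)_L(?P<lane>\d{3})"
    r"_R(?P<read>\d)_001\.fastq\.gz$"
)


def parse_fastq_name(path: str) -> Dict[str, Union[str, int]]:
    """Extract sample name, sample number, lane, and read from a FASTQ path."""
    # PureWindowsPath accepts both backslash and forward slash separators.
    name = PureWindowsPath(path).name
    match = FASTQ_NAME.match(name)
    if match is None:
        raise ValueError(f"not a recognized FASTQ file name: {name}")
    return {
        "sample": match["sample"],
        "sample_number": int(match["sample_number"]),
        "lane": int(match["lane"]),
        "read": int(match["read"]),
    }
```

The compressed files can be read directly with Python's standard `gzip` module, for example `gzip.open(path, "rt")`, without uncompressing them first.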
BAM File Format
A BAM file (*.bam) is the compressed binary version of a SAM file that is used to
represent aligned sequences up to 128 Mb. SAM and BAM formats are described in
detail at https://samtools.github.io/hts-specs/SAMv1.pdf.
BAM files are written to the Analysis folder and use the file naming format of
SampleName_S#.bam, where # is the sample number determined by the order that
samples are listed for the run.
BAM files contain a header section and an alignments section:
} Header—Contains information about the entire file, such as sample name, sample
length, and alignment method. Alignments in the alignments section are associated
with specific information in the header section.
} Alignments—Contains read name, read sequence, read quality, alignment
information, and custom tags. The read name includes the chromosome, start
coordinate, alignment quality, and the match descriptor string.
The alignments section includes the following information for each read or read pair:
} RG: Read group, which indicates the number of reads for a specific sample.
} BC: Barcode tag, which indicates the demultiplexed sample ID associated with the
read.
} SM: Single-end alignment quality.
} AS: Paired-end alignment quality.
} NM: Edit distance tag, which records the Levenshtein distance between the read and
the reference.
} XN: Amplicon name tag, which records the amplicon tile ID associated with the
read.
BAM files are suitable for viewing with an external viewer such as IGV or the UCSC
Genome Browser.
BAM index files (*.bam.bai) provide an index of the corresponding BAM file.
VCF File Format
VCF is a widely used file format developed by the genomics scientific community that
contains information about variants found at specific positions in a reference genome.
VCF files use the file naming format SampleName_S#.vcf, where # is the sample number
determined by the order that samples are listed for the run.
VCF File Header—Includes the VCF file format version and the variant caller version.
The header lists the annotations used in the remainder of the file. If MARS is listed, the
Illumina internal annotation algorithm annotated the VCF file. The VCF header includes
the reference genome file and .bam file. The last line in the header contains the column
headings for the data lines.
VCF File Data Lines—Each data line contains information about a single variant.
VCF File Headings
Heading    Description
CHROM      The chromosome of the reference genome. Chromosomes appear in
           the same order as the reference FASTA file.
POS        The single-base position of the variant in the reference chromosome.
           For SNPs, this position is the reference base with the variant; for
           insertions or deletions, this position is the reference base immediately
           before the variant.
REF        The reference genotype. For example, a deletion of a single T is
           represented as reference TT and alternate T. An A to T single nucleotide
           variant is represented as reference A and alternate T.
ALT        The alleles that differ from the reference read.
           For example, an insertion of a single T is represented as reference A and
           alternate AT. An A to T single nucleotide variant is represented as
           reference A and alternate T.
QUAL       A Phred-scaled quality score assigned by the variant caller.
           Higher scores indicate higher confidence in the variant and lower
           probability of errors. For a quality score of Q, the estimated probability
           of an error is 10^(-Q/10). For example, the set of Q30 calls has a 0.1% error
           rate. Many variant callers assign quality scores based on their statistical
           models, which are high in relation to the error rate observed.
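The relationship between a Phred-scaled quality score and the error probability can be verified with a short Python sketch (function names are chosen for this example):

```python
import math


def phred_to_error_probability(q: float) -> float:
    """Estimated probability of an error for quality score Q: 10^(-Q/10)."""
    return 10 ** (-q / 10)


def error_probability_to_phred(p: float) -> float:
    """Inverse conversion: Q = -10 * log10(p)."""
    if not 0 < p <= 1:
        raise ValueError("probability must be in the range (0, 1]")
    return -10 * math.log10(p)
```

For Q30, the first function gives an error probability of about 0.001, which is the 0.1% error rate mentioned in the QUAL description. Each additional 10 points of Q divides the error probability by 10.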
VCF File Annotations
Heading    Description
FILTER     If all filters are passed, PASS is written in the filter column.
           • LowDP—Applied to sites with depth of coverage below a cutoff.
           • LowGQ—The genotyping quality (GQ) is below a cutoff.
           • LowQual—The variant quality (QUAL) is below a cutoff.
           • LowVariantFreq—The variant frequency is less than the given threshold.
           • R8—For an indel, the number of adjacent repeats (1-base or 2-base) in the
             reference is greater than 8.
           • SB—The strand bias is more than the given threshold.
INFO       Possible entries in the INFO column include:
           • CSQ—Consequence as predicted by Illumina Annotation Engine (IAE).
           • DP—The depth (number of base calls aligned to a position and used in variant
             calling).
FORMAT     The format column lists fields separated by colons. For example, GT:GQ.
           Available fields include:
           • AD—Entry of the form X,Y, where X is the number of reference calls, and Y is
             the number of alternate calls.
           • GQ—Genotype quality.
           • GQX—Genotype quality. GQX is the minimum of the GQ value and the QUAL
             column. In general, these values are similar; taking the minimum makes GQX
             the more conservative measure of genotype quality.
           • GT—Genotype. 0 corresponds to the reference base, 1 corresponds to the first
             entry in the ALT column, and so on. The forward slash (/) indicates that no
             phasing information is available.
           • NL—Noise level; an estimate of base calling noise at this position.
           • SB—Strand bias at this position. Larger negative values indicate less bias;
             values near 0 indicate more bias. Used with the somatic variant caller and
             GATK.
           • VF—Variant frequency; the percentage of reads supporting the alternate allele.
SAMPLE     The sample column gives the values specified in the FORMAT column.
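To show how the FORMAT and SAMPLE columns pair up, the following Python sketch parses one tab-separated VCF data line into a dictionary. The function name is hypothetical, the sketch assumes the standard column order of the VCF specification with a single sample column, and any data line used with it here is made up for illustration rather than taken from real output.

```python
from typing import Any, Dict


def parse_vcf_data_line(line: str) -> Dict[str, Any]:
    """Parse one VCF data line that has a single SAMPLE column."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 10:
        raise ValueError("expected at least 10 tab-separated columns")
    chrom, pos, _id, ref, alt, qual, filt, info, fmt, sample = fields[:10]

    # INFO entries are separated by semicolons; flags have no '=' and map to ''.
    info_map = {}
    for entry in info.split(";"):
        key, _, value = entry.partition("=")
        info_map[key] = value

    return {
        "CHROM": chrom,
        "POS": int(pos),
        "REF": ref,
        "ALT": alt.split(","),
        "QUAL": qual,
        "FILTER": filt.split(";"),
        "INFO": info_map,
        # The FORMAT keys name the colon-separated values in the SAMPLE column.
        "SAMPLE": dict(zip(fmt.split(":"), sample.split(":"))),
    }
```

For a FORMAT column of `GT:GQ` and a SAMPLE column of `0/1:100`, the `SAMPLE` entry of the result is `{"GT": "0/1", "GQ": "100"}`. Values are kept as strings because their interpretation depends on the field.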
Genome VCF Files
Genome VCF (gVCF) files are VCF v4.1 files that follow a set of conventions for
representing all sites within the genome in a reasonably compact format. The gVCF files
include all sites within the region of interest in a single file for each sample.
The gVCF file shows no-calls at positions with low coverage, or where a low-frequency
variant (< 2.6%) occurs often enough (> 1%) that the position cannot be called to the
reference. A genotype (GT) tag of ./. indicates a no-call.
For more information, see sites.google.com/site/gvcftools/home/about-gvcf.
Per-Library and Merged gVCF Files
The TruSight Tumor 15 workflow generates 2 sets of variant call files.
} Per-library VCF and gVCF files—Contain variants called in either library. Per-library files are written to the Libraries folder.
} Merged gVCF files—Contain variants called from both libraries. Merged files are
written to the Analysis folder.
Per-library files include both VCF (*.vcf) and gVCF (*.genome.vcf) files, and use the
following naming convention, where S# represents the order that the sample is listed for
the run:
} Reports for all sites—SampleName_S#.genome.vcf
} Reports variants only—SampleName_S#.vcf
Merged gVCF files use the following naming convention:
} Reports for all sites—SampleName.genome.vcf
For filtered gVCF files, see Filtered gVCF File Report.
Per-Library VCF Files
Variants are called in the Mix A library and the Mix B library to produce an
independent set of VCF files for each library. The set of per-library VCF files include both
VCF and gVCF files.
Merged gVCF Files
The software selects specific coordinates from the gVCF files generated for Mix A and
Mix B to create a final merged VCF file for the sample.
Merged gVCF files are written to the Analysis folder.
Run Metrics Report
Table 4 Run Metrics Report Table
Column Heading    Description
Metric            Type of metric.
Reads PF (%)      The percentage of reads passing filter.
Q30+(R1)          The percentage of bases in Read 1 with a quality score of 30
                  (Q30) or greater.
Q30+(R2)          The percentage of bases in Read 2 with a quality score of 30
                  (Q30) or greater.
Value             The percentage reported for the metric.
Sample Metrics Report
Table 5 Sample Metrics Results Table
Column Heading        Description
Sample ID             The sample ID provided when the run was created.
MixABases≥500x (%)    For the sample, the percentage of bases in library A that have
                      ≥ 500x coverage.
MixBBases≥500x (%)    For the sample, the percentage of bases in library B that have
                      ≥ 500x coverage.
MixAOnTarget (%)      For the sample, the percentage of reads in library A that aligned
                      to the manifest.
MixBOnTarget (%)      For the sample, the percentage of reads in library B that aligned
                      to the manifest.
Filtered gVCF File Report
The TruSight Tumor 15 workflow provides variant calls for all genes specified in the
manifest files. The workflow also filters gVCF file information for a subset of variants.
The subset is listed in the TruSight Tumor 15 Report Definition File, which is included with
the Local Run Manager software installer. When a variant from this list is detected, it is
added to a filtered gVCF file report that is provided in the *.vcf format. For more
information on gVCF files, see Genome VCF Files.
Filtered gVCF files are written to the Analysis folder and use the following naming
convention:
} SampleName_Report.vcf
Variants are listed in the VCF file using the following criteria:
} Include variants that were flagged as filtered
} Exclude variants with a variant frequency of less than 2.6%
} Variants that pass filters include PASS in the FILTER column
} Variants that fail filters include the filter name in the FILTER column
} Filter variants due to probe bias (PB) when the variant frequency differs significantly
between libraries
Amplicon Coverage File
An amplicon coverage file is generated for each manifest file. The M# in the file name
represents the manifest number as it is listed in the samples table for the run.
Each file includes a header row that contains the sample IDs associated with the
manifest. Under the header row are 3 columns that list the following information:
} The Target ID as it is listed in the manifest.
} The coverage depth of reads passing filter.
} The total coverage depth.
Supplementary Output Files
The following output files provide supplementary information, or summarize run results
and analysis errors. Although these files are not required for assessing analysis results,
they can be used for troubleshooting purposes. All files are located in the Analysis folder
unless otherwise specified.
File Name                                  Description
AnalysisLog.txt                            Processing log that describes every step that occurred
                                           during analysis of the current run folder. This file does
                                           not contain error messages.
                                           Located in the root level of the run folder.
AnalysisError.txt                          Processing log that lists any errors that occurred
                                           during analysis. This file is present only if errors
                                           occurred.
                                           Located in the root level of the run folder.
CompletedJobInfo.xml                       Written after analysis is complete, contains information
                                           about the run, such as date, flow cell ID, software
                                           version, and other parameters.
                                           Located in the root level of the run folder.
DemultiplexSummaryF1L1.txt                 Reports demultiplexing results in a table with 1 row
                                           per tile and 1 column per sample.
ErrorsAndNoCallsByLaneTileReadCycle.csv    A comma-separated values file that contains the
                                           percentage of errors and no-calls for each tile, read,
                                           and cycle.
Mismatch.htm                               Contains histograms of mismatches per cycle and
                                           no-calls per cycle for each tile.
AmpliconRunStatistics.xml                  Contains summary statistics specific to the run.
                                           Located in the root level of the run folder.
Summary.xml                                Contains a summary of mismatch rates and other base
                                           calling results.
Summary.htm                                Contains a summary web page generated from
                                           Summary.xml.
Analysis Folder
The analysis folder holds the files generated by the Local Run Manager software.
The relationship between the output folder and analysis folder is summarized as follows:
} During sequencing, Real-Time Analysis (RTA) populates the output folder with files
generated during image analysis, base calling, and quality scoring.
} RTA copies files to the analysis folder in real time. After RTA assigns a quality score
to each base for each cycle, the software writes the file RTAComplete.xml to both
folders.
} When the file RTAComplete.xml is present, analysis begins.
} As analysis continues, Local Run Manager writes output files to the analysis folder,
and then copies the files back to the output folder.
Folder Structure
Data
  Intensities
    BaseCalls
      Analysis—Contains *.bam and *.vcf files, and files specific to the analysis
      module.
      L001—Contains one subfolder per cycle, each containing *.bcl files.
      Sample1_S1_L001_R1_001.fastq.gz
      Sample2_S2_L001_R1_001.fastq.gz
      Undetermined_S0_L001_R1_001.fastq.gz
    L001—Contains *.locs files, 1 for each tile.
  RTA Logs—Contains log files from RTA software analysis.
InterOp—Contains binary files used by Sequencing Analysis Viewer (SAV).
Logs—Contains log files describing steps performed during sequencing.
Queued—A working folder for software; also called the copy folder.
AnalysisError.txt
AnalysisLog.txt
CompletedJobInfo.xml
QueuedForAnalysis.txt
AmpliconRunStatistics.xml
RTAComplete.xml
RunInfo.xml
runParameters.xml
Analysis Folders
Each time that analysis is requeued, the Local Run Manager creates an Analysis folder
named Analysis_N, where N is a sequential number.
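When several Analysis_N folders exist, a script can pick the most recent one by comparing the numeric suffix rather than sorting the names as text. The following Python sketch uses a hypothetical helper name and takes a plain list of folder names, because the guide does not state where the Analysis_N folders are created.

```python
import re
from typing import Iterable, Optional

ANALYSIS_FOLDER = re.compile(r"^Analysis_(\d+)$")


def latest_analysis_folder(names: Iterable[str]) -> Optional[str]:
    """Return the Analysis_N name with the highest N, or None if there is none."""
    best_name = None
    best_number = -1
    for name in names:
        match = ANALYSIS_FOLDER.match(name)
        if match and int(match.group(1)) > best_number:
            best_number = int(match.group(1))
            best_name = name
    return best_name
```

Comparing the numbers as integers matters because a text sort would place Analysis_10 before Analysis_2.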
Technical Assistance
For technical assistance, contact Illumina Technical Support.
Table 6 Illumina General Contact Information
Website    www.illumina.com
Email      [email protected]
Table 7 Illumina Customer Support Telephone Numbers
Region             Contact Number
North America      1.800.809.4566
Australia          1.800.775.688
Austria            0800.296575
Belgium            0800.81102
China              400.635.9898
Denmark            80882346
Finland            0800.918363
France             0800.911850
Germany            0800.180.8994
Hong Kong          800960230
Ireland            1.800.812949
Italy              800.874909
Japan              0800.111.5011
Netherlands        0800.0223859
New Zealand        0800.451.650
Norway             800.16836
Singapore          1.800.579.2745
Spain              900.812168
Sweden             020790181
Switzerland        0800.563118
Taiwan             00806651752
United Kingdom     0800.917.0041
Other countries    +44.1799.534000
Safety data sheets (SDSs)—Available on the Illumina website at
support.illumina.com/sds.html.
Product documentation—Available for download in PDF from the Illumina website. Go
to support.illumina.com, select a product, then select Documentation & Literature.
Illumina
5200 Illumina Way
San Diego, California 92122 U.S.A.
+1.800.809.ILMN (4566)
+1.858.202.4566 (outside North America)
[email protected]
www.illumina.com