MAGMA
Here we briefly introduce the main outputs of a MAGMA pipeline run. Please note that some outputs are optional and are only generated when specific parameters are set.
Tutorials and Presentations
Tim Huepink and Lennert Verboven created an in-depth tutorial of the features of the variant calling in MAGMA:
We have also included a presentation (in PDF format) of the logic and workflow of the MAGMA pipeline, as well as posters that have been presented at conferences. Please refer to the docs folder.
Interpretation
The results directory produced by MAGMA is as follows:
/path/to/results_dir/
.
├── QC_statistics
├── analyses
└── vcf_files
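A quick sanity check after a run is to confirm that the three top-level directories listed above are present. The following is a minimal sketch and not part of MAGMA itself: the function name `missing_output_dirs` is hypothetical, and the directory names are simply copied from the listing above.

```python
from pathlib import Path

# Top-level directories named in the results listing above.
EXPECTED_DIRS = ("QC_statistics", "analyses", "vcf_files")


def missing_output_dirs(results_dir):
    """Return the expected top-level directories that are absent (hypothetical helper)."""
    root = Path(results_dir)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]
```

Calling `missing_output_dirs("/path/to/results_dir")` returns an empty list when all three directories exist.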
QC Statistics Directory
In this directory you will find files related to the quality control carried out by the MAGMA pipeline. The structure is as follows:
/path/to/results_dir/QC_statistics
├── cohort
│   ├── fastq_validation
│   └── multiqc
│       └── multiqc_data
└── per_sample
    ├── coverage
    ├── fastqc
    └── mapping
- cohort
Here you will find the joint.merged_cohort_stats.tsv file, which contains the QC statistics for all samples in the samplesheet and allows users to determine why certain samples failed to be incorporated in the cohort analysis steps.
In addition, you'll find the cohort-level MultiQC report generated from the per_sample/fastqc analysis and the fastq validation report in json format.
- per_sample/coverage
Contains the GATK WGSMetrics outputs for each of the samples in the samplesheet
- per_sample/mapping
Contains the FlagStat and samtools stats for each of the samples in the samplesheet
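The cohort statistics table in QC_statistics/cohort can also be inspected programmatically. The sketch below assumes only that joint.merged_cohort_stats.tsv is tab-separated with a header row. The column names are not documented here, so the helper takes them as arguments; both function names are hypothetical.

```python
import csv


def read_cohort_stats(tsv_path):
    """Load a tab-separated table with a header row into a list of dicts."""
    with open(tsv_path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def samples_not_matching(rows, sample_column, flag_column, passing_value):
    """Return sample names whose flag column differs from the passing value.

    The column names and the passing value are supplied by the caller,
    because the real header of the file is not assumed here.
    """
    return [row[sample_column] for row in rows if row[flag_column] != passing_value]
```

A reasonable first step is to load the file with `read_cohort_stats` and print `rows[0].keys()` to see which columns are actually present before choosing what to filter on.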
Analysis Directory
/path/to/results_dir/analyses
├── cluster_analysis
├── drug_resistance
├── non-tuberculous_mycobacteria
├── phylogeny
├── spotyping
└── snp_distances
- Cluster Analysis
Contains files related to clustering based on 5 SNP and 12 SNP cutoffs, both including and excluding complex regions. The .figtree files can be imported directly into FigTree for visualisation.
- Drug Resistance
Organised based on the different types of variants as well as combined results:
/path/to/results_dir/analyses/drug_resistance
├── combined_resistance_summaries
├── combined_resistance_summaries_mixed_infection_samples
├── major_variants_xbs
├── minor_variants_lofreq
├── structural_variants_delly
└── tbprofiler_fastq
Each of the directories containing results for the different variant types (major | minor | structural) has text files that can be used to annotate the .treefiles produced by MAGMA in iTOL (https://itol.embl.de).
The combined resistance results file contains a per-sample drug resistance summary based on the WHO Catalogue of Mtb mutations (https://www.who.int/publications/i/item/9789240082410)
MAGMA also notes the presence of all variants in tier 1 and tier 2 drug resistance genes.
MAGMA generates mixed infection reports and can optionally run tbprofiler on the fastq files for comparison purposes.
- Non-Tuberculous Mycobacteria (NTM)
Contains a brief report of NTM presence in the submitted samples, organised into cohort and per_sample directories.
- Phylogeny
Contains the outputs of the IQTree phylogenetic tree construction.
:memo: By default we recommend that you use the ExDRIncComplex files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the Mtb genome
- SNP distances
Contains the SNP distance tables in tsv format.
:memo: By default we recommend that you use the ExDRIncComplex files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the Mtb genome
- Spotyping
Contains a spoligotyping pattern prediction using SpoTyping.
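To illustrate what a SNP cutoff means for the cluster analysis and SNP distance outputs, the sketch below groups samples by single linkage: two samples end up in the same cluster whenever a chain of pairwise distances at or below the cutoff connects them. This is an illustration only. This document does not state which clustering algorithm MAGMA uses, and the function name, the input format (a dict of pairwise distances), the inclusive cutoff, and the example values are all assumptions.

```python
def cluster_by_snp_cutoff(distances, cutoff):
    """Single-linkage clustering of samples from pairwise SNP distances.

    distances: dict mapping (sample_a, sample_b) -> SNP distance.
    cutoff: pairs with distance <= cutoff are linked (inclusive by assumption).
    Returns a sorted list of sorted sample lists, one per cluster.
    """
    parent = {}

    def find(x):
        # Walk up to the cluster representative, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        root_a, root_b = find(a), find(b)  # also registers both samples
        if dist <= cutoff and root_a != root_b:
            parent[root_a] = root_b

    clusters = {}
    for sample in parent:
        clusters.setdefault(find(sample), []).append(sample)
    return sorted(sorted(members) for members in clusters.values())


# Made-up distances for four hypothetical samples.
example = {
    ("A", "B"): 3, ("B", "C"): 10, ("A", "C"): 11,
    ("C", "D"): 40, ("A", "D"): 42, ("B", "D"): 41,
}
print(cluster_by_snp_cutoff(example, 5))   # [['A', 'B'], ['C'], ['D']]
print(cluster_by_snp_cutoff(example, 12))  # [['A', 'B', 'C'], ['D']]
```

With the looser 12 SNP cutoff, sample C joins the A and B cluster, while D stays on its own under both cutoffs.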
vcf_files Directory
/path/to/results_dir/vcf_files
├── cohort
│   ├── combined_variant_files
│   ├── minor_variants
│   ├── multiple_alignment_files
│   ├── raw_variant_files
│   ├── snp_variant_files
│   └── structural_variants
└── per_sample
    ├── minor_variants
    ├── raw_variant_files
    └── structural_variants
- Combined variant files
Contains the cohort gvcfs based on major variants detected by the MAGMA pipeline
- Minor variants
Merged vcfs of all samples, generated by LoFreq
- Multiple alignment files
FASTA files for the generation of phylogenetic trees by IQTree
- Raw variant files
Unfiltered indels and SNPs detected by the MAGMA pipeline
- SNP variant files
Filtered SNPs detected by the MAGMA pipeline
- Structural variant files
Unfiltered structural variants detected by the MAGMA pipeline
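As a quick check on any of these files, the sketch below counts the variant records in a VCF, treating lines that start with `#` as header lines as defined by the VCF format. It uses only the Python standard library. The function name is hypothetical, and it assumes a file is gzip or bgzip compressed exactly when its name ends in `.gz`.

```python
import gzip


def count_vcf_records(vcf_path):
    """Count non-header, non-empty lines in a plain or gzip-compressed VCF."""
    # Assumption: compressed files carry a .gz suffix; everything else is plain text.
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return sum(1 for line in handle if line.strip() and not line.startswith("#"))
```

Comparing the count for a raw variant file with the count for the matching filtered SNP file gives a rough idea of how many calls were removed by filtering.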
Libraries Directory
Contains files related to FASTQ validation and FASTQC analysis
Samples Directory
Contains vcf files for major|minor|structural variants for each individual sample