MAGMA

Here we briefly introduce the main outputs of a MAGMA pipeline run. Please note that some outputs are optional and are only generated when specific parameters are set.

Tutorials and Presentations

Tim Huepink and Lennert Verboven created an in-depth tutorial of the features of the variant calling in MAGMA:

Video

We have also included a presentation (in PDF format) of the logic and workflow of the MAGMA pipeline, as well as posters that have been presented at conferences. Please refer to the docs folder.

Interpretation

The results directory produced by MAGMA is as follows:

/path/to/results_dir/
.
├── QC_statistics
├── analyses
└── vcf_files
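
As a quick sanity check after a run, the top-level layout above can be verified with a few lines of Python. This is a minimal sketch, not part of MAGMA: the function name `missing_output_dirs` is hypothetical, and the folder names are simply copied from the tree above.

```python
from pathlib import Path

# Top-level folders copied from the results tree shown above.
EXPECTED_DIRS = ("QC_statistics", "analyses", "vcf_files")


def missing_output_dirs(results_dir):
    """Return the expected top-level folders that are absent from results_dir."""
    root = Path(results_dir)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    # "/path/to/results_dir" is the placeholder path used in this document.
    for name in missing_output_dirs("/path/to/results_dir"):
        print(f"missing: {name}")
```

Because some outputs are optional, a missing sub-folder deeper in the tree does not necessarily indicate a failed run.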

QC Statistics Directory

In this directory you will find files related to the quality control carried out by the MAGMA pipeline. The structure is as follows:

/path/to/results_dir/QC_statistics
├── cohort
│   ├── fastq_validation
│   └── multiqc
│       └── multiqc_data
└── per_sample
    ├── coverage
    ├── fastqc
    └── mapping
  • cohort

Here you will find the joint.merged_cohort_stats.tsv file, which contains the QC statistics for all samples in the samplesheet and allows users to determine why certain samples were not incorporated in the cohort analysis steps.

In addition, you’ll find the cohort-level MultiQC report generated from the per_sample/fastqc analysis and the FASTQ validation report in JSON format.
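
The cohort statistics file can be inspected with Python's standard `csv` module. The sketch below assumes only that the file is tab-separated with a header row; the column names are not listed in this document, so the `column` and `value` arguments are placeholders to be replaced after looking at the actual header.

```python
import csv


def load_cohort_stats(tsv_path):
    """Read a tab-separated file with a header row into a list of dicts."""
    with open(tsv_path, newline="") as handle:
        return list(csv.DictReader(handle, delimiter="\t"))


def rows_matching(rows, column, value):
    """Return rows whose `column` equals `value` (compared as strings)."""
    return [row for row in rows if row.get(column) == value]
```

A typical first step would be `rows = load_cohort_stats(".../joint.merged_cohort_stats.tsv")` followed by `print(list(rows[0].keys()))` to see which columns are available before filtering.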

  • per_sample/coverage

Contains the GATK WGSMetrics outputs for each of the samples in the samplesheet

  • per_sample/mapping

Contains the FlagStat and samtools stats for each of the samples in the samplesheet

Analysis Directory

/path/to/results_dir/analysis
├── cluster_analysis
├── drug_resistance
├── non-tuberculous_mycobacteria
├── phylogeny
├── spotyping
└── snp_distances
  • Cluster Analysis

Contains files related to clustering based on 5 SNP and 12 SNP cutoffs, both including and excluding complex regions.

.figtree files: these can be imported directly into FigTree for visualisation.
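
To illustrate what a SNP cutoff means in practice, here is a minimal sketch of threshold-based (single-linkage) clustering. It is not MAGMA's implementation, which this document does not describe: the function name, the input format, and the choice of "less than or equal to" for the cutoff are all assumptions made for illustration.

```python
def cluster_by_snp_cutoff(samples, distances, cutoff):
    """Group samples so that any pair within `cutoff` SNPs shares a cluster.

    `distances` maps (sample_a, sample_b) tuples to SNP counts.
    Assumption: a pair is linked when its distance is <= cutoff.
    """
    parent = {s: s for s in samples}

    def find(s):
        # Follow parent links to the cluster representative (path halving).
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    for (a, b), dist in distances.items():
        if dist <= cutoff:
            parent[find(a)] = find(b)

    clusters = {}
    for s in samples:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())
```

With this kind of linkage, a wider cutoff can only merge clusters, never split them, which is why 12 SNP clusters contain the 5 SNP clusters.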

  • Drug Resistance

Organised based on the different types of variants as well as combined results:

/path/to/results_dir/analysis/drug_resistance
├── combined_resistance_summaries
├── combined_resistance_summaries_mixed_infection_samples
├── major_variants_xbs
├── minor_variants_lofreq
├── structural_variants_delly
└── tbprofiler_fastq

Each of the directories containing results for the different variant types (major | minor | structural) has text files that can be used to annotate the .treefile outputs produced by MAGMA in iTOL (https://itol.embl.de).

The combined resistance results file contains a per-sample drug resistance summary based on the WHO Catalogue of Mtb mutations (https://www.who.int/publications/i/item/9789240082410)

MAGMA also notes the presence of all variants in tier 1 and tier 2 drug resistance genes.

MAGMA will generate mixed infection reports and can optionally run tbprofiler on the FASTQ files for comparison purposes.

  • Non-Tuberculous Mycobacteria (NTM)

Contains a brief report of NTM presence in the submitted samples, organised in a cohort and per_sample structure.

  • Phylogeny

Contains the outputs of the IQTree phylogenetic tree construction.

:memo: By default we recommend that you use the ExDRIncComplex files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the Mtb genome

  • SNP distances

Contains the SNP distance tables in tsv format.

:memo: By default we recommend that you use the ExDRIncComplex files as MAGMA was optimized to be able to accurately call positions on the edges of complex regions in the Mtb genome
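
The distance tables can be loaded for further analysis with the standard library. The sketch below assumes a square matrix layout (sample names in the first row and first column, integer distances in the cells), which this document does not confirm; check one of the tsv files before relying on it. Both function names are hypothetical.

```python
import csv


def read_distance_matrix(tsv_path):
    """Parse a square TSV matrix into {(row_sample, col_sample): distance}."""
    with open(tsv_path, newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    columns = rows[0][1:]  # the first header cell is a corner label
    matrix = {}
    for row in rows[1:]:
        if not row:
            continue  # skip blank lines
        for col_name, cell in zip(columns, row[1:]):
            matrix[(row[0], col_name)] = int(cell)
    return matrix


def close_pairs(matrix, cutoff):
    """List each unordered pair once when its distance is at most `cutoff`."""
    return sorted(
        (a, b, d) for (a, b), d in matrix.items() if a < b and d <= cutoff
    )
```

For example, `close_pairs(read_distance_matrix(path), 12)` would list all sample pairs at most 12 SNPs apart, assuming the layout described above.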

  • Spotyping

Contains a spoligotyping pattern prediction using SpoTyping.

vcf_files Directory

/path/to/results_dir/vcf_files
├── cohort
│   ├── combined_variant_files
│   ├── minor_variants
│   ├── multiple_alignment_files
│   ├── raw_variant_files
│   ├── snp_variant_files
│   └── structural_variants
└── per_sample
    ├── minor_variants
    ├── raw_variant_files
    └── structural_variants
  • Combined variant files

Contains the cohort gvcfs based on major variants detected by the MAGMA pipeline

  • Minor variants

Merged vcfs of all samples, generated by LoFreq

  • Multiple alignment files

FASTA files for the generation of phylogenetic trees by IQTree

  • Raw variant files

Unfiltered indels and SNPs detected by the MAGMA pipeline

  • SNP variant files

Filtered SNPs detected by the MAGMA pipeline

  • Structural variant files

Unfiltered structural variants detected by the MAGMA pipeline
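
A simple way to compare the raw and filtered outputs listed above is to count the variant records in each file. The helper below is a minimal sketch using only the standard library; the function name is hypothetical, and it assumes the files are plain-text or gzip-compressed VCFs, in which header lines start with `#`.

```python
import gzip


def count_vcf_records(vcf_path):
    """Count data lines (those not starting with '#') in a VCF or VCF.gz file."""
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    with opener(vcf_path, "rt") as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )
```

For anything beyond counting, a dedicated VCF parser is a better choice than line-based reading.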

Libraries Directory

Contains files related to FASTQ validation and FASTQC analysis

Samples Directory

Contains vcf files for major|minor|structural variants for each individual sample