MAGMA Customizable Parameters
This document provides an overview of the customizable parameters for the MAGMA pipeline. Each parameter is listed with its default value, description.
💡 Hint: you may check a full parameters reference file.
Common Parameters
Input Samplesheet
Parameter | Default Value | Description |
---|---|---|
input_samplesheet |
"samplesheet.magma.csv" |
The input CSV file containing sample information. The study ID cannot start with XBS_REF_ . |
💡 Hint: The samplesheet should include the fields
[Sample, R1, R2]
. Optionally, you can add[study, library, attempt, flowcell, lane, index_sequence]
.
Output Directory
Parameter | Default Value | Description |
---|---|---|
outdir |
"magma-results" |
The directory where all output files will be written. |
vcf_name |
"joint" |
The name of the output folder for results. Used to derive JOINT_NAME . |
💡 Note: The
vcf_name
parameter is critical for naming conventions in downstream processes.
Additional Samples Addition
Parameter | Default Value | Description |
---|---|---|
use_ref_gvcf |
true |
Whether to use a reference GVCF file to include additional samples. |
ref_gvcf |
"${projectDir}/resources/ref_gvcfs/LineagesAndOutgroupV2.g.vcf.gz" |
Path to the reference GVCF file. |
ref_gvcf_tbi |
"${projectDir}/resources/ref_gvcfs/LineagesAndOutgroupV2.g.vcf.gz.tbi" |
Path to the index file for the reference GVCF. |
💡 Hint: Use this feature if your dataset has low genetic diversity (e.g., clonal or fewer than 20 samples).
Quality Control Parameters
Parameter | Default Value | Description |
---|---|---|
cutoff_median_coverage |
10 |
The minimal median coverage required to process the sample. |
cutoff_breadth_of_coverage |
0.90 |
The minimal breadth of coverage required to process the sample. |
cutoff_rel_abundance |
0.70 |
The minimal relative abundance of the majority strain required to process the sample. |
cutoff_ntm_fraction |
0.20 |
The maximum fraction of NTM DNA allowed to process the sample. |
⚠️ Attention: Ensure these values are adjusted based on the quality of your input data to avoid processing errors.
Skipping Pipeline Steps
Parameter | Default Value | Description |
---|---|---|
only_validate_fastqs |
false |
Set to true to only validate input FASTQs and check their FASTQC reports. |
skip_merge_analysis |
false |
Skip the final merge analysis step. |
skip_variant_recalibration |
false |
Skip variant quality score recalibration (VQSR). |
skip_base_recalibration |
true |
Skip base quality score recalibration (BQSR). Not suitable for low-coverage Mtb genomes. |
skip_minor_variants_gatk |
true |
Skip minor variants detection with GATK. LoFreq is recommended for most purposes. |
skip_phylogeny_and_clustering |
false |
Disable downstream phylogenetic analysis of merged GVCF. |
skip_complex_regions |
false |
Disable downstream complex region analysis of merged GVCF. |
skip_ntmprofiler |
false |
Disable execution of ntmprofiler on FASTQ files. |
skip_tbprofiler_fastq |
true |
Disable tbprofiler analysis on FASTQ files. |
skip_spotyping |
false |
Disable spoligotyping analysis. |
💡 Hint: Use these flags to customize the pipeline execution based on your specific requirements.
Reference Files
Parameter | Default Value | Description |
---|---|---|
ref_fasta_basename |
"NC-000962-3-H37Rv" |
Basename of the reference FASTA file. |
ref_fasta_dir |
"${projectDir}/resources/genome" |
Directory containing the reference FASTA file. |
ref_fasta |
"${params.ref_fasta_dir}/${params.ref_fasta_basename}.fa" |
Full path to the reference FASTA file. |
ref_fasta_dict |
"${params.ref_fasta_dir}/${params.ref_fasta_basename}.dict" |
Path to the reference FASTA dictionary file. |
ref_fasta_gb |
"${params.ref_fasta_dir}/${params.ref_fasta_basename}.gb" |
Path to the reference GenBank file. |
⚠️ Warning: It is recommended to use the provided reference files to ensure compatibility with the pipeline.