freyja covariants
Calls physically linked mutations in INPUT_BAM using coVar (https://github.com/andersen-lab/covar)
freyja covariants [OPTIONS] INPUT_BAM
Options
- --reference <reference>
- Default:
data/NC_045512_Hu-1.fasta
- --annot <annot>
path to gff file corresponding to reference genome
- Default:
data/NC_045512_Hu-1.gff
- --start_site <start_site>
minimum genomic coordinate to consider
- Default:
0
- --end_site <end_site>
maximum genomic coordinate to consider (defaults to full genome)
- --output <output>
- Default:
covariants.tsv
- --min_quality <min_quality>
minimum quality for a base to be considered
- Default:
20
- --min_depth <min_depth>
minimum count for a set of mutations to be saved
- Default:
10
- --threads <threads>
number of threads to use
- Default:
1
Arguments
- INPUT_BAM
Required argument
Example Usage:
In many cases, it can be useful to study covariant mutations
(i.e. mutations co-occurring on the same read pair). The command writes a tsv file that lists, for each
set of covariants, the mutations present, their absolute counts (the number of read pairs with
the mutations), their coverage ranges (the minimum and maximum position
for read pairs with the mutations), their “maximum” counts (the number
of read pairs that span the positions in the mutations), and their
frequencies (the absolute count divided by the maximum count). The --reference argument defaults
to freyja/data/NC_045512_Hu-1.fasta. If you used a different
build to perform alignment, it is important to pass that file to
--reference instead. Additionally, a gff file
(e.g. freyja/data/NC_045512_Hu-1.gff) must be included via the
--annot option to output amino acid mutations alongside
nucleotide mutations. Inclusion thresholds for base quality and
the number of observed instances of a set of covariants can be set using
--min_quality and --min_depth, respectively.
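As a rough illustration of the usage described above, the following Python sketch assembles a freyja covariants command line from the options listed in this section and reproduces the frequency definition (absolute count divided by maximum count). The helper names build_covariants_command and covariant_frequency, and the file names sample.bam, my_reference.fasta and my_reference.gff, are hypothetical and not part of Freyja. The sketch only builds the argument list and does not run Freyja.

```python
import shlex


def build_covariants_command(bam_path, reference=None, annot=None,
                             output="covariants.tsv", min_quality=20,
                             min_depth=10, threads=1):
    """Assemble an argv list for `freyja covariants` (hypothetical helper).

    Only options documented in this section are used. --reference and
    --annot are omitted when not given, so Freyja's defaults apply.
    """
    cmd = ["freyja", "covariants", bam_path]
    if reference is not None:
        cmd += ["--reference", reference]
    if annot is not None:
        cmd += ["--annot", annot]
    cmd += [
        "--output", output,
        "--min_quality", str(min_quality),
        "--min_depth", str(min_depth),
        "--threads", str(threads),
    ]
    return cmd


def covariant_frequency(absolute_count, max_count):
    """Frequency as defined above: absolute count / maximum count."""
    if max_count <= 0:
        raise ValueError("max_count must be positive")
    return absolute_count / max_count


if __name__ == "__main__":
    # File names below are placeholders for illustration only.
    cmd = build_covariants_command(
        "sample.bam",
        reference="my_reference.fasta",
        annot="my_reference.gff",
    )
    print(shlex.join(cmd))
    # 25 read pairs carry the mutations out of 100 spanning the positions.
    print(covariant_frequency(25, 100))
```

If Freyja is installed, the returned list could be passed to subprocess.run(cmd, check=True); passing a list rather than a shell string avoids quoting problems with file paths.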