freyja demix
Generate relative lineage abundances from VARIANTS and DEPTHS
freyja demix [OPTIONS] VARIANTS DEPTHS
Options
- --eps <eps>
minimum abundance to include for each lineage
- Default:
0.001
- --barcodes <barcodes>
Path to custom barcode file
- --meta <meta>
custom lineage to variant metadata file
- --output <output>
Output file
- Default:
demixing_result.tsv
- --covcut <covcut>
calculate percent of sites with n or greater reads
- Default:
10
- --confirmedonly
exclude unconfirmed lineages
- Default:
False
- --version
- Default:
False
- --depthcutoff <depthcutoff>
exclude sites with coverage depth below this value andgroup identical barcodes
- Default:
0
- --lineageyml <lineageyml>
lineage hierarchy file in a yaml format
- --adapt <adapt>
adaptive lasso penalty parameter
- Default:
0.0
- --a_eps <a_eps>
adaptive lasso parameter, hard threshold
- Default:
1e-08
- --region_of_interest <region_of_interest>
JSON file containing region(s) of interest for which to compute additional coverage estimates
- --relaxedmrca
for use with depth cutoff,clusters are assigned robust mrca to handle outliers
- Default:
False
- --relaxedthresh <relaxedthresh>
associated threshold for robust mrca function
- Default:
0.9
- --solver <solver>
solver used for estimating lineage prevalence
- Default:
CLARABEL
- --pathogen <pathogen>
Pathogen of interest.Not used if using –barcodes option.
- Default:
SARS-CoV-2
- Options:
SARS-CoV-2 | MPXV | H5NX | H1N1pdm | FLU-B-VIC | MEASLESN450 | MEASLES | RSVa | RSVb
- --autoadapt
use error profile to set adapt
- Default:
False
Arguments
- VARIANTS
Required argument
- DEPTHS
Required argument
Example Usage:
After running freyja variants
we can run:
freyja demix [variants-file] [depth-file] --output [output-file]
This outputs to a tsv file that includes the lineages present, their
corresponding abundances, and summarization by constellation. This
method also includes a --eps
option, which enables the user to
define the minimum lineage abundance returned to the user
(e.g. --eps 0.0001
). A custom barcode file can be provided using the
--barcodes [path-to-barcode-file]
option. By default, freyja uses
the lineage hierarchy file located infreyja/data
directory which
is updated everytime the freyja update
command is run. The user,
however, can define a custom lineage hierarchy file
using--lineageyml [path-to-lineage-file]
. Users can get the
historic lineage.yml
file at freyja-data GitHub repository
here.
As the UShER tree now included proposed lineages, we now offer the
--confirmedonly
flag which removes unconfirmed lineages from the
analysis. For additional flexibility and reproducibility of analyses, a
custom lineage-to-constellation mapping metadata file can be provided
using the --meta
option. A coverage depth minimum can be specified
using the --depthcutoff
option, which excludes sites with coverage
less than the specified value. An example output should have the format
filename |
|
---|---|
summarized |
[(‘Delta’, 0.65), (‘Other’, 0.25), (‘Alpha’, 0.1)] |
lineages |
[‘B.1.617.2’ ‘B.1.2’ ‘AY.6’ ‘Q.3’] |
abundances |
“[0.5 0.25 0.15 0.1]” |
resid |
3.14159 |
coverage |
95.8 |
Where summarized
denotes a sum of all lineage abundances in a particular WHO designation (i.e. B.1.617.2 and AY.6 abundances are summed in the above example), otherwise they are grouped into “Other”. The lineage
array lists the identified lineages in descending order, and abundances
contains the corresponding abundances estimates. Using the --depthcutoff
option may result in some distinct lineages now having identical barcodes, which are grouped into the format [lineage]-like(num)
(based on their shared phylogeny) in the output. A summary of this lineage grouping is outputted to [output-file]_collapsed_lineages.yml
. The value of resid
corresponds to the residual of the weighted least absolute deviation problem used to estimate lineage abundances. The coverage
value provides the 10x coverage estimate (percent of sites with 10 or greater reads- 10 is the default but can be modfied using the --covcut
option in demix
). If there is an solver error during the demix step (generally associated with poor data quality), an error message will be returned, along with an output empty summarized, lineages, and abundances, and with resid = -1.
NOTE: The freyja variants
output is stable in time, and does not need to be re-run to incorporate updated lineage designations/corresponding mutational barcodes, whereas the outputs of freyja demix
will change as barcodes are updated (and thus demix
should be re-run as new information is made available).