Creating custom barcodes
Follow these steps to generate lineage‑specific barcodes with BarcodeForge.
Install dependencies
Conda / Mamba – see <https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html>.
BarcodeForge – install from Bioconda:
conda install -c bioconda barcodeforge
Prepare input files
Reference genome – FASTA file of the reference sequence.
Multiple‑sequence alignment – FASTA of all sequences to be barcoded.
Phylogenetic tree – Newick or Nexus file containing every lineage to be barcoded.
Lineage table – TSV mapping each sequence ID to its lineage (lineage<TAB>sequence_id).
Barcode prefix (optional) – string to prepend to each barcode (e.g. RSVa for RSV‑A).
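Before running the pipeline, it can help to confirm that the lineage table and the alignment agree on sequence IDs. The sketch below is not part of BarcodeForge: missing_ids is a hypothetical helper, and it assumes the TSV has no header row, that the sequence ID sits in the second column (as in the lineage<TAB>sequence_id layout above), and that a FASTA ID is the first whitespace-delimited token after ">".

```shell
# Hypothetical helper (not part of BarcodeForge): print every sequence ID
# from the lineage table that has no matching record in the alignment.
# Assumptions: no header row in the TSV; sequence ID is column 2;
# FASTA ID is the first whitespace-delimited token after ">".
missing_ids() {
    lineages_tsv="$1"
    alignment_fasta="$2"
    awk -F'\t' '
        NR == FNR {                        # first file: the alignment
            if (substr($0, 1, 1) == ">") {
                id = substr($0, 2)
                sub(/[ \t].*/, "", id)     # keep the ID, drop any description
                seen[id] = 1
            }
            next
        }
        $2 != "" && !($2 in seen) { print $2 }   # second file: lineage table
    ' "$alignment_fasta" "$lineages_tsv"
}
```

For example, missing_ids data/lineages.tsv data/aligned.fasta prints nothing when every listed ID is present in the alignment. If your lineage table does have a header row, its second field will be reported as missing and can be ignored.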
Run BarcodeForge
Basic syntax:
barcodeforge barcode REFERENCE_GENOME ALIGNMENT TREE LINEAGES [OPTIONS]
Common options:
--tree_format {newick,nexus} – tree file format (default: newick)
--usher-args "<args>" – extra flags passed to usher
--threads N – number of CPU cores (default: 1)
--matutils-overlap FLOAT – value for matUtils annotate --set-overlap (default: 0)
--prefix TEXT – prefix prepended to lineage names (default: empty)
Note
Use --help to see every available flag.
Retrieve the output
The pipeline writes results to the current directory:
barcodes.csv – barcode definitions for each lineage
barcodes.html – the same barcodes in an interactive HTML format
Note
The barcodeforge_workdir folder contains intermediate files generated by the pipeline.
Worked example: RSV‑A
The following example shows how to generate barcodes for the RSV-A lineage tree:
Download demo data:
mkdir data
wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/tree.nwk -O data/tree.nwk
wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/aligned.fasta -O data/aligned.fasta
wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/reference.fasta -O data/reference.fasta
wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/lineages.tsv -O data/lineages.tsv
Generate barcodes:
barcodeforge barcode data/reference.fasta data/aligned.fasta data/tree.nwk data/lineages.tsv --tree_format newick --threads 4 --prefix RSVa
View the results
The barcodes are saved in the current directory:
RSVa-barcodes.csv – barcode definitions for each lineage
RSVa-barcodes.html – the same barcodes in an interactive HTML format
barcodeforge_workdir – intermediate files generated by the pipeline
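A quick way to confirm that the run finished is to check that both result files exist and are non-empty. The sketch below is an illustration only: check_outputs is a hypothetical helper, and the PREFIX-barcodes naming is inferred from the file names listed in this guide rather than from the tool's documentation.

```shell
# Hypothetical helper (not part of BarcodeForge): verify that the result
# files exist in the current directory and are non-empty.
# Usage: check_outputs [PREFIX]    e.g. check_outputs RSVa
# Assumption: outputs are named barcodes.* or PREFIX-barcodes.*.
check_outputs() {
    prefix="$1"
    stem="barcodes"
    if [ -n "$prefix" ]; then
        stem="${prefix}-barcodes"
    fi
    status=0
    for f in "${stem}.csv" "${stem}.html"; do
        if [ -s "$f" ]; then
            echo "ok: $f"
        else
            echo "missing or empty: $f"
            status=1
        fi
    done
    return "$status"
}
```

After the RSV‑A run, check_outputs RSVa should report both files as ok. A non-zero exit status means at least one file is absent or empty, in which case the intermediate files in barcodeforge_workdir are the first place to look.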