Creating custom barcodes
Follow these steps to generate lineage‑specific barcodes with BarcodeForge.
- Install dependencies
  - Conda / Mamba – see <https://mamba.readthedocs.io/en/latest/installation/mamba-installation.html>.
  - BarcodeForge – install from Bioconda:
    - conda install -c bioconda barcodeforge
- Prepare input files
  - Reference genome – FASTA file of the reference sequence.
  - Multiple‑sequence alignment – FASTA of all sequences to be barcoded.
  - Phylogenetic tree – Newick or Nexus file containing every lineage to be barcoded.
  - Lineage table – TSV mapping each sequence ID to its lineage (lineage<TAB>sequence_id).
  - Barcode prefix (optional) – string to prepend to each barcode (e.g. RSVa for RSV‑A).
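Before running the pipeline, it can help to confirm that the lineage table and the alignment agree. The sketch below is not part of BarcodeForge: it builds two tiny, hypothetical demo files in a temporary directory, counts rows that do not have exactly two tab-separated fields, and lists sequence IDs that appear in the table but not in the FASTA headers. It assumes the table has no header row (a header line would be reported as a missing ID) and that the sequence ID is the header text up to the first whitespace.

```shell
set -eu

# Hypothetical demo files; substitute your own lineage table and alignment.
workdir="$(mktemp -d)"
printf 'A.1\tseq1\nA.1\tseq2\nA.2\tseq3\nA.2\tseq4\n' > "$workdir/lineages.tsv"
printf '>seq1\nACGT\n>seq2\nACGA\n>seq3\nACGC\n' > "$workdir/aligned.fasta"

# Count rows that do not have exactly two non-empty, tab-separated fields.
bad_rows="$(awk -F'\t' 'NF != 2 || $1 == "" || $2 == "" { n++ } END { print n + 0 }' "$workdir/lineages.tsv")"

# Sequence IDs in the alignment: header text after ">" up to the first whitespace.
grep '^>' "$workdir/aligned.fasta" | sed 's/^>//' | awk '{ print $1 }' | sort -u > "$workdir/fasta_ids.txt"

# Sequence IDs in the lineage table (second column).
cut -f2 "$workdir/lineages.tsv" | sort -u > "$workdir/table_ids.txt"

# IDs present in the table but absent from the alignment.
missing_ids="$(comm -23 "$workdir/table_ids.txt" "$workdir/fasta_ids.txt")"

echo "malformed rows: $bad_rows"
echo "IDs missing from alignment: ${missing_ids:-none}"
```

In the demo data, seq4 is listed in the table but has no sequence in the alignment, so it is the one ID reported.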
 
- Run BarcodeForge
  - Basic syntax:
    - barcodeforge barcode REFERENCE_GENOME ALIGNMENT TREE LINEAGES [OPTIONS]
  - Common options:
    - --tree_format {newick,nexus} – tree file format (default: newick)
    - --usher-args "<args>" – extra flags passed to usher
    - --threads N – number of CPU cores (default: 1)
    - --matutils-overlap FLOAT – value for matUtils annotate --set-overlap (default: 0)
    - --prefix TEXT – prefix prepended to lineage names (default: empty)
  - Note: use --help to see every available flag.
- Retrieve the output
  - The pipeline writes results to the current directory:
    - barcodes.csv – barcode definitions for each lineage
    - barcodes.html – the same barcodes in an interactive HTML format
  - Note: the barcodeforge_workdir folder contains intermediate files generated by the pipeline.
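Once barcodes.csv exists, a quick summary from the shell can serve as a sanity check. The layout used below is an assumption, not something this guide specifies: one header row of mutation names, then one row per lineage with the lineage name in the first column and 0/1 flags in the remaining columns. The file contents, lineage names, and mutation names are hypothetical stand-ins; check the header of your own file before relying on this.

```shell
set -eu

# Hypothetical stand-in for barcodes.csv (assumed layout: lineage name, then 0/1 per mutation).
workdir="$(mktemp -d)"
printf ',A100G,C250T,G300A\nA.1,1,0,1\nA.2,0,1,1\n' > "$workdir/barcodes.csv"

# Lineages = non-empty rows after the header.
n_lineages="$(awk 'NR > 1 && NF' "$workdir/barcodes.csv" | wc -l | tr -d ' ')"

# Mutation columns = header fields minus the leading lineage column.
n_mutations="$(head -n 1 "$workdir/barcodes.csv" | awk -F',' '{ print NF - 1 }')"

echo "lineages: $n_lineages"
echo "mutation columns: $n_mutations"

# Number of mutations flagged for each lineage (sum of the 0/1 columns in its row).
awk -F',' 'NR > 1 && NF { s = 0; for (i = 2; i <= NF; i++) s += $i; print $1 ": " s }' "$workdir/barcodes.csv"
```

A lineage whose row sums to zero, or two lineages with identical rows, would be worth a closer look in the HTML view.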
Worked example: RSV‑A
The following example shows how to generate barcodes for the RSV-A lineage tree:
- Download demo data:
  - mkdir data
  - wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/tree.nwk -O data/tree.nwk
  - wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/aligned.fasta -O data/aligned.fasta
  - wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/reference.fasta -O data/reference.fasta
  - wget https://raw.githubusercontent.com/andersen-lab/BarcodeForge/refs/heads/main/barcodeforge/assets/lineages.tsv -O data/lineages.tsv
- Generate barcodes:
  - barcodeforge barcode data/reference.fasta data/aligned.fasta data/tree.nwk data/lineages.tsv --tree_format newick --threads 4 --prefix RSVa
- View the results
  - The barcodes are saved in the current directory:
    - RSVa-barcodes.csv – barcode definitions for each lineage
    - RSVa-barcodes.html – the same barcodes in an interactive HTML format
    - barcodeforge_workdir – intermediate files generated by the pipeline
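A small helper can confirm that the demo inputs downloaded correctly and that the expected outputs appeared after the run. This is a sketch rather than part of BarcodeForge; report_missing is a hypothetical function name. It prints every path that is neither a directory nor a non-empty file, which also catches the empty files that wget -O leaves behind after a failed download.

```shell
set -eu

# Print each given path that is neither a directory nor a non-empty file.
report_missing() {
    for path in "$@"; do
        [ -d "$path" ] || [ -s "$path" ] || echo "missing or empty: $path"
    done
}

# Before running: the four demo inputs from the download step.
report_missing data/reference.fasta data/aligned.fasta data/tree.nwk data/lineages.tsv

# After running: the outputs listed above for --prefix RSVa.
report_missing RSVa-barcodes.csv RSVa-barcodes.html barcodeforge_workdir
```

No output means every listed path is present. Run the first call before generating barcodes and the second one afterwards, both from the directory where barcodeforge was invoked.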