Tools & Pipelines

Comprehensive suite of bioinformatics tools and pipelines for STR analysis, from raw data processing to population genetics.


Analysis Tools

STRidER
Curated online STR allele-frequency population database providing high-quality genotype probability estimates and autosomal STR quality control.
Multi-platform
QC / Database
Online tool

Key Features:

  • Curated autosomal STR allele frequency database
  • Centralized quality control for population datasets
  • Reliable genotype probability estimation for forensic analysis
Input: STR datasets
Output: QC reports / frequency data
STRNaming
Unbiased method to automatically generate short, informative, and human-readable descriptions of STR alleles.
Multi-platform
Annotation
Online tool

Key Features:

  • Automated generation of standardized STR allele names
  • Sequence-based allele description across loci
  • Human-readable nomenclature for forensic sequencing
Input: STR sequence
Output: Allele nomenclature
FDSTools
Python package for analysis of forensic NGS data: characterisation and filtering of PCR stutter and sequencing noise, and automatic allele detection. Integrates STRNaming for nomenclature.
Illumina
Short-read
Genotyping
Runs locally

Key Features:

  • Stutter and PCR/sequencing noise characterisation and correction
  • Automatic allele detection in targeted MPS data
  • STRNaming-based nomenclature; supports built-in and custom kits
Input: NGS data (e.g. BAM, FASTQ from targeted MPS)
Output: Allele calls, stutter-filtered data, STRNaming nomenclature
HipSTR
STR genotyping from aligned Illumina short-read data (BAM/CRAM) with VCF output.
Illumina
Short-read
Genotyping
Runs locally

Key Features:

  • Haplotype-based STR genotyping with stutter modeling
  • Local realignment of reads around STR loci
  • Joint multi-sample genotyping for population analysis
Input: BAM/CRAM (aligned short reads)
Output: VCF
LongTR
Tandem repeat genotyping from long reads (PacBio HiFi and Oxford Nanopore), inspired by the HipSTR framework and adapted for long-read sequencing data. Outputs bgzipped VCF files.
PacBio / ONT
Long-read
Genotyping
Runs locally

Key Features:

  • Genotypes tandem repeats (STRs and VNTRs) from long-read BAM/CRAM using a TR regions BED file
  • Workflow options for PacBio HiFi and Oxford Nanopore data
  • Supports phased BAM inputs
Input: BAM/CRAM (sorted & indexed), reference FASTA, TR regions BED
Output: bgzipped VCF
STRspy
ONT-based STR genotyping toolkit with tabular output.
ONT
Long-read
Genotyping
Runs locally

Key Features:

  • STR allele calling from Nanopore long-read sequencing
  • Sequence-level allele resolution using reference databases
  • Designed for forensic STR profiling
Input: FASTQ, BAM (ONT)
Output: TXT/TSV tables
GangSTR
Genome-wide STR genotyping from aligned short-read data with VCF output.
Illumina
Short-read
Genotyping
Runs locally

Key Features:

  • Genome-wide STR genotyping from short-read sequencing
  • Detection of repeat expansions and contractions
  • Statistical modeling of STR length distributions
Input: BAM (short-read aligned)
Output: VCF
STRait Razor
Motif-based STR analysis from Illumina FASTQ, available as a command-line tool and as an online version.
Illumina
Short-read
Genotyping
Runs locally
Online tool
Graphical interface

Key Features:

  • Motif-based STR allele detection from FASTQ reads
  • Optimized for forensic STR marker panels
  • Lightweight CLI with optional web interface
Input: FASTQ (Illumina)
Output: Tabular (TSV/CSV/TXT)
toaSTR
Reference-free STR allele analysis tool using sequence-to-structure representations.
Illumina
Short-read
Genotyping
Runs locally

Key Features:

  • Reference-free STR allele inference
  • Sequence-to-structure embedding representations
  • Designed for general STR analysis from sequencing reads
Input: FASTA (repeat definitions), FASTQ (reads)
Output: Tabular allele calls and STR structural summary
NanoMnT
ONT-based STR genotyping from aligned long-read data with locus-level reporting.
ONT
Long-read
Genotyping
Runs locally

Key Features:

  • STR genotyping from Nanopore long-read alignments
  • Locus-level allele and coverage reporting
  • Optimized for noisy long-read sequencing data
Input: BAM (aligned long reads)
Output: Tabular (TSV/CSV/TXT)
STRkit
Long-read STR genotyping toolkit with model-based allele inference.
ONT
Long-read
Genotyping
Runs locally

Key Features:

  • Model-based STR allele length estimation
  • Confidence intervals via statistical bootstrapping
  • Optional phasing with nearby SNVs
Input: FASTQ, BAM (long-read)
Output: Tabular (TSV/CSV/TXT)
NASTRA
Reference-free STR analysis for forensic markers using structural modeling.
ONT
Long-read
Genotyping
Runs locally

Key Features:

  • Structure-aware STR allele calling
  • Reference-free STR detection approach
  • Designed for forensic STR markers
Input: FASTQ, BAM (long-read)
Output: Tabular (TSV/CSV/TXT)
NanoSTR
Targeted STR typing from Nanopore long-read data.
ONT
Long-read
Genotyping
Runs locally

Key Features:

  • Targeted STR genotyping from Nanopore reads
  • Read-length ranking for allele inference
  • Fast processing for targeted STR panels
Input: FASTQ (long-read)
Output: Tabular (TSV/CSV/TXT)

Essential Bioinformatics Commands

Essential Read Processing Commands

For cleaning, filtering, and preparing FASTQ reads before genotyping.

Key Features:
  • Trim adapters and low-quality bases
  • Filter out too-short or poor-quality reads
  • Prepare clean FASTQ files for alignment

Trimmomatic

trimmomatic PE sample_R1.fastq sample_R2.fastq \
  output_R1_paired.fastq output_R1_unpaired.fastq \
  output_R2_paired.fastq output_R2_unpaired.fastq \
  ILLUMINACLIP:adapters.fa:2:30:10 SLIDINGWINDOW:4:20 MINLEN:50

fastp

fastp -i sample_R1.fastq -I sample_R2.fastq \
      -o clean_R1.fastq -O clean_R2.fastq \
      --detect_adapter_for_pe --html report.html
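When several samples need the same cleaning step, the fastp call above can be generated per sample instead of typed by hand. The sketch below is a dry run that only prints commands. It assumes a hypothetical naming scheme of `<sample>_R1.fastq` / `<sample>_R2.fastq` with no spaces in file names, and the function name `fastp_cmd` and the output names are illustrative.

```shell
# Build the fastp command for one sample from its R1 path.
# Assumption: mates are named <sample>_R1.fastq and <sample>_R2.fastq.
fastp_cmd() {
  r1=$1
  sample=${r1%_R1.fastq}   # strip the R1 suffix to get the sample prefix
  printf 'fastp -i %s -I %s -o %s -O %s --detect_adapter_for_pe --html %s\n' \
    "$r1" "${sample}_R2.fastq" \
    "${sample}.clean_R1.fastq" "${sample}.clean_R2.fastq" \
    "${sample}.fastp.html"
}

# Dry run: print one command per R1 file found in the current directory.
for r1 in *_R1.fastq; do
  [ -e "$r1" ] || continue   # skip when the glob matched nothing
  fastp_cmd "$r1"
done
```

Review the printed commands first; once they look right, the same loop output can be piped into `sh` to run them.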

Alignment & BAM Processing Essentials

For aligning reads and generating ready-to-analyze BAM files.

Key Features:
  • High-quality alignment
  • Sorting and indexing
  • BAM cleanup operations

BWA-MEM2 alignment

bwa-mem2 index reference.fasta   # one-time index build for the reference
bwa-mem2 mem reference.fasta clean_R1.fastq clean_R2.fastq > sample.sam

Convert / sort / index

samtools sort -o sample.sorted.bam sample.sam
samtools index sample.sorted.bam

Remove duplicates

samtools collate -O -u sample.sorted.bam | samtools fixmate -m -u - - \
  | samtools sort -u - | samtools markdup -r - sample.rmdup.bam

Inspecting STR Regions & Coverage

For exploring coverage, flanking regions, and STR quality signals.

Key Features:
  • Visualize STR flanking regions
  • Inspect soft-clips and misalignments
  • Evaluate STR coverage depth

Depth coverage

samtools depth -r chr12:100000-100300 sample.sorted.bam > depth.txt
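The per-position table written by `samtools depth` (chromosome, position, depth, tab-separated) is easier to judge once reduced to a few numbers. The following is a minimal sketch using awk; the function name `depth_summary` is illustrative. Note that `samtools depth` omits zero-coverage positions unless `-a` is given, so without `-a` the mean covers reported positions only.

```shell
# Summarise a samtools depth table: number of positions, mean and minimum depth.
# Input columns (tab-separated): chromosome, position, depth.
depth_summary() {
  awk -F '\t' '
    NR == 1 { min = $3 + 0 }                      # seed the minimum with the first row
    { d = $3 + 0; sum += d; if (d < min) min = d }
    END {
      if (NR > 0) printf "positions=%d mean=%.1f min=%d\n", NR, sum / NR, min
      else print "positions=0"
    }
  ' "$1"
}

# Usage: depth_summary depth.txt
```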

Region inspection

samtools view sample.sorted.bam chr12:100000-100300
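To put a number on soft-clipping in a region, the SAM records printed by `samtools view` can be piped through a small awk filter that inspects the CIGAR string (column 6). This is a rough sketch, and the function name `count_softclips` is illustrative. It counts alignment records rather than unique reads, so secondary and supplementary alignments are included unless they are filtered out beforehand.

```shell
# Count alignment records and how many of them carry a soft-clip (S) in the CIGAR.
# Reads SAM text on standard input; header lines (starting with @) are skipped.
count_softclips() {
  awk -F '\t' '
    /^@/ { next }
    { total++; if ($6 ~ /[0-9]+S/) clipped++ }
    END { printf "reads=%d softclipped=%d\n", total, clipped }
  '
}

# Usage: samtools view sample.sorted.bam chr12:100000-100300 | count_softclips
```

A high soft-clipped share around an STR locus often points at reads that do not span the repeat or at misalignment in the flanks, which is worth checking in the viewer.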

Quick visualization

samtools tview -p chr12:100000 sample.sorted.bam reference.fasta

Nanopore (ONT) Essentials

Minimal pipeline from raw ONT signals to aligned reads.

Key Features:
  • Basecall POD5 → reads (unaligned BAM)
  • Align, sort & index BAM (minimap2 + samtools)
  • QC metrics with NanoPlot

Basecalling (POD5 → BAM)

dorado basecaller sup pod5/ > reads.bam

Alignment to hg38

samtools fastq reads.bam | minimap2 -ax map-ont hg38.fa - | samtools sort -o aln.bam - && samtools index aln.bam

QC with NanoPlot

NanoPlot --bam aln.bam --outdir nanoplot_out/
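For a quick sanity check without opening the NanoPlot report, read count, total bases, and read-length N50 can be computed directly from FASTQ text. The sketch below assumes standard four-line FASTQ records, and the function name `read_length_stats` is illustrative. It is a lightweight cross-check, not a replacement for the NanoPlot metrics.

```shell
# Report read count, total bases, and read-length N50 from FASTQ on standard input.
# Assumption: each record occupies exactly four lines (no wrapped sequences).
read_length_stats() {
  awk 'NR % 4 == 2 { print length($0) }' | sort -n -r | awk '
    { len[NR] = $1; total += $1 }
    END {
      if (NR == 0) { print "reads=0"; exit }
      half = total / 2
      # Walk from the longest read down until half of all bases are covered.
      for (i = 1; i <= NR; i++) {
        run += len[i]
        if (run >= half) { n50 = len[i]; break }
      }
      printf "reads=%d bases=%.0f n50=%d\n", NR, total, n50
    }'
}

# Usage: samtools fastq reads.bam | read_length_stats
```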

Installation Requirements

The tools shown above do not come pre-installed. To run these commands, you need to install the corresponding bioinformatics utilities according to your operating system.

Linux (Ubuntu/Debian)

sudo apt update && sudo apt install samtools bcftools minimap2 trimmomatic fastp

macOS (Homebrew)

brew install samtools bcftools minimap2 fastp trimmomatic

Windows (WSL2 recommended)

Most of these tools have no native Windows builds. Use WSL2 (Ubuntu) or a Linux container, then install as on Linux:

sudo apt update && sudo apt install samtools bcftools minimap2 trimmomatic fastp

Long-read tools may require Python ≥ 3.8 and sufficient disk space for basecalling models.

Dorado installation depends on your platform and GPU availability; obtain precompiled binaries from Oxford Nanopore releases.

Nanopore utilities (POD5 tools, NanoPlot, pycoQC)

pip install pod5 nanoplot pycoqc
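After installing, it helps to confirm that each executable is actually on the `PATH` before starting a pipeline. The following is a minimal sketch; the function name `missing_tools` is illustrative, and the tool list is only an example drawn from the commands on this page. Trimmomatic is left out because its executable name can differ between packagings.

```shell
# Print the name of every listed tool that cannot be found on the PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example check; prints nothing when everything is installed.
missing_tools samtools bcftools minimap2 fastp bwa-mem2 dorado NanoPlot
```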

A full step-by-step installation guide for each OS will be added soon.

Interactive Tutorials