Tools & Pipelines

A curated collection of bioinformatics tools and pipelines for STR analysis, from raw data processing to population genetics.


Analysis Tools

STRidER

Curated online STR allele-frequency population database providing high-quality genotype probability estimates and autosomal STR quality control.

Multi-platform

QC / Database

Online tool

Key Features:

• Curated autosomal STR allele frequency database
• Centralized quality control for population datasets
• Reliable genotype probability estimation for forensic analysis

Input: STR datasets

Output: QC reports / frequency data

Website

STRNaming

Unbiased method to automatically generate short, informative, and human-readable descriptions of STR alleles.

Multi-platform

Annotation

Online tool

Key Features:

• Automated generation of standardized STR allele names
• Sequence-based allele description across loci
• Human-readable nomenclature for forensic sequencing

Input: STR sequence

Output: Allele nomenclature

Website

FDSTools

Python package for analysis of forensic NGS data: characterisation and filtering of PCR stutter and sequencing noise, and automatic allele detection. Integrates STRNaming for nomenclature.

Illumina

Short-read

Genotyping

Runs locally

Key Features:

• Stutter and PCR/sequencing noise characterisation and correction
• Automatic allele detection in targeted MPS data
• STRNaming-based nomenclature; supports built-in and custom kits

Input: NGS data (e.g. BAM, FASTQ from targeted MPS)

Output: Allele calls, stutter-filtered data, STRNaming nomenclature

GitHub Website Original publication

HipSTR

STR genotyping from aligned Illumina short-read data (BAM/CRAM) with VCF output.

Illumina

Short-read

Genotyping

Runs locally

Key Features:

• Haplotype-based STR genotyping with stutter modeling
• Local realignment of reads around STR loci
• Joint multi-sample genotyping for population analysis

Input: BAM/CRAM (aligned short reads)

Output: VCF

GitHub Original publication User interface publication
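A minimal sketch of a joint HipSTR run over several BAMs. The `--bams`, `--fasta`, `--regions` and `--str-vcf` options are the inputs and output HipSTR documents; every file name here is a placeholder, `join_bams` is a hypothetical helper, and the genotyping step only runs if HipSTR and the reference are actually present.

```shell
# Join BAM paths with commas, the list format --bams expects.
# (join_bams is a hypothetical helper, not part of HipSTR.)
join_bams() (
  IFS=,
  printf '%s\n' "$*"
)

# Placeholder inputs: sorted, indexed BAMs aligned to the same reference.
bams=$(join_bams sample1.sorted.bam sample2.sorted.bam)
printf 'BAM list: %s\n' "$bams"

# Run only when HipSTR and the reference are actually available.
if command -v HipSTR >/dev/null 2>&1 && [ -f reference.fasta ]; then
  HipSTR --bams "$bams" --fasta reference.fasta \
         --regions str_regions.bed --str-vcf str_calls.vcf.gz
fi
```

Passing all samples in one call is what enables the joint multi-sample genotyping listed above.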

LongTR

Tandem repeat genotyping from long reads (PacBio HiFi and Oxford Nanopore), inspired by the HipSTR framework and adapted for long-read sequencing data. Outputs bgzipped VCF files.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• Genotypes tandem repeats (STRs and VNTRs) from long-read BAM/CRAM using a TR regions BED file
• Workflow options for PacBio HiFi and Oxford Nanopore data
• Supports phased BAM inputs

Input: BAM/CRAM (sorted & indexed), reference FASTA, TR regions BED

Output: bgzipped VCF

GitHub Original publication

STRspy

ONT-based STR genotyping toolkit with tabular output.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• STR allele calling from Nanopore long-read sequencing
• Sequence-level allele resolution using reference databases
• Designed for forensic STR profiling

Input: FASTQ, BAM (ONT)

Output: TXT/TSV tables

GitHub Original publication

GangSTR

Genome-wide STR genotyping from aligned short-read data with VCF output.

Illumina

Short-read

Genotyping

Runs locally

Key Features:

• Genome-wide STR genotyping from short-read sequencing
• Detection of repeat expansions and contractions
• Statistical modeling of STR length distributions

Input: BAM (short-read aligned)

Output: VCF

GitHub Original publication
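A hedged sketch of preparing a GangSTR run. GangSTR takes a tab-separated regions file (chrom, start, end, motif length, motif) plus `--bam`, `--ref`, `--regions` and `--out`. The `check_regions` helper is hypothetical, the example locus is made up, and the genotyping step only runs if GangSTR and the BAM are present.

```shell
# Validate a GangSTR-style regions file (tab-separated: chrom, start, end,
# motif length, motif). Prints each problem; returns non-zero if any is found.
check_regions() {
  awk -F '\t' '
    NF < 5               { printf "line %d: expected 5 columns\n", NR; bad = 1; next }
    $3 + 0 <= $2 + 0     { printf "line %d: end must be greater than start\n", NR; bad = 1 }
    $4 + 0 != length($5) { printf "line %d: motif length %s does not match %s\n", NR, $4, $5; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}

# Made-up example locus; real coordinates come from an STR reference catalogue.
regions=$(mktemp)
printf 'chr12\t100100\t100139\t4\tAGAT\n' > "$regions"

if check_regions "$regions"; then
  echo "regions OK"
  # Run only when GangSTR and the inputs are actually available.
  if command -v GangSTR >/dev/null 2>&1 && [ -f sample.sorted.bam ]; then
    GangSTR --bam sample.sorted.bam --ref reference.fasta \
            --regions "$regions" --out sample_gangstr
  fi
fi
```

Checking the regions file first is a cheap way to catch swapped columns before a genome-wide run.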

STRait Razor

Motif-based STR analysis from Illumina FASTQ with CLI and online version.

Illumina

Short-read

Genotyping

Runs locally

Online tool

Graphical interface

Key Features:

• Motif-based STR allele detection from FASTQ reads
• Optimized for forensic STR marker panels
• Lightweight CLI with optional web interface

Input: FASTQ (Illumina)

Output: Tabular (TSV/CSV/TXT)

GitHub Online Version Original publication

toaSTR

Reference-free STR allele analysis tool using sequence-to-structure representations.

Illumina

Short-read

Genotyping

Runs locally

Key Features:

• Reference-free STR allele inference
• Sequence-to-structure embedding representations
• Designed for general STR analysis from sequencing reads

Input: FASTA (repeat definitions), FASTQ (reads)

Output: Tabular allele calls and STR structural summary

GitHub Original publication

NanoMnT

ONT-based STR genotyping from aligned long-read data with locus-level reporting.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• STR genotyping from Nanopore long-read alignments
• Locus-level allele and coverage reporting
• Optimized for noisy long-read sequencing data

Input: BAM (aligned long reads)

Output: Tabular (TSV/CSV/TXT)

GitHub Original publication

STRkit

Long-read STR genotyping toolkit with model-based allele inference.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• Model-based STR allele length estimation
• Confidence intervals via statistical bootstrapping
• Optional phasing with nearby SNVs

Input: FASTQ, BAM (long-read)

Output: Tabular (TSV/CSV/TXT)

GitHub Original publication

NASTRA

Reference-free STR analysis for forensic markers using structural modeling.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• Structure-aware STR allele calling
• Reference-free STR detection approach
• Designed for forensic STR markers

Input: FASTQ, BAM (long-read)

Output: Tabular (TSV/CSV/TXT)

GitHub Original publication

NanoSTR

Targeted STR typing from Nanopore long-read data.

ONT

Long-read

Genotyping

Runs locally

Key Features:

• Targeted STR genotyping from Nanopore reads
• Read-length ranking for allele inference
• Fast processing for targeted STR panels

Input: FASTQ (long-read)

Output: Tabular (TSV/CSV/TXT)

GitHub Original publication

Essential Bioinformatics Commands

Essential Read Processing Commands

For cleaning, filtering, and preparing FASTQ reads before genotyping.

Key Features:

• Trim adapters and low-quality bases
• Filter out too-short or poor-quality reads
• Prepare clean FASTQ files for alignment

Trimmomatic

trimmomatic PE sample_R1.fastq sample_R2.fastq \
  output_R1_paired.fastq output_R1_unpaired.fastq \
  output_R2_paired.fastq output_R2_unpaired.fastq \
  ILLUMINACLIP:adapters.fa:2:30:10 SLIDINGWINDOW:4:20 MINLEN:50

fastp

fastp -i sample_R1.fastq -I sample_R2.fastq \
      -o clean_R1.fastq -O clean_R2.fastq \
      --detect_adapter_for_pe --html report.html
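To confirm that trimming did what was intended, a quick summary of the cleaned FASTQ can help. The sketch below is a hypothetical helper (`fastq_stats`), not part of Trimmomatic or fastp; it assumes plain, uncompressed FASTQ with four lines per record, and uses 50 as the default length threshold to mirror `MINLEN:50` above.

```shell
# Summarise a FASTQ file: read count, mean read length, and the number of
# reads at or above a minimum length (second argument, default 50).
# Assumes uncompressed FASTQ with exactly four lines per record.
fastq_stats() {
  min_len=${2:-50}
  awk -v min="$min_len" '
    NR % 4 == 2 { n++; total += length($0); if (length($0) >= min) kept++ }
    END {
      if (n) { printf "reads=%d mean_len=%.1f ge_min=%d\n", n, total / n, kept }
      else   { print "reads=0" }
    }
  ' "$1"
}

# Example call on the fastp output named above, if it exists.
if [ -f clean_R1.fastq ]; then
  fastq_stats clean_R1.fastq
fi
```

For gzipped input, decompress to a temporary file first or adapt the helper to read from a pipe.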

Alignment & BAM Processing Essentials

For aligning reads and generating ready-to-analyze BAM files.

Key Features:

• High-quality alignment
• Sorting and indexing
• BAM cleanup operations

BWA-MEM2 alignment

bwa-mem2 index reference.fasta   # one-time step; builds the index that mem requires
bwa-mem2 mem reference.fasta sample_R1.fastq sample_R2.fastq > sample.sam
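Some genotypers, HipSTR among them, read sample names from read-group (`@RG`) tags by default, so it is often worth adding one at alignment time. The sketch below uses the aligner's `-R` option; `read_group` is a hypothetical helper and the ID, sample and library values are placeholders.

```shell
# Build an @RG header line for the aligner's -R option. The "\t" sequences are
# left literal on purpose; the aligner expands them into tabs.
# (read_group is a hypothetical helper; ID/SM/LB values are placeholders.)
read_group() {
  printf '@RG\\tID:%s\\tSM:%s\\tLB:%s\\tPL:ILLUMINA\n' "$1" "$1" "${2:-lib1}"
}

rg=$(read_group sample)
printf '%s\n' "$rg"

# Align with the read group attached, only if the tool and reference exist.
if command -v bwa-mem2 >/dev/null 2>&1 && [ -f reference.fasta ]; then
  bwa-mem2 mem -R "$rg" reference.fasta sample_R1.fastq sample_R2.fastq > sample.sam
fi
```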

Convert / sort / index

samtools sort -o sample.sorted.bam sample.sam
samtools index sample.sorted.bam

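Remove duplicates (whole-genome data; skip for amplicon-based STR panels)

samtools sort -n sample.sorted.bam | samtools fixmate -m - - | \
  samtools sort - | samtools markdup -r - sample.rmdup.bam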

Inspecting STR Regions & Coverage

For exploring coverage, flanking regions, and STR quality signals.

Key Features:

• Visualize STR flanking regions
• Inspect soft-clips and misalignments
• Evaluate STR coverage depth

Depth coverage

samtools depth -a -r chr12:100000-100300 sample.sorted.bam > depth.txt
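The per-base output of `samtools depth` (chrom, position, depth, tab-separated) is easier to judge once summarised. A minimal sketch, with `depth_summary` as a hypothetical helper; note that unless `samtools depth` is run with `-a`, zero-coverage positions are omitted from its output, so the minimum reported here would never be 0.

```shell
# Summarise samtools depth output: number of positions, mean and minimum depth.
# Expects three tab-separated columns: chrom, position, depth.
depth_summary() {
  awk -F '\t' '
    NR == 1 || $3 + 0 < min { min = $3 + 0 }
    { sum += $3; n++ }
    END {
      if (n) { printf "positions=%d mean=%.1f min=%d\n", n, sum / n, min }
      else   { print "positions=0" }
    }
  ' "$1"
}

# Example call on the file written by the command above, if it exists.
if [ -f depth.txt ]; then
  depth_summary depth.txt
fi
```

A low minimum inside the repeat relative to the flanks is a hint that reads are failing to span the STR.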

Region inspection

samtools view sample.sorted.bam chr12:100000-100300
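Soft-clipping shows up as an `S` operation in the CIGAR string (column 6 of each SAM record), so the region dump can be piped into a small counter. A sketch under that assumption; `count_softclipped` is a hypothetical helper and the BAM name is a placeholder for any sorted, indexed BAM.

```shell
# Count reads whose CIGAR (SAM column 6) contains a soft-clip operation.
# Reads header-less SAM records on standard input.
count_softclipped() {
  awk -F '\t' '
    $6 ~ /S/ { clipped++ }
    { total++ }
    END { printf "%d of %d reads soft-clipped\n", clipped, total }
  '
}

bam=sample.sorted.bam   # placeholder: any sorted, indexed BAM
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
  samtools view "$bam" chr12:100000-100300 | count_softclipped
fi
```

A high soft-clipped fraction around an STR often points to reads ending inside the repeat or to misalignment of the flanks.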

Quick visualization

samtools tview -p chr12:100000 sample.sorted.bam reference.fasta

Nanopore (ONT) Essentials

Minimal pipeline from raw ONT signals to aligned reads.

Key Features:

• Basecall POD5 → reads (unaligned BAM)
• Align, sort & index BAM (minimap2 + samtools)
• QC metrics with NanoPlot

Basecalling (POD5 → BAM)

dorado basecaller sup pod5/ > reads.bam

Alignment to hg38

samtools fastq reads.bam | minimap2 -ax map-ont hg38.fa - | samtools sort -o aln.bam - && samtools index aln.bam

QC with NanoPlot

NanoPlot --bam aln.bam --outdir nanoplot_out/

Installation Requirements

The tools shown above do not come pre-installed. To run these commands, you need to install the corresponding bioinformatics utilities according to your operating system.

Linux (Ubuntu/Debian)

sudo apt update && sudo apt install samtools bcftools minimap2 trimmomatic fastp

macOS (Homebrew)

brew install samtools bcftools minimap2 fastp trimmomatic

Windows (WSL2 recommended)

Most of these tools do not run natively on Windows. Use WSL2 (Ubuntu) or a Linux container, then install as on Linux:

sudo apt update && sudo apt install samtools bcftools minimap2 trimmomatic fastp

Long-read tools may require Python ≥ 3.8 and sufficient disk space for basecalling models.

Dorado installation depends on your platform and GPU availability; obtain precompiled binaries from Oxford Nanopore releases.

Nanopore utilities (POD5 tools, NanoPlot, pycoQC)

pip install pod5 nanoplot pycoqc
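After installing, it can help to check which executables are actually on the PATH. A minimal sketch; `missing_tools` is a hypothetical helper, and the tool list is an assumption drawn from the commands on this page (it includes `bwa-mem2` and `dorado`, which the package commands above do not install). Executable names can differ by packaging, for example some distributions ship Trimmomatic as `TrimmomaticPE`/`TrimmomaticSE` wrappers, so adjust the list as needed.

```shell
# Print the name of every tool that is not found on the PATH, one per line.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Tools used by the commands on this page (names may vary by packaging).
missing=$(missing_tools samtools bcftools minimap2 bwa-mem2 trimmomatic fastp dorado NanoPlot)

if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  printf '%s\n' "$missing"
else
  echo "All listed tools found."
fi
```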

A full step-by-step installation guide for each OS will be added soon.
