Software/list
Below is one of many possible dynamic tables of software data, created from pages in the wiki.
| Name | Summary | Bio Tags | Meth Tags | Features | Language | Licence | OS |
|---|---|---|---|---|---|---|---|
| 4peaks | Allows viewing sequencing trace files, motif searching, trimming, BLAST and exporting sequences. | Sequencing | Sequence analysis | Freeware | Mac OS X | ||
| AB Large Indel Tool | Identifies deviations in clone insert size that indicate intra-chromosomal structural variations compared to a reference genome. | InDel discovery Sequencing |
Mapping | Perl | GPL | Linux 64 | |
| AB Small Indel Tool | The SOLiD™ Small Indel Tool processes the indel evidences found in the pairing step of the SOLiD™ System Analysis Pipeline Tool (Corona Lite). | InDel discovery Sequencing |
Mapping Alignment |
Perl C++ |
GPL | Linux 64 | |
| ABBA | Assembly Boosted By Amino acid sequence is a comparative gene assembler, which uses amino acid sequences from predicted proteins to help build a better assembly | Genomic Assembly | Assembly Scaffolding |
Artistic License | Linux | ||
| ABMapper | Maps RNA-Seq reads to target genome considering possible multiple mapping locations and splice junctions | Genomics Transcriptomics |
Mapping Alignment |
C++ Perl |
GPLv3 | Linux | |
| ABySS | ABySS is a de novo sequence assembler designed for short reads and large genomes. | De-novo assembly | Assembly De Bruijn graph |
MPI OpenMP |
C++ | Commercial Freeware |
POSIX Linux Mac OS X |
| Adapter Removal (software) | Removes adaptor fragments from raw short read sequence data and outputs data to FASTA format. | General bioinformatics (pipeline) | Adapter Removal (software) | Trimming | Java | Custom Licence | Linux 64 Windows Mac OS X |
| AGE | AGE is a tool that implements an algorithm for optimal alignment of sequences with SVs. | Structural variation | Alignment Gap extension |
Creative Commons license (Attribution-NonCommercial). | |||
| AGILE | A hash table based high throughput sequence mapping algorithm for longer 454 reads that uses diagonal multiple seed-match criteria, customized q-gram filtering and a dynamic incremental search approach among other heuristics to optimize every step of the mapping process | Mapping | C | ||||
| Agp2amos | missing | Format conversion | Windows Linux |
||||
| Alcovna | ALgorithms for COmparing and Visualizing Non Assembled data | SNP discovery | Java | ||||
| ALEXA-Seq | Alternative Expression Analysis by massively parallel RNA sequencing | RNA-Seq Quantitation Alternative Splicing |
Perl | GPLv3 | |||
| ALLPATHS | De novo assembly of whole-genome shotgun microreads. | De-novo assembly | Assembly De Bruijn graph |
||||
| Alta-Cyclic | Alta-Cyclic is an Illumina Genome-Analyzer (Solexa) base caller. | Basecaller | |||||
| AMOS | AMOS is a Modular, Open-Source whole genome assembler. | Assembly Assembly validation Assembly visualization Format conversion Integrated Solution |
C Perl |
Linux | |||
| ANCHOR | Post-processing tools for de novo assemblies | Assembly Assembly QC |
C++ Python |
BCCA (academic use) | Linux | ||
| Anno-J | Annotation Browsing 2.0 | Visualization | Creative Commons - Attribution-NonCommercial-ShareAlike | ||||
| ANNOVAR | ANNOVAR: Functional annotation of genetic variants from high-throughput sequencing data | Genomics Genetics |
Annotation Variant Prioritization |
Gene-based annotation region-based annotation filter-based annotation |
Perl | Commercial Freeware |
Linux Windows Mac OS X |
| Arachne | ARACHNE is a program for assembling data from whole genome shotgun sequencing experiments. | Assembly OLC |
|||||
| AREM | AREM: Aligning Short Reads from ChIP-Sequencing by Expectation Maximization | ChIP-Seq | Peak calling Mapping |
Python | Linux | ||
| Arf | arf is a genetic analysis program for sequencing data. | ||||||
| Array Suite (Array Studio/Server) | Array Studio is a complete analysis and visualization package for NextGen sequencing data, as well as other -OMIC data types. Array Server is a backend enterprise server for storage and analysis of -OMIC and NextGen sequencing data. | Genomics SNP discovery InDel discovery |
Mapping Expression profiling |
Data Visualisation Variant annotation and analysis coverage analysis Mapping |
C# | Commercial | Windows |
| ArrayExpressHTS | R-based pipeline for RNA-Seq data analysis. | RNA-Seq RNA-Seq Quantitation |
R | ||||
| ArrayStar | ArrayStar is an easy-to-use gene expression analysis software package that offers powerful visualization and statistical tools to help you analyze your microarray data. | Gene Expression Analysis | Differentially expressed gene identification Gene ontology analysis Sequence variation analysis Statistics |
Commercial | Windows Mac OS X 10.6 with Parallels Desktop |
||
| ASC | Empirical Bayes method to detect differential expression. | RNA-Seq Quantitation | Empirical Bayes | ||||
| ATAC | ATAC is a computational process for comparative mapping between two genome assemblies, or between two different genomes. | Assembly validation Alignment |
Linux | ||||
| Atlas Suite | Atlas is a suite of variant analysis tools specializing in the separation of true SNPs and insertions and deletions (indels) from sequencing and mapping errors in Whole Exome Capture Sequencing (WECS) data. SNPs may be called using the Atlas-SNP2 application and indels may be called using the Atlas-Indel2 application. | SNP discovery InDel discovery |
Variant Calling | Ruby C |
BSD | POSIX | |
| Atlas-SNP2 | Atlas-SNP2 is a SNP discovery tool developed for next generation sequencing platforms | SNP discovery | Ruby | Freeware | UNIX | ||
| Avadis NGS | Avadis NGS is a desktop software platform for alignment, analysis, visualization, and management of data generated by next-generation sequencing (NGS) platforms. It supports workflows for RNA-Seq, DNA-Seq, and ChIP-Seq analysis and is designed with the biologist in mind. | ChIP-Seq DNA-Seq RNA-Seq Small RNA Pathway analysis |
Alignment Quality Control Sequence analysis Visualization Biological Contextualization |
Rich Visualization Identify effects of SNPs on transcripts Identify Structural Variants from Paired Reads (Insertions Deletions Translocations Inversions) Identify binding site peaks in ChIP-Seq data Identify motifs around binding sites Determine gene expression levels and identify differentially expressed genes De-convolve transcript expression levels and identify differential splice variants Identify Novel Exons Identify Novel Splice Junctions Identify Fusion Genes Perform QC on Reads determine on-and off-target reads and filter anomalous reads Determine Enriched GO Terms Determine Significant Pathways |
Java R |
Commercial | Windows Linux Mac OS X |
| Baa.pl | use transcripts to assess a de novo assembly | Genomic Assembly Evaluation Genomic Assembly Validation |
Alignment Analysis | Perl | GPL | any | |
| Bambino | Variant detector and graphical alignment viewer for SAM/BAM format data. | SNP discovery Somatic mutations |
Java | ||||
| Bambus | Bambus is a general purpose scaffolder | Scaffolding | |||||
| BAMseek | BAMseek is a large file viewer for BAM and SAM alignment files. | Genomics Transcriptomics |
Alignment viewer | Java | GPLv3 | Cross-Platform | |
| BamTools | BamTools provides a fast, flexible C++ API & toolkit for reading, writing, and managing BAM files. | Programming Library Alignment Analysis |
C++ | MIT | Cross-Platform | ||
| BamView | Interactive Java application for visualising the large amounts of data stored for sequence reads which are aligned against a reference genome sequence | Visualization | Java | GPL | Mac OS X UNIX Windows |
||
| Barcode generator | Generator of sequence barcodes suitable for Illumina sequencing. | Sample Barcoding | Python | ||||
| Barcrawl Bartab | Barcrawl facilitates the design of barcoded primers, for multiplexed high-throughput sequencing. | Sample Barcoding | GPL | ||||
| BarraCUDA | Barracuda is a high-speed sequence aligner based on BWA and utilizes the latest Nvidia CUDA architecture for accelerating alignments of sequence reads generated by the next-generation sequencers. | Sequence analysis | Mapping Alignment FM-Index GPU |
Gapped and ungapped alignment paired-end mapping GPGPU parallel execution |
C C++ CUDA |
GPLv3 MIT |
Linux |
| Batman | Bayesian tool for methylation analysis (Batman) for analyzing methylated DNA immunoprecipitation (MeDIP) profiles | DNA methylation | Java | LGPL | |||
| BayesCall | Bayesian basecaller | Sequencing | Basecaller | C++ Python |
GPLv3 | ||
| BayesPeak | A Bayesian hidden Markov model to detect enriched locations in ChIP-seq data. | ChIP-Seq | Hidden Markov Model MCMC |
Multicore | R | GPL | |
| BaySeq | Identify differentially expressed genes | RNA-Seq Quantitation | Differentially expressed gene identification | R | |||
| BBSeq | Tool for analyzing RNA-Seq data to analyze gene expression | RNA-Seq Quantitation | R | ||||
| Bcbio-nextgen | Python scripts and modules for automated next gen sequencing analysis. These provide a fully automated pipeline for taking sequencing results from an Illumina sequencer, converting them to standard Fastq format, aligning to a reference genome, doing SNP calling, and producing a summary PDF of results. | General bioinformatics (pipeline) | QC Filtering Trimming Mapping Peak calling Motif detection Differential expression Genomic region matching Aligning Genotyping |
Python | MIT | platform-independent | |
| BEADS | ChIP-Seq data normalization for Illumina | ChIP-Seq | Normalization | ||||
| BEAP | The Blast Extension and Assembly Program (BEAP) uses a short starting DNA fragment to recursively blast nucleotide databases to obtain all sequences that overlap to construct a "full length" sequence. | Mapping | |||||
| BEDTools | BEDTools is an extensive suite of utilities for comparing genomic features in BED format. | Genomics | Mapping | Feature overlaps UNIX pipes coverage split-alignments BAM support |
C++ | GPLv2 | Linux Mac OS X |
| Bedutils | NGSUtils is a suite of software tools for working with next-generation sequencing datasets. Starting in 2009, we (Liu Lab @ Indiana University School of Medicine) started working with next-generation sequencing data. We initially started doing custom coding for each project in a one-off manner. It quickly became apparent that this was an inefficient manner to work, so we started assembling smaller utilities that could be adapted into larger, more complicated, workflows. We have used them for Illumina, SOLiD and 454 sequencing data. We have used them for DNA and RNA resequencing, ChIP-Seq, CLIP-Seq, and targeted resequencing (Agilent exome capture and PCR targeting). These tools are also used heavily in our in-house DNA and RNA mapping pipelines.
These tools have been of great use within our lab group, and so we are happy to make them available to the greater community. NGSUtils is made up of 50+ programs, mainly written in Python. These are separated into modules based on the type of file that is to be analyzed. There are four modules: |
||||||
| Belvu | An X-windows viewer for multiple sequence alignments | Multiple sequence alignment viewer | Linux | ||||
| BFAST | Blat-like Fast Accurate Search Tool. | Whole Genome Resequencing | Mapping Alignment Genome Indexing Colorspace |
parallel execution command line |
C | GPL | Solaris UNIX |
| BFCounter | BFCounter is a program for counting k-mers in DNA sequence data. | K-mer analysis | C++ | ||||
| BING | biomedical informatics pipeline (BING) for the analysis of NGS data that offers several novel computational approaches to 1. image alignment, 2. signal correlation, compensation, separation, and pixel-based cluster registration, 3. signal measurement and base calling, 4. quality control and accuracy measurement. | Basecaller Sequencing Quality Control |
|||||
| Bionimbus | Cloud environment for analysis of microarray and second generation sequencing data. | Linux Amazon EC2 cloud |
|||||
| Biopieces | The Biopieces are a collection of bioinformatics tools that can be pieced together in a very easy and flexible manner to perform both simple and complex tasks. The Biopieces work on a data stream in such a way that the data stream can be passed through several different Biopieces, each performing one specific task: modifying or adding records to the data stream, creating plots, or uploading data to databases and web services. | Genomics | Alignment Quality Control Sequence analysis Visualization |
Perl Python Ruby C |
GPLv2 | ||
| Biopython | Biopython provides a tool kit for writing bioinformatics and computational molecular biology software in Python. | Sequence analysis Phylogenetics Population genetics Protein structures |
Sequence parsing Command line tool wrappers Programming Library |
Various | Python | Biopython License (MIT/BSD style) | Linux Windows Mac OS X |
| BioSmalltalk | BioSmalltalk provides an environment to build bioinformatics scripts and applications using the most powerful object technology as of today, the Smalltalk programming environment | Sequence analysis Phylogenetics Population genetics Protein structures |
Sequence parsing Command line tool wrappers Programming Library |
Various | Smalltalk | Linux Windows Mac OS X |
|
| BiQ Analyzer | BiQ Analyzer is a software tool for easy visualization and quality control of DNA methylation data. With more than 2,000 downloads so far, BiQ Analyzer has become a standard tool for processing DNA methylation data from bisulfite sequencing. | Epigenomics DNA methylation |
Java | Windows Linux Mac OS X Solaris |
|||
| BiQ Analyzer HT | BiQ Analyzer HT is an enhanced version of BiQ Analyzer that provides extensive support for high-throughput bisulfite sequencing. BiQ Analyzer HT facilitates the processing, quality control and initial analysis of single-basepair resolution DNA methylation data. It was developed for deep bisulfite sequencing of one or more loci using the Roche 454 platform, but it easily extends to other sequencing platforms. BiQ Analyzer HT features a biologist-friendly graphical user interface, a fast alignment algorithm and a variety of ways to visualize DNA methylation data. | Epigenomics DNA methylation Bisulfite Sequencing |
Java | Windows Linux Mac OS X Solaris |
|||
| Bis-SNP | BisSNP is a package based on the Genome Analysis Toolkit (GATK) map-reduce framework for genotyping in bisulfite treated massively parallel sequencing (Bisulfite-seq, NOMe-seq and RRBS) on the Illumina platform. It uses Bayesian inference with either manually specified or automatically estimated methylation probabilities of different cytosine contexts (not only CpG, CHH, CHG in Bisulfite-seq, but also GCH etc. in other bisulfite treated sequencing) to determine genotypes and methylation levels simultaneously. | SNP discovery Genotyping DNA methylation Bisulfite Sequencing |
Bisulfite SNP calling Methylation Calling MapReduce |
Accurate SNP and methylation calling in Bisulfite-seq/NOMe-seq/RRBS | Java Perl |
MIT | Linux Mac OS X |
| Bismark | Bismark is a tool to map bisulfite treated sequencing reads and perform methylation calling in a quick and easy-to-use fashion. | Epigenomics Genomics DNA methylation |
Bisulfite mapping Mapping Methylation Calling |
fast and convenient Bisulfite-Seq output very flexible |
Perl | GPLv3 | Linux Mac OS X Windows |
| BLAST | ...it's BLAST. | Linux | |||||
| BLAT | Fast, accurate spliced alignment of DNA sequences | Mapping Alignment |
Mapping | C | Freeware | Linux Mac OS X |
|
| Blixem | a graphical blast viewer | Sequence analysis Phylogenetics Homology |
Alignment viewer Multiple sequence alignment viewer |
GPL | Linux | ||
| BOAT | Can accurately and efficiently map sequencing reads back to the reference genome. | Mapping | GPL | ||||
| Bort | Bort parses Blast output and quantifies hits by contig and read counts. | RNA-Seq Quantitation | Perl | any | |||
| Bowtie | Bowtie is an ultrafast, memory-efficient short read aligner. | Mapping Burrows-Wheeler FM-Index |
Mac OS X Linux Windows |
||||
| BRAT | accurate and efficient tool for mapping short reads obtained from the Illumina Genome Analyzer following sodium bisulfite conversion. Both single and paired ends are supported. | Epigenomics DNA methylation |
Bisulfite mapping Mapping |
GPLv3 | |||
| BRCA-diagnostic | Computational screening test for BRCA1/2 mutants in human genomic DNA | Personal genomics | Perl | ||||
| BreakDancer | BreakDancer is an application for detecting structural rearrangements and indels in short read sequencing data | Genomics Structural variation InDel discovery |
Perl C++ |
GPLv3 | |||
| Breakpointer | Breakpointer is a fast tool for locating sequence breakpoints from the alignment of single end reads (SE) produced by next generation sequencing (NGS). It adopts a heuristic method in searching for local mapping signatures created by insertion/deletions (indels) or more complex structural variants (SVs). With current NGS single-end sequencing data, the regions output by Breakpointer mainly contain the approximate breakpoints of indels and a limited number of large SVs. | Exome and Whole genome variant detection InDel discovery |
Statistical testing | C++ Perl |
GPL | ||
| BreakSeq | Database of known human breakpoint junctions and software to search short reads against them. | Structural variation | Mapping | ||||
| Breakway | Breakway is a suite of programs that take aligned genomic data and report structural variation breakpoints. | Whole Genome Resequencing Genomics Structural variation InDel discovery |
Sequence analysis Genetic variation annotation |
Fast specific UNIX pipes |
Perl | GPL | Linux Mac OS X Windows |
| BS Seeker | Mapping tool for bisulfite treated reads | Epigenomics | Bisulfite mapping | Python | |||
| BS-Seq | The source code and data for the "Shotgun Bisulphite Sequencing of the Arabidopsis Genome Reveals DNA Methylation Patterning" Nature paper by Cokus et al. (Steve Jacobsen's lab at UCLA). POSIX. | Epigenomics | Bisulfite mapping | ||||
| BSMAP | short reads mapping software for bisulfite sequencing | DNA methylation | Mapping Bisulfite mapping |
Bisulfite sequencing | C++ | GPLv3 | Linux 64 |
| BSSim | BSSim: Bisulfite sequencing simulator for next-generation sequencing. | DNA methylation Bisulfite Sequencing |
Simulation | BSSim can allow users to mimic various methylation levels. | Python | GPL v3 | UNIX Linux Mac OS X Windows |
| Btrim | Btrim is a fast and lightweight software to trim adapters and low quality regions in reads. | Trimming | Linux | ||||
| BWA | Fast, accurate, memory-efficient aligner for short and long sequencing reads | Mapping Read alignment |
FM-Index | Gapped alignment paired-end mapping |
C | GPLv3 MIT |
UNIX |
| BWA-SW | Fast, accurate, memory-efficient aligner for long sequencing reads | Mapping Read alignment |
FM-Index | Gapped alignment Local alignment |
C | GPLv3 MIT |
UNIX |
| CABOG | Celera Assembler is scientific software for DNA research. | De-novo assembly | Assembly | Robust to homopolymer run length | Linux | ||
| CANGS | CANGS is a flexible and user-friendly utility to trim sequences, filter low quality sequences, and produce input files for further downstream analyses for 454 sequences. CANGS can be used to assign the taxonomic grouping based on similarity with sequences from the NCBI database | Metagenomics Phylogenetics |
Primer removal Trimming Sequencing Quality Control |
||||
| CARPET | A web‐based package for the analysis of ChIP‐chip and expression tiling data | ChIP-on-chip | Tiling | C++ | |||
| CASHX | Parse, map, quantify and manage large quantities of short-read sequence data. | Small RNA transcriptome | Mapping | ||||
| CATCH | A tool for exploring patterns in ChIP profiling data. | ChIP-Seq ChIP-on-chip |
Clustering and alignment | parallel execution graphical browsing of results |
Java | Open Source | |
| CatchAll | Estimate ecological diversity with both parametric and non-parametric estimators. | Population genetics Metagenomics |
|||||
| CGA Tools | Tools for viewing, manipulating and converting data from Complete Genomics | Conversion | C++ | Apache License 2.0 | Linux UNIX Mac OS X |
||
| ChimeraScan | Identifies chimaeric transcripts in RNA-Seq data | Fusion transcripts | |||||
| ChIP-Seq (application) | The ChIP-Seq web server provides access to a set of useful tools performing common ChIP-Seq data analysis tasks, including positional correlation analysis, peak detection, and genome partitioning into signal-rich and signal-poor regions. It is an open system designed to allow interoperability with other resources, in particular the motif discovery programs from the Signal Search Analysis (SSA) server. | ChIP-Seq Analysis | Read Mapping and Tag Distribution Analysis | C Perl |
GPL | Linux Mac OS X |
|
| ChIPmeta | Combining data from ChIP-seq and ChIP-chip. | Transcription Factor Binding Site identification ChIP-Seq ChIP-on-chip |
Hidden Markov Model | ||||
| ChIPMunk | ChIPMunk is a fast heuristic DNA motif digger based on a greedy approach accompanied by bootstrapping. ChIPMunk identifies the strong motif with the maximum Kullback Discrete Information Content in a given set of DNA sequences. *NEW URL* http://autosome.ru/ChIPMunk | ChIP-Seq Motif analysis Motif discovery |
Motif analysis Motif discovery |
efficient motif discovery for huge datasets up to tens of thousands of sequences; multi-core CPU support; usage of the ChIP-Seq base coverage peak data | Java | Freeware | platform-independent |
| CHiPSeq | From Science Johnson, 2007 | ChIP-Seq | Peak calling | ||||
| ChIPseqR | ChIP-seq analysis tool | ChIP-Seq | R | ||||
| Chipster | User-friendly NGS data analysis software with built-in genome browser and workflow functionality. Chipster includes tools for ChIP-seq, RNA-seq, miRNA-seq and MeDIP-seq analysis, and functionality for exome-seq and CGH-seq will soon be added. | ChIP-Seq RNA-Seq MiRNA-Seq MeDIP-Seq |
QC Filtering Trimming Mapping Peak calling Motif detection Differential expression Pathway analysis Methylation analysis Genomic region matching Genome browser |
Java R |
GPLv3 | platform-independent | |
| ChromaSig | An unsupervised learning method, which finds, in an unbiased fashion, commonly occurring chromatin signatures in both tiling microarray and sequencing data. | ChIP-on-chip | Chromatin motif finding | Perl C |
|||
| ChromHMM | ChromHMM is software for learning and characterizing chromatin states. | Epigenomics | Hidden Markov Model Segmentation |
Java | GPL 2 | ||
| CisGenome | An integrated tool for tiling array, ChIP-seq, genome and cis-regulatory element analysis | ChIP-Seq ChIP-on-chip Motif analysis Gene annotation retrieval |
Gibbs motif sample | C C++ |
UNIX Windows |
||
| Cistrome | Galaxy-based web service for analysis of ChIP data | ChIP-on-chip ChIP-Seq |
Python | ||||
| CLCbio Genomics Workbench | De novo and reference assembly, SNP and small indel detection, and annotation. | Genomics Whole Genome Resequencing De-novo assembly SNP discovery InDel discovery ChIP-Seq RNA-Seq Alignment RNA-Seq MiRNA Alignment Transcriptomics |
Mapping Assembly Alignment Colorspace BLAST Ab-initio gene prediction Adapter Removal (software) Annotation Assembly QC Basespace Bisulfite SNP calling De Bruijn graph Heatmaps |
Advanced and user-friendly analyses of genomic transcriptomic and epigenomic NGS data in a graphical user-interface. Wizard driven tools and a freely available developer toolkit SIMD implementation multi-threading hybrid assembly Integrated solution |
Java C++ |
Commercial | Windows Mac OS X Linux |
| Clean reads | clean_reads cleans NGS (Sanger, 454, Illumina and SOLiD) reads. | Trimming Sequencing Quality Control |
Python | ||||
| CleaveLand | A pipeline for using degradome data to find cleaved small RNA targets. | MiRNA | Perl R |
Freeware | |||
| CLEVER | CLEVER is a tool to discover structural variations such as (larger) insertions and deletions in genomes from paired-end sequencing reads. | Genomics Structural variation Copy number estimation |
Structural variation discovery | command line | C++ Python |
GPLv3 | any |
| ClipCrop | a new method and implementation named ClipCrop for detecting SVs with single-base resolution | ||||||
| CloudAligner | Hadoop-based short read aligner | Mapping Hadoop |
Java | GPL | cloud | ||
| CloudBurst | CloudBurst is a parallel read-mapping algorithm optimized for mapping next-generation sequence data to the human genome and other reference genomes. | SNP discovery Genotyping Personal genomics |
Mapping MapReduce Hadoop |
parallel execution Hadoop Academic Cloud Computing Initiative |
Java | ||
| ClustDB | A powerful tool for exact sequence matching | Linux | |||||
| CNANorm | A normalization method for Copy Number Aberration in cancer samples. | Cancer biology Copy number estimation Genomics |
Mixture model Peak detection Normalization |
R Perl |
GPLv2 | Linux Mac OS X Windows |
|
| CNAseg | We present a novel approach, called CNAseg, to identify CNAs from second-generation sequencing data. It uses depth of coverage to estimate copy number states and flowcell-to-flowcell variability in cancer and normal samples to control the false positive rate. | Structural variation | |||||
| CNB MetaGenomics tools | A number of tools and meta-tools developed at CNB/CSIC for the analysis of metagenomics data (some rely on QIIME). | Metagenomics Biodiversity Community analysis High-throughput sequencing |
Community Analysis | Bash Perl Python C |
EU-GPL | Linux Unix-like POSIX |
|
| CnD | Program to detect copy number variation in inbred mouse strains | Copy number estimation | Hidden Markov Model | D | GPL | ||
| CNVer | CNVer is a method for CNV detection that supplements the depth-of-coverage with paired-end mapping information, where matepairs mapping discordantly to the reference serve to indicate the presence of variation. CNVer combines this information within a unified computational framework called the donor graph, allowing it to better mitigate the sequencing biases that cause uneven local coverage. CNVer can also reconstruct the absolute copy counts of segments of the donor genome, and work with low coverage datasets. | Structural variation Copy number estimation |
Perl C++ |
||||
| CnvHMM | WashU copy number variant (CNV) detection algorithm for Illumina/Solexa data. | Structural variation | Linux | ||||
| CNVnator | CNV discovery and genotyping from read-depth analysis of personal genome sequencing | Copy number estimation Genotyping |
|||||
| CNVseq | Copy number estimation | Perl R |
|||||
| CompreheNGSive | compreheNGSive is an interactive visualization of the end results of the next-generation sequencing pipeline. | Next Generation Sequencing | Visualization | Python Qt |
LGPL | Mac OS X Linux |
|
| CoNAn-SNV | CoNAn-SNV is a probabilistic framework for the discovery of single nucleotide variants in WGSS data. This software explicitly integrates information about copy number state of different genomic segments into the inference of single nucleotide variants. | SNP discovery | C | ||||
| ConDeTri | ConDeTri is a content dependent read trimming software for Illumina/Solexa sequencing data | RNA-Seq DNA-Seq Genomics |
Trimming | Perl | |||
| ContEst | GATK tool to estimate amount of cross-individual contaminating sequence in a dataset | Sequencing Quality Control | Java | BSD | |||
| Contra | Copy number analysis for exome-sequencing / targeted-resequencing. Two methods of analysis available: Case vs Control, or Case vs Baseline. Function available for creating a baseline from multiple samples. | Next Generation Sequencing Cancer biology Genomics Copy number estimation |
Copy number analysis baseline (pseudo-control) creation |
Python R |
GPL v3 | Linux 64 Linux |
|
| Contrail | A Hadoop based genome assembler for assembling large genomes in the clouds | De-novo assembly | Assembly De Bruijn graph Hadoop |
||||
| CopySeq | CopySeq analyzes the depth-of-coverage of whole genome resequencing data to predict CNVs and to infer quantitative locus copy-number genotypes. | Structural variation Copy number estimation Genotyping Personal genomics |
Java R |
Mac OS X Linux |
|||
| Coral | Corrects sequencing errors in short read data via multiple alignments | Error correction | C++ | ||||
| CORAL (Contig Ordering Algorithm) | An algorithm has been developed to order fingerprinted clones within contigs. | Error correction | Java | ||||
| Cortex | Cortex is an efficient and low-memory software framework for analysis of genomes using sequence data. Cortex allows de novo assembly of variants without having to do a consensus assembly first. Also allows comparison of genomes without using consensus, and alignment of sequence data to a de Bruijn graph | Genomics | Assembly Variant Calling |
C | GPLv3 | ||
| CPTRA | Integrated transcriptome analysis from Sanger, 454, Solexa, SOLiD, etc reads | RNA-Seq Alignment RNA-Seq Quantitation |
Python | ||||
| CRAC | CRAC is a mapping software specialized for RNA-Seq data. It detects mutations, indels, splice or fusion junctions in each single read. | Mapping RNA Seq analysis RNA-Seq Alignment Alternative Splicing Fusion genes Fusion transcripts SNP discovery InDel discovery |
Mapping Read mapping Burrows-Wheeler FM-Index |
C++ | CeCILL | Linux Linux 64 Mac OS X |
|
| CRISP | Identifies rare and common variants in pooled sequencing data | SNP discovery | Pooled samples | Python | |||
| Crossbow | Crossbow is a cloud-computing software tool that combines the aligner BOWTIE and the SNP caller SOAPsnp. | SNP discovery | Mapping MapReduce Hadoop |
||||
| CUDA-EC | A scalable parallel algorithm for correcting sequencing errors in high-throughput short-read data so that error-free reads can be available before DNA fragment assembly. | Sequencing Quality Control GPU |
read error correction | C | |||
| Cufflinks | Cufflinks assembles transcripts and estimates their abundances in RNA-Seq samples. It accepts aligned RNA-Seq reads and assembles the alignments into a parsimonious set of transcripts. Cufflinks then estimates the relative abundances of these transcripts based on how many reads support each one. | RNA-Seq Alignment RNA-Seq Quantitation Differential Expression Alternative Splicing De novo transcriptome assembly RNA-Seq |
Assembly Differentially expressed gene identification Statistical testing |
Boost | |||
| CummeRbund | Allows for persistent storage, access, exploration, and manipulation of Cufflinks high-throughput sequencing data. In addition, provides numerous plotting functions for commonly used visualizations. | RNA-Seq Quantitation | Visualization | ||||
| Curtain | Curtain is a Java wrapper around next-generation assemblers such as Velvet which allows the incremental introduction of read-pair information into the assembly process. This enables the assembly of larger genomes than would otherwise be possible within existing memory constraints. | De-novo assembly | Assembly De Bruijn graph |
Apache License 2.0 | |||
| Cutadapt | remove adapter sequences from high-throughput sequencing data using alignment | Python C |
MIT | ||||
| DecGPU | Parallel and distributed error correction algorithm for high-throughput short reads. | De-novo assembly | Error correction GPU |
C++ | GPLv3 | Linux | |
| DeconSeq | DeconSeq can be used to automatically detect and efficiently remove any type of sequence contamination from metagenomic datasets, including human or other host sequences. The tool uses a modified version of the BWA-SW aligner and can be applied to longer-read datasets (150+bp read length). DeconSeq is available as both standalone and web-based versions. | Metagenomics Metatranscriptomics Genomics |
Contaminant filtering | Perl C |
GPLv3 | UNIX Mac OS X |
|
| DeFuse | deFuse is a software package for gene fusion discovery using RNA-Seq data. The software uses clusters of discordant paired end alignments to inform a split read alignment analysis for finding fusion boundaries. The software also employs a number of heuristic filters in an attempt to reduce the number of false positives and produces a fully annotated output for each predicted fusion | Fusion genes RNA-Seq Fusion transcripts |
|||||
| DEGseq | an R package to identify differentially expressed genes or isoforms for RNA-seq data from different samples | RNA-Seq Quantitation | Differentially expressed gene identification | R | |||
| DESeq | DESeq is an R package to analyse count data from high-throughput sequencing assays such as RNA-Seq and test for differential expression. | RNA-Seq Quantitation ChIP-Seq |
Statistical testing Sequencing Quality Control |
R | GPLv3 | UNIX Windows Mac OS X |
|
| DIAL | A computational pipeline for identifying single-base substitutions between two closely related genomes without the help of a reference genome. | SNP discovery Comparative genomics |
C Python |
MIT | Linux | ||
| DiBayes | Bayesian identification of SNPs in color space (SOLiD) data | SNP discovery | Colorspace | GPL | |||
| Diffreps | diffReps is developed to find different peaks in ChIP-seq. It scans the whole genome using a sliding window, performing millions of statistical tests and reports the significant hits. diffReps takes into account the biological variations within a group of samples and uses that information to enhance the statistical power. Considering biological variation is of high importance, especially for in vivo brain tissues. | Epigenomics | ChIP seq | Multiple sample information used. | Perl | GPLv3 | Linux Windows Mac OS X |
| Dindel | Calls small indels from short-read sequence data | InDel discovery | Localized reassembly/realignment | ||||
| DNA Baser | Tool for manual and automatic sequence assembly, analysis, editing, sample processing, metadata integration, file format conversion and mutation detection. | Structural variation SNP discovery |
Assembly Assembly editing Sequence analysis |
Portable. Does not require installation. Can run from USB stick. Only 3MB. | Compiled | Commercial Freeware |
Windows |
| DNA Chromatogram Explorer | DNA Chromatogram Explorer is a Windows Explorer clone dedicated to DNA sequence analysis and manipulation. | Chromatogram management Chromatogram viewer Conversion |
Portable. Does not require installation. Can run from USB stick. Only 1MB. | Freeware | Windows | ||
| DNAA | DNAA (DNA Analysis) software for analysis of Next-Generation Sequencing data. | Structural variation SNP discovery DNA methylation |
Statistics Sequencing Quality Control Simulation |
GPL | |||
| DNAzip | A series of techniques that in combination reduces a single genome to a size small enough to be sent as an email attachment. | Data compression | C++ | ||||
| DrFAST | Fast mapper for dibase encoded data. | ||||||
| DSAP | Automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology | Small RNA transcriptome MiRNA |
browser based | ||||
| DSGseq | This program aims to identify differentially spliced genes from two groups of RNA-seq samples. | RNA-Seq Differential Expression Alternative Splicing |
RNA-Seq analysis Differential expression Alternative Splicing Statistical testing |
C R |
Commercial Freeware |
Linux Windows Mac OS X |
|
| DSRC | Compression algorithm for genomic data in FASTQ format | Data compression | |||||
| E-miR | Perl tools for processing miRNA sequencing data | Small RNA transcriptome MiRNA |
|||||
| Ea-utils | FASTQ processing utilities | Trimming Sequencing Quality Control |
C++ | MIT | |||
| EagleView | EagleView is an information-rich genome assembler viewer with data integration capability. | Assembly visualization | |||||
| EagleView genome viewer | EagleView is an information-rich genome assembler viewer with data integration capability. | Viewer | |||||
| EBCall | EBCall is a software package for somatic mutation detection (including InDels). EBCall uses not only paired tumor/normal sequence data of a target sample, but also multiple non-paired normal reference samples for evaluating distribution of sequencing errors, which leads to accurate mutation detection even in case of low sequencing depths and low allele frequencies. | ||||||
| ECHO | Reference-free short read error correction from diploid genomes, with explicit modeling of heterozygous sites. | SNP discovery InDel discovery |
Error correction | Python C++ |
BSD | ||
| EDENA | An assembler dedicated to processing the millions of very short reads produced by the Illumina Genome Analyzer. | Assembly | |||||
| EdgeR | edgeR is an R/Bioconductor software package for statistical analysis of replicated count data. Methods are designed for assessing differential expression in comparative RNA-Seq experiments, but are generally applicable to count data from other genome-scale platforms (ChIP-Seq, MeDIP-Seq, Tag-Seq, SAGE-Seq etc). | RNA-Seq RNA-Seq Quantitation ChIP-Seq Gene Expression Analysis DNA methylation |
Statistical testing | R | LGPL | Windows Mac OS X UNIX |
|
| ELAND | Efficient Large-Scale Alignment of Nucleotide Databases. Whole genome alignments to a reference genome. Written by Illumina author Anthony J. Cox for the Solexa 1G machine. | Alignment | Commercial | ||||
| EMBF | Frequency-based, de novo short-read clustering method that organizes erroneous short sequences originating in a single abundant sequence into a tree structure; in this structure, each “child” sequence is considered to be stochastically derived from its more abundant “parent” sequence with one mutation through sequencing errors. | Mapping | |||||
| Epigenome | A bioinformatic pipeline that scores epigenetic alterations according to strength and significance and links them to potentially affected genes. | Epigenomics | Bisulfite mapping | R Python |
|||
| EpiGRAPH | EpiGRAPH enables biologists to analyze genome and epigenome datasets with powerful statistical and machine learning methods. In a typical workflow, the user uploads a set of genomic regions of interest (e.g. experimentally mapped enhancers, hotspots of epigenetic regulation or sites exhibiting disease-specific alterations), and EpiGRAPH searches a large database of (epi-) genomic attributes for significant overlap and correlation with the regions in the input dataset. Furthermore, EpiGRAPH can predict the status of genomic regions that were not included in the input dataset. | Epigenomics | Statistics Machine Learning |
browser based | |||
| ERANGE | ERANGE is a Python package for doing RNA-seq and ChIP-seq. | RNA-Seq Alignment RNA-Seq Quantitation ChIP-Seq Allele-specific transcription |
RNAseq analysis Chipseq analysis |
Python | |||
| ERDS | ERDS is free, open-source software designed for detection of copy number variants (CNVs) on human genomes from next generation sequence data. It uses paired Hidden Markov models (PHMM) based on the expected distribution of read depth of short reads and the presence of heterozygous sites. ERDS is NOT good for whole exome data. | Copy number estimation | Hidden Markov Model | ||||
| ERNE | Extended Randomized Numerical alignEr for accurate alignment of NGS reads. It can map bisulfite-treated reads. | Genomics Alignment Bisulfite Sequencing |
Mapping Bisulfite mapping |
Bisulfite sequencing sequence alignment |
C++ | GPL v3 | Linux Mac OS X Windows |
| Error Correction Evaluation Toolkit | Evaluation of error correction results | Sequence Quality Control | Python Perl |
POSIX | |||
| Est2assembly | Processes raw sequence data from Sanger or 454 sequencing into a hybrid de-novo assembly, annotates it and produces GMOD compatible output, including a SeqFeature database suitable for GBrowse. | RNA-Seq Alignment Genomics |
Assembly | ||||
| ESTcalc | Estimation of project costs for RNA-Seq study. | RNA-Seq | Cost estimation | Perl | |||
| EULER | EULER-SR is a program for de novo assembly of reads. Contrary to the overlap-layout approach, EULER-SR uses a de Bruijn graph to construct an assembly. The assembly of a genome corresponds to an Eulerian path in the de Bruijn graph. Long (possibly erroneous) reads, and mate-pairs are used to determine parts of the correct Eulerian traversal in the assembly. | Assembly De Bruijn graph |
C++ Perl |
Linux | |||
| ExomeCNV | Identifies copy number variation from targeted exome sequencing data | Targeted resequencing Copy number estimation |
R | ||||
| ExomeCopy | CNV detection from exome sequencing read depth | Exome and Whole genome variant detection Copy number estimation Exome analysis |
Hidden Markov Model | simultaneous normalization and segmentation | R | GPL 2.0+ | Linux Windows Mac OS X |
| ExomePicks | ExomePicks is a program that suggests individuals to be sequenced in a large pedigree. | ||||||
| Exonerate | Various forms of alignment (including Smith-Waterman-Gotoh) of DNA/protein against a reference. Authors are Guy St C Slater and Ewan Birney from EMBL. C for POSIX. | Alignment | C | GPL | Linux | ||
| FAAST | Flowspace Assisted Alignment Search Tool | Mapping | Linux | ||||
| FACS | Rapid and accurate classification of sequences as belonging or not belonging to a reference sequence. | Metagenomics | Bloom filters | Perl f |
GPLv2 | Linux | |
| FastQ Screen | FastQ Screen provides a simple way to screen a library of short reads against a set of reference libraries. Its most common use is as part of a QC pipeline to confirm that a library comes from the expected source, and to help identify any sources of contamination. | Genomics Transcriptomics |
Mapping Sequencing Quality Control |
Summarises the mapping of a library against a series of reference sequences | Perl | GPLv3 | Linux Mac OS X Windows |
| FastQC | FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. | Sequencing Quality Control | Java | GPLv3 | UNIX Windows |
||
| FastQValidator | Checks that FastQ files follow standards | Quality Control | Sequencing Quality Control | C++ | |||
| FDM | Detects differential transcription in RNA-Seq data | RNA-Seq Quantitation | |||||
| FeatureCounts | featureCounts is a read summarization function included in the Rsubread package. This function can very efficiently assign mapped reads to genomic features such as genes, exons and promoters. | Next Generation Sequencing | Read summarization | read summarization | R C |
GPLv3 | Linux 64 Mac OS X Mac OS X; x86 64 |
| Figaro | Figaro is a software tool for identifying and removing the vector from raw DNA sequence data without prior knowledge of the vector sequence. | Sequencing | K-mer analysis | AMOS | C++ Perl |
Artistic License | UNIX |
| Filter | Produces a filtered version of an sRNA dataset, controlled by several user-defined criteria, including sequence length, abundance, complexity, transfer and ribosomal RNA removal. | General bioinformatics (pipeline) | Filtering | multi-threading | Java | Custom Licence | Linux 64 Windows Mac OS X |
| FindPeaks 3.1 | Findpeaks was developed to perform analysis of ChIP-Seq experiments. | ChIP-Seq | Peak calling | GPLv3 | |||
| FindPeaks 4.0 (Vancouver Short Read Package) | The Vancouver Short Read Analysis Package (VSRAP) contains the FindPeaks application for Chip-Seq and RNA-Seq analysis, as well as utilities for SNP finding, working with aligned sequence files and a nascent database for storing SNPs across multiple libraries. | Genomics SNP discovery |
Peak calling Database Format conversion Alignment Analysis |
command line | Java | GPL | Linux Windows Mac OS X |
| FLASH | Identifies paired-end reads which overlap in the middle, converting them to single long reads | Assembly Read pre-processing |
combining forward and reverse reads | C | Open Source | Linux 64 | |
| Flexbar | flexible barcode and adapter processing for next-generation sequencing platforms | Next Generation Sequencing Sequence Quality Control Genomics |
Read pre-processing Sample Barcoding Adapter Removal (software) Trimming |
Paired read support separate barcode reads multi-threading |
C++ | GPLv3 | Linux Windows Mac OS X |
| Flower | Tool for reformatting SFF files into other formats or tab-delimited | Haskell | |||||
| FlowSim | Tool for simulating errors in 454 sequencing data | Error correction Simulation |
Haskell | ||||
| Flux | FluxCapacitor is a computer program to predict splice form abundances from reads of an RNA-seq experiment. FluxSimulator can generate simulated data for testing RNA-seq pipelines. | RNA-Seq | Simulation | ||||
| Forge | De novo assembly using a combination of next-generation and Sanger reads | Genomics De-novo assembly |
Assembly | ||||
| FragGeneScan | Application for finding (fragmented) genes in short reads | Metagenomics | C Perl |
GPL | |||
| FrameDP | Sensitive peptide detection on noisy matured sequences. A self-training integrative pipeline for predicting CDS in transcripts which can adapt itself to different levels of sequence qualities. | RNA-Seq | |||||
| FreClu | a frequency-based, de novo short-read clustering method that organizes erroneous short sequences originating in a single abundant sequence into a tree structure; in this structure, each “child” sequence is considered to be stochastically derived from its more abundant “parent” sequence with one mutation through sequencing errors. The root node is the most frequently observed sequence that represents all erroneous reads in the entire tree, allowing the alignment of the reliable representative read to the genome without the risk of mapping erroneous reads to false-positive positions. | RNA-Seq Alignment | Mapping | ||||
| Freebayes | Bayesian genetic variant detector (SNPs, indels, MNPs) | Genomics | MIT | ||||
| FREEC | A tool for control-free copy number alteration (CNA) detection using deep-sequencing data, particularly useful for cancer studies. | Copy number estimation | Linux Linux 64 Windows |
||||
| FusionAnalyser | FusionAnalyser is a new graphical, event-driven tool dedicated to the identification of driver fusion rearrangements in human cancer through the analysis of paired-end high-throughput transcriptome sequencing data. | High-throughput sequencing | Gene fusions discovery. | Advanced and user-friendly analysis of RNA-seq data for fusion discovery | C# | GPLv3 | Windows Linux |
| FusionCatcher | FusionCatcher searches for novel/known fusion genes, translocations, and chimeras in RNA-seq data (paired-end reads from Illumina NGS platforms like Solexa and HiSeq). | RNA-Seq Fusion finding |
Alignment | Python | GPL v3 | *NIX | |
| FusionHunter | Identifies gene fusions in RNA-Seq data | RNA-Seq Fusion transcripts |
Perl C |
Linux Linux 64 |
|||
| FusionMap | Detects fusion events in both single- and paired-end datasets from either RNA-Seq or gDNA-Seq studies and characterizes fusion junctions at base-pair resolution. | Fusion genes Fusion transcripts |
Split-read | C# | Commercial Freeware |
Windows Linux Linux 64 |
|
| FusionSeq | Identifies fusion transcripts from paired end RNA-Seq data. | Fusion transcripts RNA-Seq Fusion genes |
Alignment Analysis | C | Creative Commons - Attribution; Non-commercial 2.5 | Mac OS X UNIX Linux |
|
| Fuzzypath | Assembler | Genomics | De Bruijn graph Assembly |
||||
| G-Mo.R-Seq | G-Mo.R-Se is a method aimed at using RNA-Seq short reads to build de novo gene models. | RNA-Seq Alignment | CeCILL | Linux | |||
| G-SQZ | Huffman coding-based sequencing-reads specific representation scheme that compresses data without altering the relative order. | Read storage Data compression |
C++ | ||||
| Galign | Identifies polymorphisms between sequence reads obtained using Illumina/Solexa technology and a reference genome | SNP discovery | Mapping | GPL | |||
| Gambit | A cross-platform GUI for sequence visualization and analysis. | Visualization | GPL 2.0+ Commercial |
||||
| GAMES | GAMES (Genomic Analysis of Mutations Extracted by Sequencing) is a tool for mining and prediction of functional effect of mutation. | SNP discovery SNP Annotation InDel discovery |
Perl | Linux | |||
| GASSST | Fast and accurate aligner for short and long reads | Alignment Mapping |
Gapped alignment short and long reads |
C++ | CeCILL | Linux | |
| GASV | Software for classification and comparison of structural variants measured via paired-end sequencing and/or array-CGH. | Structural variation | GPLv3 | ||||
| GATK | The Genome Analysis Toolkit (GATK) is a structured programming framework designed to enable rapid development of efficient and robust analysis tools for next-generation DNA sequencers. The GATK solves the data management challenge by separating data access patterns from analysis algorithms, using the functional programming philosophy of Map/Reduce | SNP discovery | MapReduce Programming Library Localized reassembly/realignment |
Java Python |
|||
| GBrowse | Genome Viewer | Visualization | Genome Viewer | Perl | Open Source | Linux Mac OS X Windows |
|
| GeeFu | Database tool for genomic assembly and feature data | Genomics | Assembly | Ruby | |||
| GEM | GEM is a Java software tool to analyze transcription factor binding ChIP-Seq/ChIP-exo data. It predicts binding events, performs de novo motif discovery and uses the motif to improve the binding event calling. It calls binding events right at (or very close to) the motif positions, deconvolves closely spaced homotypic binding events and accurately discovers binding motifs. | ChIP-Seq Analysis Transcription Factor Binding Site identification Regulatory genomics epigenomics Genomics |
Peak calling Peak finding Peak detection Motif discovery K-mer analysis |
probabilistic mixture model motif prior multi-threading |
Java | Commercial Freeware |
Cross-Platform |
| GEM library | A set of very optimized tools for indexing/querying huge genomes/files. Provided so far: a very fast exact mapper, and an unconstrained split-mapper | Mapping Programming Library Colorspace |
C Python OCaml |
GPLv3 | |||
| GENE-Counter | GENE-counter is a computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression | RNA-Seq | Linux Mac OS X |
||||
| Geneious | Search, organize and analyze genomic and protein information of any size via desktop program that provides publication ready images to enhance the impact of your research. | Phylogenetics Sequence analysis De-novo assembly Whole Genome Resequencing Alignment Systems biology Comparative genomics SNP discovery InDel discovery Transcription Factor analysis Genomics Population genetics Homology Metagenomics Read alignment Structural variation RNA-Seq Motif analysis |
Alignment Assembly Assembly validation Annotation Multiple sequence alignment viewer Motif analysis Genetic variation annotation Basecaller Genome browser Sample Barcoding Database Mapping Visualization |
Java | Commercial Freeware |
Windows Mac OS X Linux Solaris |
|
| GeneProf | GeneProf is a web-based, graphical software suite and database resource for high-throughput-sequencing experiments (RNA-seq and ChIP-seq). | Functional Genomics RNA-Seq ChIP-Seq |
Workflow Quality Control Alignment Visualization Peak finding Differentially expressed gene identification |
User-friendly wizards tutorials examples very flexible reproducible transparent extensible API |
Java Javascript |
Commercial Freeware |
browser based |
| GeneTalk | GeneTalk, a web-based platform, can filter, reduce and prioritize human sequence variants from NGS data and assist in the time consuming and costly interpretation of personal variants in clinical context. It serves as an expert exchange platform for clinicians and scientists who are searching for information about specific sequence variants and connects them to share and exchange expertise on variants that are potentially disease-relevant. | Genetic variation annotation Sequence variation analysis Variant Calling Structural variation discovery Filtering Annotation Database Exome analysis Sequence analysis Variant Classification Viewer |
Easy-to-use point-and-click web interface data visualization data filtering Fast SNP annotation SNP calling Variant annotation and analysis variant counting |
Ruby Javascript |
Freemium | ||
| Genomatix Mining Station (GMS) | The Genomatix Mining Station (GMS) offers mapping of NGS reads onto genomes, transcriptomes and splice-junction libraries. It is a client-server based solution and can be controlled through an intuitive GUI or via command-line. It covers different tasks such as genomic positioning, SNP detection, splice analyses and genomic enrichments. | RNA-Seq SNP discovery ChIP-Seq |
Assembly Mapping SNP calling Genomic correlations |
Client-server based system allows for command-line and web-based access. Grid engine is used for job scheduling and mapping is run on multiple cores. Can be combined with a Genomatix Genome Analyzer (GGA) for a fully integrated NGS solution. | C++ Java Flash |
Commercial | Windows Mac OS X Linux |
| Genome Trax | Genome Trax™ enables you to identify human genome variations of functional significance by mapping your NGS data to known elements such as disease mutations and regulatory sites. | Structural variation Mutations Regulatory sites |
Commercial | ||||
| GenomeBrowse | A free genome browser for exploring sequencing pile-up and coverage data with numerous annotation tracks hosted on the cloud. | Sequence analysis DNA-Seq Alignment De novo sequencing Exome analysis Exome and whole genome variant detection Genetics Whole Genome Resequencing Next Generation Sequencing Genomics |
Alignment viewer Assembly visualization Visualization |
Windows Linux Mac OS X |
|||
| Genomedata | Genomedata is a format for efficient storage of multiple tracks of numeric data anchored to a genome. The format allows fast random access to hundreds of gigabytes of data, while retaining a small disk space footprint. | Storage | Signal | Python C |
GPL | Linux Mac OS X |
|
| GenomeJack | GenomeJack is a genome browser specialized in next-generation sequencing data. Advantages are intuitive interface and smooth drag'n drop response. | Genomics Personal genomics |
Visualization | Java | Freeware | Windows Mac OS X Linux |
|
| GenomeMapper | GenomeMapper is a short read mapping tool designed for accurate read alignments. It quickly aligns millions of reads either with ungapped or gapped alignments. It can be used to align against multiple genomes simultaneously or against a single reference. | Alignment Mapping |
|||||
| Genometa | Genometa is a Java based local bioinformatics program which allows rapid analysis of metagenomic short read datasets. Millions of short reads can be accurately analysed within minutes and visualised in the browser component. A large database of diverse bacteria and archaea has been constructed as a reference sequence. | Metagenomics Genomics |
Mapping Visualization |
mapping Data Visualisation |
Java | Linux Mac OS X Windows |
|
| GenomeTools | The GenomeTools genome analysis system is a free collection of bioinformatics tools for genome informatics. | Genomics | Integrated solution | C | BSD | POSIX Linux Mac OS X OpenBSD Windows (Cygwin) UNIX |
|
| GenomeView | GenomeView is a next-generation stand-alone genome browser and editor initiated in the BSB group at VIB and currently developed at Broad Institute. It provides interactive visualization of sequences, annotation, multiple alignments, syntenic mappings, short read alignments and more. Many standard file formats are supported and new functionality can be added using a plugin system. | Genomics Comparative genomics Comparative transcriptomics Transcriptomics Gene annotation retrieval Quality Control Sequencing Sequence analysis |
Visualization Alignment viewer Multiple sequence alignment viewer Viewer Genome browser |
Visualization of a multitude of genomics data | Java | GPL | platform-independent |
| GenomicTools | GenomicTools is a flexible computational platform for the analysis and manipulation of high-throughput sequencing data such as RNA-seq and ChIP-seq. A variety of mathematical operations between sets of genomic regions is implemented thereby enabling the prototyping of computational pipelines that can address a wide spectrum of tasks from preprocessing and quality control to meta-analyses. More specifically, the user can easily create average read profiles across transcriptional start sites or enhancer sites, quickly prototype customized peak discovery methods for ChIP-seq experiments, perform genome-wide statistical tests such as enrichment analyses, design controls via appropriate randomization schemes, among other applications. | Genomics ChIP-Seq RNA-Seq |
Genomic overlaps Peak detection Profiles Heatmaps |
create custom pipelines feature overlaps identify binding site peaks in ChIP-seq data create read profiles create read heatmaps |
C C++ |
GPL 2 | |
| GenoMiner | A proprietary NGS analysis solution. Powerful hardware comes with preinstalled software, organized in workflows. | Reference assembly De-novo assembly ChIP-Seq RNA-Seq |
Assembly Viewer Error correction Mutation detection Peak detection Expression profiling Sequence alignment |
GenoMiner provides workflows for reference assembly, de novo assembly, ChIP-Seq, RNA-Seq and more. You upload your files at the beginning and get the results at the end, choosing from various tools to use for analysis. |
Java | Commercial | Linux |
| GenoViewer | A feature rich NGS assembly viewer/browser. | Viewer | large file loading multicontig handling SNP/InDel/Read Error display and search mutation table generation and export consensus sequence generation and export |
Java | Freeware | platform-independent | |
| GensearchNGS | A user friendly framework for re-sequencing in a diagnostics context: searching for mutations/variants, especially on well known genes. | Targeted resequencing | Alignment Alignment viewer Read Alignment Variant Prioritization Mutation detection Database Database submission preparation |
Plugin framework Cafe Variome submission |
Java | Commercial | UNIX Windows |
| GenVision | GenVision is a genomic visualization software package that is fully integrated with Lasergene and is designed to support easy generation of publication quality graphics and maps. | Genomics | Visualization | Commercial | Windows Mac OS X |
||
| Geoseq | Instead of mapping the reads to reference genomes or sequences, Geoseq maps a reference sequence against the sequencing data. It is web-based, and holds pre-computed data from public libraries. | Resequencing | Mapping | ||||
| GigaBayes | A short-read SNP and short-INDEL discovery program. | Genomics SNP discovery |
SNP calling | ||||
| GimmeMotifs | GimmeMotifs is a de novo motif prediction pipeline, especially suited for ChIP-seq datasets. It incorporates several existing motif prediction algorithms in an ensemble method to predict motifs and clusters these motifs using the WIC similarity scoring metric. | Transcription regulation ChIP-Seq Epigenomics |
Motif analysis | Python | MIT | Linux | |
| Girafe | The R/Bioconductor package girafe facilitates the functional exploration of alignments of sequence reads from next-generation sequencing data to a genome. It allows users to investigate the genomic intervals together with the aligned reads and to work with, visualise and export these intervals. | Alignment | R | ||||
| Gk arrays | Gk-arrays are a data structure to index the k-mers in a collection of reads. | Genomics Transcriptomics Metagenomics |
Assembly Error correction Mapping |
programming library | C++ | CeCILL-C license | Linux Linux 64 Mac OS X any |
| GMAP | GMAP (Genomic Mapping and Alignment Program) for mRNA and EST Sequences. | Alignment Mapping |
C Bourne shell |
UNIX | |||
| Gnumap | The Genomic Next-generation Universal MAPper (gnumap) is a program designed to accurately map sequence data obtained from next-generation sequencing machines (specifically that of Solexa/Illumina) back to a genome of any size. Currently, gnumap is designed to be used with the _int.txt data received from the Solexa/Illumina machine. | Mapping | C++ | ||||
| Goby framework | Goby is a next-gen data management framework designed to facilitate the implementation of efficient next-gen data analysis pipelines. | RNA-Seq | Programming Library Data compression |
Java | GPLv2 | ||
| Golden Helix | Golden Helix is a bioinformatic software provider and analytic service provider. The core of its business is about empowering scientists to discover more, discover it easier, and to come away with valid and reproducible bioinformatics results. The software, SNP & Variation Suite, is a stable platform for clever data manipulations, robust quality assurance, advanced statistical modeling, and compelling visual results in a genome browser environment of DNA Seq, Copy Number variation, SNP Chip, and RNA Seq data. | Epigenomics Genomics DNA-Seq SNP discovery Whole Genome Resequencing Analysis Copy number estimation Quality Control |
Quality Control Statistics Statistical testing Genome browser Annotation Filtering Collapsing Methods Variant Classification Variant Mapping |
Windows Linux Mac OS X |
|||
| Goseq | An R package to detect Gene Ontology (GO) categories and other categories of genes (such as KEGG pathways) that are over/under represented in RNA-seq data. | RNA-Seq Quantitation | Gene Set Testing | R | LGPL | UNIX Windows |
|
| Gowinda | Gowinda: unbiased analysis of gene set enrichment for Genome Wide Association Studies | Genomics Genome Wide Association Studies Population genetics Population Genomics High-throughput sequencing |
Gene set enrichment Gene ontology Genome wide association studies |
Multicore | Java | Mozilla Public License | Mac OS X Linux Windows |
| GPS | GPS is a high spatial resolution peak detection algorithm for ChIP-Seq data. | Genomics ChIP-Seq Transcription Factor Binding Site identification Regulatory genomics epigenomics |
Protein Binding Peak Detection | multi-threading | Java | Commercial Freeware |
Cross-Platform |
| GPSeq | Analyze RNA-seq data to estimate gene and exon expression, identify differentially expressed genes, and differentially spliced exons | RNA-Seq Quantitation | R C |
||||
| GRS | Reference-based data compression for storage of resequencing data | Data compression | sequence compression | C Bourne shell |
Commercial Freeware |
Linux Linux 64 |
|
| GSNAP | GSNAP can align both single-end and paired-end reads as short as 14 nt and of arbitrarily long length. It can detect short- and long-distance splicing, including interchromosomal splicing, in individual reads using probabilistic models or a database of known splice sites. Our program also permits SNP-tolerant alignment to a reference space of all possible combinations of major and minor alleles, and can align reads from bisulfite treated DNA for the study of methylation state. | RNA-Seq Alignment DNA methylation |
Mapping Bisulfite mapping |
C Perl |
|||
| Hairpin Annotation | Generates a secondary structure from an RNA sequence and highlights regions of interest using RNAplot | General bioinformatics (pipeline) | Java | Custom Licence | Linux 64 Windows Mac OS X |
||
| Haplowser | Haplowser: comparative haplotype browser for personal genome and metagenome | Visualization Haplotype reconstruction |
Java | GPL | |||
| HawkEye | An interactive visual analytics tool for genome assemblies. | Assembly visualization | Assembly visualization | C++ | Artistic License | Linux Mac OS X |
|
| HeliSphere | Open-source LINUX software package intended for use in analyzing data produced by the HeliScope Single Molecule Sequencer. | Genomics Whole Genome Resequencing RNA-Seq SNP discovery |
Mapping | Freeware | Linux | ||
| HI | Program for haplotype reconstruction from paired-end reads. | Haplotype reconstruction | Java | ||||
| Hicup | A mapping pipeline for HiC interaction data. Performs independent mapping on each end of the interaction pair and removes commonly found artefacts. | Epigenomics | Mapping | Perl | GPLv3 | UNIX Linux Mac OS X |
|
| HiTEC | An algorithm which provides a highly accurate, robust, and fully automated method to correct reads produced by high-throughput sequencing methods. | Error correction | C++ | GPLv3 | Linux | ||
| HMMSplicer | Splice junction discovery in RNA-Seq data | RNA-Seq Alignment | Python | ||||
| HPeak | Hidden Markov model (HMM)-based Peak-finding algorithm for analyzing ChIP-Seq data to identify protein-interacting genomic regions. | ChIP-Seq | Hidden Markov Model | ||||
| HTSeq | Python framework to process and analyse high-throughput sequencing (HTS) data | Programming Library | Python | GPLv3 | |||
| Hybrid-SHREC | Improves sequence data quality using information from multiple platforms. | Error correction | Java | ||||
| IBD2 | Our algorithm uses a non-homogeneous hidden Markov model (HMM) that employs local recombination rates to identify chromosomal regions that are identical by descent (IBD=2) in children of consanguineous or non-consanguineous parents solely based on genotype data of siblings derived from high-throughput sequencing platforms. | Targeted resequencing | R Java |
||||
| Ibis | Ibis (Improved base identification system) is an accurate, fast and easy-to-use base caller for the Illumina sequencing system, which significantly reduces the error rate and increases the output of usable reads. Ibis is faster and makes fewer assumptions about chemistry and technology. | Sequencing | Basecaller | Statistical learning of base calling parameters and calibrated quality scoring | Python C C++ |
Non-commercial | Linux Windows (Cygwin) |
| ICORN | Iteratively aligns deep coverage of short sequencing reads to correct errors in reference genome sequences and evaluate their accuracy. | Assembly Sequencing Quality Control |
|||||
| IDBA | IDBA (Iterative De Bruijn graph short read Assembler) is a short read assembler based on an iterative De Bruijn graph. It is developed under 64-bit Linux, but should be suitable for all Unix-like systems. | De-novo assembly | Assembly | POSIX Linux Linux 64 |
|||
| IGV | The Integrative Genomics Viewer (IGV) is a high-performance visualization tool for interactive exploration of large, integrated datasets. It supports a wide variety of data types and formats, including short-read alignments in the SAM/BAM format. Data can be viewed from local files or over the web via HTTP. | Genomics | Visualization | Java | LGPL | Windows Mac OS X Linux |
|
| Illuminator | Software for machines running Windows to identify variants in Illumina short read data. | SNP discovery InDel discovery |
|||||
| Inchworm | Employs the Kmer graph method to reconstruct (in many cases full-length) transcripts from Illumina RNA-Seq (preferably strand-specific) reads. | RNA-Seq De novo transcriptome assembly |
|||||
| InGAP | inGAP is an integrated platform for next-generation sequencing projects, the core function of which is to detect SNPs and indels using a Bayesian algorithm. | SNP discovery | Mapping Assembly visualization |
||||
| Ingenuity Variant Analysis | Ingenuity Variant Analysis is a web application that helps researchers studying human disease to identify causal variants from human resequencing data in just minutes. Ingenuity Variant Analysis combines analytical tools and integrated content to help you rapidly identify and prioritize variants by drilling down to a small, targeted subset of compelling variants based both upon published biological evidence and your own knowledge of disease biology. With Variant Analysis, you can interrogate your variants from multiple biological perspectives, explore different biological hypotheses, and identify the most promising variants for follow-up. | Genomics Genetics Next Generation Sequencing Exome and Whole genome variant detection Whole Genome Resequencing Analysis Exome analysis Causal Variant Detection Targeted Sequencing |
Biological Interpretation and Analysis of DNA Sequence Data | ||||
| Integrated Genome Browser | Visualization software for next-generation genomics | Genomics | Visualization | Java | Open Source | platform-independent | |
| IOmics | iOmics is a cloud based workflow analysis framework for managing, analyzing and visualizing NGS data. | Genomics Transcriptomics Epigenomics RNA-Seq Exome and Whole genome variant detection |
Genome Alignment Assembly Ab-initio gene prediction Genetic variation annotation Exome analysis ChIP seq MiRNA analysis (Ref and Ab-initio) |
Commercial | cloud | ||
| IQSeq | Integrated Isoform Quantification Analysis based on A Partial Sampling Framework | RNA-Seq Quantitation Alternative Splicing |
C++ | ||||
| Isas | Fast aligner for color and base space short read data. | Alignment Colorspace |
Linux | ||||
| IsoEM | Expectation maximization algorithm for estimating alternative splicing isoform frequencies | Alternative Splicing | Expectation Maximization | Java | |||
| ISSAKE | Short Sequence Assembly by K-mer search and 3' read Extension, Immunology version (iSSAKE) | Metagenomics | Assembly | Perl Python |
GPLv2 | ||
| JBrowse | Slick, speedy genome browser with a responsive and dynamic AJAX interface for visualization of genome data. Being developed by the GMOD project as a successor to GBrowse. | Visualization | Perl Javascript |
Open Source | browser based | ||
| Jellyfish | Fast, memory-efficient k-mer counting algorithm | C++ | GPLv3 | Linux 64 Mac OS X |
|||
| JointSLM | Copy number estimation from read depth information | Copy number estimation | R | ||||
| KARMA | K-tuple Alignment with Rapid Matching Algorithm | Bisulfite Sequencing | Mapping | ||||
| Kismeth | Web-based tool for bisulfite sequencing analysis | DNA methylation Epigenomics |
Bisulfite mapping | ||||
| Kissnp | kisSnp compares two sets of NGS raw reads, detecting Single Nucleotide Polymorphisms occurring between the two sets. The two sets typically come from the sequencing of two individuals from the same species or from closely related species. | Comparative genomics Comparative transcriptomics Gene annotation retrieval SNP discovery InDel discovery |
Micro assembly De Bruijn graph |
SNP calling | C | CeCILL | Linux |
| KNIME | Software for organizing bioinformatic workflows | Workflow | GPLv3 | Windows Mac OS X Linux |
|||
| Knime4Bio | custom nodes for the interpretation of Next Generation Sequencing data with KNIME. | Genomics Gene annotation retrieval Mutations and regulatory sites |
KNIME | Java | GPLv3 | any | |
| Krona | Krona creates interactive HTML5 charts of hierarchical data (such as taxonomic abundance in a metagenome). | Metagenomics | Visualization | Interactive Animation HTML5 canvas graphics |
Javascript Perl |
Linux UNIX Mac OS X |
|
| Lab7 | Data workflow management platform to streamline NGS analyses | Genomics | Workflow | Python Javascript |
Commercial | Mac OS X Linux |
|
| Lasergene | Lasergene is a comprehensive DNA and protein sequence analysis software suite comprised of seven applications which include functions ranging from sequence assembly and SNP detection, to automated virtual cloning and primer design. | Alignment De novo sequencing De-novo assembly Genomics InDel discovery Integrated solution Mapping Phylogenetics Protein structure analysis Read alignment SNP discovery Sequence analysis Transcription Factor Binding Site identification |
Alignment Alignment Analysis Annotation Assembly Chromatogram viewer Colorspace Sequence analysis Integrated Solution Mapping PCR Primer Design Paired End Scaffolding |
Commercial | Windows Mac OS X |
||
| LAST | Short read alignment program incorporating quality scores | Genomics Comparative genomics |
Alignment | C++ | GPLv3 | ||
| LASTZ | A tool for (1) aligning two DNA sequences, and (2) inferring appropriate scoring parameters automatically | Genomics | Mapping Alignment |
Mac OS X Linux |
|||
| LobSTR | lobSTR is an alignment and genotyping tool for profiling short tandem repeats from next generation sequencing data | Sequencing | Profiling short tandem repeats from short reads | Fast Scalable sequence alignment Gapped alignment |
C++ R Python |
Freeware | UNIX |
| LOCAS | LOCAS low-coverage short-read assembler | Assembly | C++ | Linux | |||
| LookSeq | AJAX-based browser for deep sequencing data | Assembly visualization | |||||
| MACS | Model-based Analysis of ChIP-Seq data. | ChIP-Seq | Peak calling | Python | Artistic License | platform-independent | |
| MagicViewer | Large-scale short reads and sequencing depth visualization. | De novo sequencing Targeted resequencing |
Visualization Genetic variation annotation |
Java | platform-independent | ||
| MapDamage | Identifies and quantifies DNA damage patterns in ancient DNA | Ancient DNA | Quality Control Statistical Modelling |
Python/R | Linux Mac OS X |
||
| MapNext | MapNext provides four main analyses: (i) unspliced alignment and clustering of reads, (ii) spliced alignment of transcriptomic reads, (iii) SNP detection and calculation of SNP frequency from population sequences, and (iv) storage of result data in a database to make it available for more flexible queries and further analyses. | SNP discovery RNA-Seq Alignment |
Alignment | C++ Perl |
|||
| Mapsembler | Mapsembler is a targeted assembly software. It takes as input a set of NGS raw reads and a set of input sequences (starters). It first determines if each starter is read-coherent, i.e. whether reads confirm the presence of each starter in the original sequence. Then for each read-coherent starter, Mapsembler outputs its sequence neighborhood as a linear sequence or as a graph, depending on the user's choice. | Metagenomics Transcriptomics DNA-Seq RNA-Seq Quantitation Targeted assembly |
Assembly Micro assembly Mapping |
De novo assembly Identify Novel Exons Remove contaminants Detect enzymes in metagenomics NGS reads |
C | CeCILL | Linux |
| MapSplice | We introduce a second generation splice detection algorithm, MapSplice, whose focus is high sensitivity and specificity in the detection of splices as well as CPU and memory efficiency. MapSplice can be applied to both short (<75 bp) and long reads (≥75 bp). MapSplice is not dependent on splice site features or intron length; consequently, it can detect novel canonical as well as non-canonical splices. MapSplice leverages the quality and diversity of read alignments of a given splice to increase accuracy. | RNA-Seq Alignment | Mapping | C++ Python |
Linux | ||
| MapView | Visualization of short reads alignment on desktop computer | Visualization | Linux Windows |
||||
| MAQ | Mapping and Assembly with Qualities (renamed from MAPASS2). Particularly designed for Illumina-Solexa 1G Genetic Analyzer, and has preliminary functions to handle ABI SOLiD data. | Genomics SNP discovery |
Mapping | C++ Perl |
GPL | ||
| MAQGene | Complete pipeline for mutant discovery, with web front end | SNP discovery | Mapping Integrated Solution |
||||
| MARGARITA | SNP discovery and genotyping from low-coverage sequencing data | SNP discovery Genotyping |
|||||
| Mason | A fast, feature-rich and hackable read simulator for the simulation of NGS and Sanger data. | Genomics | Simulation Assembly Mapping |
Empirical or simple model for position-dependent errors; can write out sample position and extensive information about the sampled infix; haplotype simulation through mutation of reference sequence |
C++ | GPLv3 | UNIX Windows |
| Mauve | Mauve Genome Alignment software, for comparing two or more draft or finished genomes | Comparative genomics Genomics Transcriptomics |
Genome Alignment Alignment Visualization Assembly QC |
C++ Java |
GPL | Mac OS X Windows Linux |
|
| MAXIMUS | Hybrid reference and de novo assembly pipeline | Genomics | Hybrid assembly | ||||
| MAYDAY | Extensible platform for visual data exploration and interactive analysis and provides many methods for dissecting complex transcriptome datasets. | RNA-Seq | Visualization | ||||
| MEGAN | Metagenome Analysis Software - MEGAN ("MEtaGenome ANalyzer") is a new computer program that allows laptop analysis of large metagenomic datasets. In a preprocessing step, the set of DNA reads (or contigs) is compared against databases of known sequences using BLAST or another comparison tool. MEGAN can then be used to compute and interactively explore the taxonomical content of the dataset, employing the NCBI taxonomy to summarize and order the results. | Metagenomics | metagenomic analysis functional classification |
Commercial Freeware |
|||
| Megraft | Megraft is a software tool to graft ribosomal small subunit (16S/18S) fragments from metagenomes onto full-length SSU sequences, enabling accurate diversity estimates from fragmentary and non-overlapping sequence data. | Metagenomics Phylogenetics Sequence analysis Community analysis Rarefaction |
Hidden Markov Model Sequence analysis |
Perl | GPLv3 | Linux UNIX Mac OS X |
|
| Meraculous | De novo genome assembler from short reads | Assembly | De novo assembly scaffolding |
Perl C |
|||
| METAGENassist | User-friendly, web-based analytical pipeline for comparative metagenomic studies. Input can be derived from either 16S rRNA data or NextGen shotgun sequencing. | Metagenomics | Visualization Statistics Clustering Machine Learning |
Easy-to-use point-and-click web interface; data visualization; publication-quality graphs and charts; wide variety of statistical methods; taxon-to-phenotype mapping; data filtering and normalization; supports many common input formats | |||
| MetaSim | The software can be used to generate collections of synthetic reads. | Metagenomics Genomics |
Simulation Assembly Mapping |
Java | Commercial Freeware |
||
| Metaxa | Metaxa uses Hidden Markov Models to identify, extract and classify small-subunit (SSU) rRNA sequences (12S/16S/18S) of bacterial, archaeal, eukaryotic, chloroplast and mitochondrial origin in metagenomes and other large sequence sets | Metagenomics Phylogenetics Sequence analysis Community analysis |
Hidden Markov Model Sequence analysis |
Perl | GPLv3 | Linux UNIX Mac OS X |
|
| MethMarker | MethMarker facilitates the design of DNA methylation assays for COBRA, bisulfite SNuPE, bisulfite pyrosequencing, MethyLight and MSP. It also implements a systematic workflow for design, optimization and (computational) validation of DNA methylation biomarkers. This workflow starts from a preselected differentially methylated region (DMR) and results in an optimized DNA methylation assay that is ready to be tested in a large-scale clinical trial. | Epigenomics DNA methylation |
Java | Windows Linux Mac OS X Solaris |
|||
| MethylCoder | Pipeline for fast, simple processing of BiSulfite-treated reads into methylation data. Includes scripts for analysis and visualization. In addition to a binary output, the direct output of methylcoder is a text file that indicates per-nucleotide methylation context (CG/CHG/CHH) and methylation levels (both coverage and C-T conversions) | Genomics Sequencing DNA methylation Epigenomics |
Mapping Bisulfite mapping |
Python C |
BSD | Linux Linux 64 Mac OS X |
|
| MetMap | Produces corrected site-specific methylation states from MethylSeq experiments and annotates unmethylated islands across the genome. | DNA methylation | |||||
| MeV | Visualization of genomic data, Differential Gene Expression based on DEGseq, DESeq and edgeR | RNA-Seq | Clustering Visualization Classification Differentially expressed gene identification |
Artistic License | |||
| MG-RAST | MG-RAST is a fully-automated service for annotating metagenome samples. | Metagenomics Phylogenetics Metabolic reconstruction |
Annotation | ||||
| MicroRazerS | MicroRazerS is a tool optimized for mapping short RNAs onto a reference genome. | Mapping | C++ | Linux | |||
| Microsoft Biology Foundation | C#/.NET library for biological applications. | Programming Library | C# | ||||
| MICSA | Combines positional information with information on motif occurrences to better predict binding sites of transcription factors (TFs) | ChIP-Seq | Motif analysis | ||||
| Minia | De novo assembly of human genomes on a desktop computer | De novo assembly | Assembly | Memory efficient and fast | C++ | CeCILL | Linux Mac OS X |
| MIP Scaffolder | MIP Scaffolder is a program for scaffolding contigs produced by fragment assemblers using mate pair data. | Scaffolding | C++ Perl |
Linux | |||
| MIRA | MIRA 3 - Whole Genome Shotgun and EST Sequence Assembler | De-novo assembly SNP discovery RNA-Seq Alignment |
Smith-Waterman Graph reduction Learning algorithm Assembly Mapping K-mer analysis |
C++ | GPL | Linux Mac OS X UNIX |
|
| MiRanalyzer | Web-server for identifying and analyzing miRNA in next-gen sequencing experiments | MiRNA | Annotation of micro RNA differential expression |
Java Perl |
browser based | ||
| MiRCat | Predicts mature miRNAs and their precursors from an sRNA dataset and a genome. | General bioinformatics (pipeline) | MiRNA Prediction | Detection and prediction of known or novel miRNAs secondary structure generation |
Java | Custom Licence | Linux 64 Windows Mac OS X |
| MiRDeep | Discovering known and novel miRNAs from deep sequencing data | MiRNA | Perl | ||||
| MiRNAkey | A software pipeline for the analysis of microRNA Deep Sequencing data | MiRNA | Java Perl |
Linux Mac OS X |
|||
| MiRProf | Determines normalised expression levels of sRNAs matching known miRNAs in miRBase. | General bioinformatics (pipeline) | MiRNA profiling | Java | Custom Licence | Linux 64 Windows Mac OS X |
|
| MirTools | Web server for microRNA profiling and discovery based on high-throughput sequencing | Small RNA transcriptome MiRNA |
Perl PHP |
||||
| MISO | An alternative to Cufflinks, MISO (Mixture-of-Isoforms) is a probabilistic framework that quantitates the expression level of alternatively spliced genes. | RNA-Seq Quantitation RNA-Seq |
|||||
| Mlgt | Processing and analysis of high throughput, long-read (e.g. Roche 454) sequences generated from multiple loci and multiple biological samples. Sequences are assigned to their locus and sample of origin, aligned and trimmed. Where possible, genotypes are called and variants mapped to known alleles. | Genotyping Targeted resequencing Resequencing |
Sequence analysis Error correction Filtering Sample Barcoding Pooled samples Read Alignment |
Sequence assignment sequence alignment alignment error correction variant counting genotype calling allele-matching |
R | GPL >=2 | Windows UNIX Mac OS X |
| MMSEQ | Pipeline and methodology for simultaneously estimating isoform expression and allelic imbalance in diploid organisms using RNA-seq data. | Allele-specific transcription | C++ | Mac OS X Linux 64 |
|||
| MochiView | Hybrid genome browser and motif visualization/analysis/management desktop software. | Genomics ChIP-Seq ChIP-on-chip RNA-Seq Motif analysis |
Genome browser Motif analysis |
Desktop hybrid genome browser and motif visualization/analysis software | Java | Linux Mac OS X Windows |
|
| MoDIL | Program to detect small indels in next generation sequencing data | Genomics InDel discovery |
Python | ||||
| MOM | Short-read mapping | Genomics | Mapping | ||||
| MOSAIK | Reference guided aligner/assembler. | Assembly Colorspace |
C++ | Commercial GPLv2 |
Windows Linux Mac OS X |
||
| MPscan | MPscan (multi-pattern scan) is a program for mapping short reads (<30bp) exactly on a set of reference sequences (eg, a genome) without indexing the reference. MPscan performs only exact mapping (no substitution, nor indels), is fast (optimal complexity), and easy to use. | Genomics Transcriptomics |
Mapping | C++ | Linux Mac OS X |
||
| MrCaNaVaR | mrCaNaVaR is a copy number caller that analyzes the next-generation sequence mapping read depth to discover large segmental duplications and deletions. It also has the capability of predicting absolute copy numbers of genomic intervals. | Genomics Personal genomics Copy number estimation |
Read depth analysis | C | Commercial Freeware |
POSIX | |
| MrFAST | mrFAST is designed to map short reads generated with the Illumina platform to reference genome assemblies in a fast and memory-efficient manner. | Genomics | Read Alignment Mapping |
C | BSD | UNIX | |
| MrsFAST | mrsFAST is a micro-read substitution-only Fast Alignment Search Tool. mrsFAST is a cache-oblivious short read mapper that optimizes cache usage to get higher performance. | Genomics | Read Alignment Mapping |
C | BSD | UNIX | |
| MTR | Metagenomics software for clustering at multiple ranks. | Metagenomics | C++ Matlab |
||||
| MU2A | Genomic variant annotation tool | SNP Annotation | Java | Apache License 2.0 | Windows Linux Mac OS X |
||
| MUMmer | MUMmer is a modular system for the rapid whole genome alignment of finished or draft sequence. Basically, it performs ultra-fast alignment of large-scale DNA and protein sequences. | Genomics Transcriptomics |
Alignment | Artistic License | Linux | ||
| MUMmerGPU | MUMmerGPU is a low cost, ultra-fast sequence alignment program designed to handle the increasing volume of data produced by HTS. | Genomics Transcriptomics |
Alignment GPU |
||||
| MuMRescueLite | Probabilistically reincorporates multi-mapping tags into mapped short read data. | Genomics ChIP-Seq |
Mapping | Python | MIT | ||
| MuSICA 2 | Assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ~800 human full-length cDNA clones. | Clone verification | Assembly | Perl | |||
| Mutascope | Mutascope is a software suite designed to analyze data from high throughput sequencing of PCR amplicons, with an emphasis on normal-tumor comparison for the accurate and sensitive identification of low prevalence mutations. | Cancer biology | Somatic variant calling Analysis Pipeline |
Perl | UNIX | ||
| MuTect | MuTect is a method developed at the Broad Institute for the reliable and accurate identification of somatic point mutations in next generation sequencing data of cancer genomes. | SNP calling | |||||
| Myrialign | Software to align short reads produced by a short read genome sequencer to a reference genome. Alignments can contain any number of SNPs, insertions and deletions, up to a user specified cutoff. Myrialign can use a Cell Broadband Engine processor to accelerate alignments if available, for example on a PlayStation 3 running GNU/Linux.
Myrialign performs brute force alignment using a variant on the "bitap" algorithm that aligns several thousand reads to a reference in parallel. It uses bit-parallelism, multiple processors, and Cell SPUs if available. Unlike other reference genome alignment software, heuristics and hashtable lookups are not used. Myrialign will find alignments with any number of errors up to a user specified cutoff. The emphasis is on doing a 100% accurate search as fast as is possible. |
Mapping Alignment GPU |
|||||
| Myrna | Myrna is a cloud computing tool for calculating differential gene expression in large RNA-seq datasets. | RNA-Seq Quantitation RNA-Seq Alignment |
Hadoop MapReduce |
||||
| Mzip | Reference-based sequence data compression tool | Data compression | |||||
| Nesoni | Nesoni is a high-throughput sequencing data analysis toolset. | RNA-Seq Alignment SNP discovery Phylogenetics |
Alignment | largely for bacterial genomes | Python | ||
| Newbler | The assembly/mapping program developed by 454 Life Sciences for 454 data | De-novo assembly | Assembly Mapping |
C++ | Unknown | Linux 64 | |
| Nexalign | Nexalign is a program to align millions of short reads from next-generation sequencing data sets to reference genomes | Mapping | C++ R |
GPL | UNIX | ||
| NextGen Utility Scripts | A collection of links to scripts available for working with data generated by new sequencing technologies. | A collection of many different scripts | |||||
| NextGENe | de novo and reference assembly of Roche/454, Illumina and SOLiD data. Uses a novel Condensation Assembly Tool approach where reads are joined via "anchors" into mini-contigs before assembly which reduces sequencing errors. Requires Win or MacOS. | De novo sequencing Metagenomics SNP discovery InDel discovery Targeted resequencing |
Unique condensation tool Data Visualisation very flexible |
C++ | Commercial | Windows | |
| Ngs backbone | ngs_backbone is a bioinformatic application created to work on sequence analysis by using NGS (Next Generation Sequencing) and Sanger sequences. It is capable of cleaning reads, performing de novo assembly or mapping against a reference, and annotating SNPs, SSRs, ORFs, GO terms and sequence descriptions. | SNP discovery Genomics |
Mapping Assembly |
AGPL | UNIX | ||
| NGS-DesignTools | Tools to assist in designing deep sequencing experiments for haplotype reconstruction and structural variant breakpoint detection | Structural variation RNA-Seq Quantitation |
Haplotype reconstruction Simulation |
||||
| Ngs-pipeline | Complete solution for human re-sequencing projects | Personal genomics Regulatory genomics epigenomics SNP discovery Structural variation discovery Regulatory element annotation InDel discovery |
Mapping | Perl | GPLv3 | Linux | |
| NGSUtils | NGSUtils is a suite of software tools for working with next-generation sequencing datasets | Genomics Transcriptomics |
Filtering QC Read pre-processing Variant Calling Format conversion |
Python | GPL | Linux Mac OS X |
|
| NGSView | High-throughput sequencing technologies introduce novel demands on tools available for data analysis. We have developed NGSView, a generally applicable, flexible and extensible next-generation sequence alignment editor. The software allows for visualization and manipulation of millions of sequences simultaneously on a desktop computer, through a graphical interface. NGSView is available under an open source license and can be extended through a well documented API. | Genomics | Assembly visualization | ||||
| NOISeq | Next Generation Sequencing (NGS) technologies are increasingly being used for gene expression profiling as a replacement for microarrays. The expression level given by these technologies is the number of reads in the library mapping to a given feature (gene, exon, transcript, etc.), i.e., the read counts. Most of the statistical methods for assessment of differential expression using count data rely on parametric assumptions about the distribution of the counts (Poisson, Negative Binomial, …). Moreover, many of them need replicates to work and tend to have problems evaluating differential expression in features with low counts.
NOISeq is a non-parametric approach for the identification of differentially expressed genes from count data. NOISeq empirically models the noise distribution of count changes by contrasting fold-change differences (M) and absolute expression differences (D) for all the features in samples within the same condition. This reference distribution is then used to assess whether the M-D values computed between two conditions for a given gene are likely to be part of the noise or represent true differential expression. There are two variants of the method: NOISeq-real uses replicates, when available, to compute the noise distribution, and NOISeq-sim simulates them in the absence of replication. It should be noted that the NOISeq-sim simulation procedure resembles technical replication and does not reproduce biological variability, which is necessary for population inferential analysis. |
Differential Expression | |||||
| NovelSeq | A computational framework to discover the content and location of long novel sequence insertions using paired-end sequencing data | Structural variation InDel discovery |
Mapping Assembly Variant Calling |
C | BSD | UNIX | |
| Novocraft | Novoalign is a program for mapping short reads from the Illumina/SOLiD sequencing platform(s) to a reference genome. | Genomics Whole Genome Resequencing RNA-Seq Alignment ChIP-Seq MiRNA |
Mapping | Bisulfite sequencing Mate-pair/jumping libraries parallel execution insertions/deletions SAM format output paired-end colourspace MPI |
C++ | Commercial Freeware |
Mac OS X Linux 64 |
| NPS | Identify nucleosome positions given histone-modification ChIP-seq or nucleosome sequencing at the nucleosome level. | Epigenomics ChIP-Seq |
Python | ||||
| NucleR | nucleR is an R/Bioconductor package for working with tiling arrays and next generation sequencing. It uses a novel approach in this field which comprises deep profile cleaning using the Fourier transform and peak scoring for quick and flexible nucleosome calling. | ChIP-on-chip ChIP-Seq Nucleosome Positioning Epigenomics |
Annotation Peak calling Protein Binding Peak Detection Peak detection Peak finding Programming Library |
Multicore Integrated solution |
R | LGPL3 | Cross-Platform |
| Oases | De novo transcriptome assembler for very short reads | De novo transcriptome assembly | supports strand specific and paired-end RNA-seq data sets | C | GPLv3 | ||
| OLego | OLego is a program specifically designed for de novo spliced mapping of mRNA-seq reads. OLego adopts a seeding and extension scheme, and does not rely on a separate external mapper. It achieves high sensitivity of junction detection by using very small seeds (12-14 nt), efficiently mapped using Burrows-Wheeler transform (BWT) and FM-index. This also makes it particularly sensitive for discovering small exons. It is implemented in C++ with full support of multiple threading, to allow fast processing of large-scale data. | Genomics RNA-Seq RNA-Seq Alignment |
Mapping Alignment |
capable of using very small seeds for splice mapping but still fast and accurate |
C++ | GPLv3 | Linux Linux 64 Mac OS X |
| Omixon Variant Toolkit | Omixon Target Standard, Target HLA and Target Pro are designed to help clinical, diagnostic and research labs to efficiently get the maximum accuracy and precision from their targeted NGS data. | Comparative genomics Mapping Sequence analysis Read alignment InDel discovery SNP discovery |
Alignment Assembly Mapping Colorspace Basespace |
easy to use parameters full documentation also a plugin available in CLCbio and Geneious |
Commercial Freeware |
interoperable | |
| Optimus Primer | Automated primer design for large-scale resequencing by second generation sequencing | Resequencing | PCR Primer Design | ||||
| PacBio conversion tools | Tools to convert from PacBio HDF5 format to other commonly used formats & libraries to read HDF5 from Java & R | Programming Library Conversion |
Java R Python |
||||
| PaCGeE | PaCGeE (Parallel Computational Genomics Engine) is a suite of HPC accelerated sequence data analysis tools for assembly and analysis. The tool set comprises many popular open source and proprietary software packages for high performance, high throughput and high quality data analysis. The PaCGeE family of parallel NGS analysis tools includes Cloud-MAQ, VELVET-P, EULER, ERANGE, BOWTIE, BFAST, MPI-BLAST, ChIP Seq Peak Finder, etc. | Mapping Hadoop |
Commercial | ||||
| PALMA | We present a novel approach based on large margin learning that combines accurate splice site predictions with common sequence alignment techniques. By solving a convex optimization problem, our algorithm -- called PALMA -- tunes the parameters of the model such that true alignments score higher than other alignments. We study the accuracy of alignments of mRNAs containing artificially generated micro-exons to genomic DNA. In a carefully designed experiment, we show that our algorithm accurately identifies the intron boundaries as well as boundaries of the optimal local alignment. It outperforms all other methods: for 5702 artificially shortened EST sequences from C. elegans and human it correctly identifies the intron boundaries in all except two cases. The best other method is a recently proposed method called exalin which misaligns 37 of the sequences. Our method also demonstrates robustness to mutations, insertions and deletions, retaining accuracy even at high noise levels. | RNA-Seq Alignment | Alignment | ||||
| PALMapper | Fast and Accurate Spliced Alignments of Sequence Reads. | Mapping | C++ | GPLv3 | |||
| PanGEA | Tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. | RNA-Seq Allele-specific transcription SNP discovery |
Mozilla Public License | ||||
| PARalyzer | Tool to analyze cross-linking and immunoprecipitation data (CLIP) | Java | Commercial Freeware |
||||
| Partek Genomics Suite | Easy to use software providing A to Z analysis for all Next Generation Sequencing and Microarray data. | Allele-specific transcription RNA-Seq Quantitation Epigenomics Functional Genomics ChIP-Seq Alternative Splicing SNP discovery Small RNA transcriptome |
|||||
| PASH | Pash 3.0 performs sequence comparison and read mapping and can be employed as a module within diverse configurable analysis pipelines, including ChIP-Seq and methylome mapping by whole-genome bisulfite sequencing | Epigenomics DNA methylation |
Alignment Bisulfite mapping |
||||
| PASS | PASS performs fast gapped and ungapped alignments of short DNA sequences onto a reference DNA, typically a genomic sequence. It is designed to handle a huge number of reads such as those generated by Solexa, SOLiD or 454 technologies. The algorithm is based on a data structure that holds in RAM the index of the genomic positions of "seed" words (typically 11-12 bases) as well as an index of the precomputed scores of short words (typically 7-8 bases) aligned against each other. | Alignment | C++ | Linux Windows |
|||
| Patchwork | Patchwork is a bioinformatic tool for analyzing and visualizing allele-specific copy numbers and loss-of-heterozygosity in cancer genomes. The data input is in the format of whole-genome sequencing data which enables characterization of genomic alterations ranging in size from point mutations to entire chromosomes.
High quality results are obtained even if samples have low coverage, ~4x, low tumor cell content or are aneuploid. Patchwork is available in two formats. The first, named simply patchwork, takes BAM files as input whereas patchworkCG takes input from CompleteGenomics files. Detailed guides and information regarding these can be found in their respective tabs. |
Copy number estimation (Allele-specific!) |
Structural variation discovery | Allele specific copy numbers. | R | Linux Mac OS X |
|
| PatMaN | Patman searches for short patterns in large DNA databases, allowing for approximate matches. It is optimized for searching for many small patterns at the same time, for example microarray probes. | Mapping | |||||
| PE-Assembler | A simple 3' extension approach to assembling paired-end reads and capable of parallelization | De-novo assembly | Scaffolding | C++ | |||
| PeakAnalyzer | PeakAnalyzer is a set of applications for processing ChIP signal peaks. | Functional Genomics | ChIP-Seq analysis | Java C++ R |
|||
| PeakRanger | A multi-purpose, ultrafast ChIP Seq peak caller | ChIP-Seq | Peak calling | C++ | Artistic License | Linux Mac OS X |
|
| PeakSeq | ChIP-Seq | C Perl |
|||||
| PECAN | Alignment method practical for large genomic sequences. | Alignment | |||||
| PEMer | The package is composed of three modules: PEMer workflow, SV-Simulation and BreakDB. PEMer workflow is a sensitive software for detecting SVs from paired-end sequence reads. SV-Simulation randomly introduces SVs into a given genome and generates simulated paired-end reads from the ‘novel’ genome. Subsequent analysis of the simulated reads with PEMer workflow can help parameterize PEMer workflow. BreakDB is a web-accessible database developed to store, annotate and display SV breakpoint events identified by PEMer and from other sources. | Structural variation | |||||
| PERalign | A probabilistic framework is described to predict the alignment to the genome of all paired-end read transcript fragments in a paired-end read dataset. Starting from possible exonic and spliced alignments of all end reads, our method constructs potential splicing paths connecting paired ends. An expectation maximization method assigns likelihood values to all splice junctions and assigns the most probable alignment for each transcript fragment. | RNA-Seq Alignment | C++ | Linux | |||
| PerM | PerM (Periodic Seed Mapping) uses periodic spaced seeds to significantly improve mapping efficiency for large reference genomes when compared to state-of-the-art programs. | Genomics SNP discovery |
Mapping | C++ | Apache License 2.0 | Linux | |
| Phred | The phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base. | Basecaller | C | Solaris IRIX AIX |
|||
| Phred Phrap Consed Cross match | The phred software reads DNA sequencing trace files, calls bases, and assigns a quality value to each called base. Phrap is a program for assembling shotgun DNA sequence data. Cross_match is a general purpose utility for comparing any two DNA sequence sets using a 'banded' version of swat. Consed/Autofinish is a tool for viewing, editing, and finishing sequence assemblies created with phrap. | Alignment Assembly Basecaller Smith-Waterman |
|||||
| Phymm | A classifier for metagenomic data, that has been trained on 539 complete, curated genomes and can accurately classify reads as short as 100 base pairs | Metagenomics | Hidden Markov Model | ||||
| PiCall | Identifies short indel polymorphisms in population sequencing data | InDel discovery Population genetics |
|||||
| PICS | PICS identifies binding event locations by modeling local concentrations of directional reads, and uses DNA fragment length prior information to discriminate closely adjacent binding events via a Bayesian hierarchical t-mixture model. | ChIP-Seq | R | ||||
| PileLine | PileLine is a flexible command-line toolkit for efficient handling, filtering, and comparison of genomic position (GP) files produced by next-generation sequencing experiments. PileLineGUI adds a graphical interface. | Viewer | Java | LGPL | |||
| Pindel | A pattern growth approach to detect break points of large deletions and medium sized insertions from paired end short reads. | InDel discovery Structural variation |
Split-read Mapping Localized reassembly/realignment |
C++ | Linux Mac OS X Windows |
||
| Pipeline Pilot | Analysis and workflow development of Next Generation Sequencing and gene expression. | Next Generation Sequencing Gene expression Sequence analysis SNP discovery |
General bioinformatics Mapping De-novo assembly Sequence analysis Variant detection Gene expression analysis RNA-Seq analysis ChIP-Seq analysis Genomics Comparative genomics Whole genome resequencing Sequence alignment |
Integrated solution wrapping custom and third party tools for integration analysis and reporting |
C++ Java Perl R Pilot Script |
Commercial | Linux Windows |
| PIQA | PIQA is a quality analysis pipeline designed to examine genomic reads produced by Next Generation Sequencing technology (Illumina G1 Genome Analyzer). It is a set of libraries for R. | Sequencing Quality Control | R | ||||
| PoissonSeq | Identifies differentially expressed genes | Differential Expression | |||||
| PolyBayesShort | A re-incarnation of the PolyBayes SNP discovery tool developed by Gabor Marth at Washington University. This version is specifically optimized for the analysis of large numbers (millions) of high-throughput next-generation sequencer reads, aligned to whole chromosomes of model organism or mammalian genomes. Developers at Boston College. | SNP discovery | Linux Linux 64 |
||||
| PoolHap | Computational tool for inferring haplotype frequencies from pooled samples when haplotypes are known. In a future version, analysis with unknown haplotypes will be supported. | Mapping Regression. |
|||||
| PoPoolation | Toolbox specifically designed for the population genetic analysis of sequence data from pooled individuals. | Population genetics | Pooled samples | Perl R |
|||
| PoPoolation2 | PoPoolation2 compares allele frequencies for SNPs between two or more populations and identifies significant differences. PoPoolation2 requires next generation sequencing data of pooled genomic DNA (Pool-Seq). It may be used for measuring differentiation between populations, for genome wide association studies and for experimental evolution. | Population genetics Genomics |
Pooled samples | Perl R |
|||
| PRICE | PRICE uses paired-read information to iteratively increase the size of existing contigs. | Assembly | C++ | ||||
| PRINSEQ | PRINSEQ is a sequence processing tool that can be used to filter, reformat and trim genomic and metagenomic sequence data. It generates summary statistics of the input in graphical and tabular formats that can be used for quality control steps. PRINSEQ is available as both standalone and web-based versions. | Metagenomics Genomics Metatranscriptomics |
Preprocessing Filtering Trimming |
Perl | GPLv3 | UNIX Mac OS X Windows |
|
| ProbeMatch | Matches a large set of oligonucleotide sequences against a genome database using gapped alignments | Mapping | Linux Mac OS X |
||||
| ProbHD | We present a new strategy for identifying heterozygous sites in a single individual by using a machine learning approach that generates a heterozygosity score for each chromosomal position. Our approach also facilitates the identification of regions with unequal representation of two alleles and other poorly sequenced regions. The availability of confidence scores allows for a principled combination of sequencing results from multiple samples. | Population genetics SNP discovery |
Perl R Python |
||||
| Proxygenes | We introduce a clustering method which significantly reduces the size of a metagenome dataset while maintaining a faithful representation of its functional and taxonomic content. | Metagenomics | Mapping Annotation |
||||
| Pybedtools | Python extension to BEDTools that allows use of all BEDTools programs directly from Python, as well as feature-by-feature manipulation, automatic handling of temporary files, and more. | Genomics | Mapping | See full description | Python | GPLv2 | Windows (Cygwin) Linux Linux 64 Mac OS X |
| PyroBayes | PyroBayes is a novel base caller for pyrosequences from the 454 Life Sciences sequencing machines. | SNP discovery | Basecaller | ||||
| PyroMap | PyroMap accurately maps pyrosequencing reads onto reference sequences using a selectively weighted Smith-Waterman (SW^2) algorithm to incorporate quality scores into alignment. | Mapping | Python | ||||
| PyroNoise | Clustering of pyrosequencing (454) data with noise model (AmpliconNoise) and chimaera removal (Perseus) for sequence diversity analysis. | Phylogenetics Metagenomics |
|||||
| QCALL | SNP detection and genotyping from low-coverage sequencing data on multiple diploid samples | SNP discovery | |||||
| Qpalma | QPalma is an alignment tool targeted to align spliced reads produced by Next Generation sequencing platforms | RNA-Seq Alignment | Alignment | Python C++ |
|||
| QSeq | QSeq is DNASTAR's Next-Gen application for RNA-Seq, ChIP-Seq, and miRNA alignment and analysis. | ChIP-Seq RNA-Seq MiRNA |
Integrated Solution Alignment Visualization Protein Binding Peak Detection |
Commercial | Mac OS X 10.6 with Parallels Desktop Windows |
||
| QSRA | Quality-value guided Short Read Assembler, created to take advantage of quality-value scores as a further method of dealing with error. Compared to previously published algorithms, our assembler shows significant improvements not only in speed but also in output quality. | De-novo assembly | Assembly | ||||
| QuadGT | QuadGT is a software package for calling single-nucleotide variants in four sequenced genomes: normal-tumor pairs coupled with parents. Genotypes are inferred using a joint model of parental variant frequencies, de novo germline mutations, and somatic mutations. The model quantifies the descent-by-modification relationships between the unknown genotypes by using a set of parameters in a Bayesian inference setting. | SNP discovery De novo Germline and Somatic mutations |
SNP calling Variant Calling |
Java | |||
| Quake | Program to detect and correct errors in DNA sequencing reads. Uses a maximum likelihood approach incorporating quality values and nucleotide-specific miscall rates. | Error correction | |||||
| QUAST | QUAST stands for QUality ASsessment Tool. It evaluates the quality of genome assemblies by computing various metrics and providing reports. | Quality Control Genomic Assembly Evaluation Sequence analysis |
Assembly QC Visualization Quality Control |
Data Visualization Assembly Quality Evaluation Detailed Reports |
Python C Perl |
GPLv2 | Linux Mac OS X |
| QuEST | QuEST is a Kernel Density Estimator-based package for analysis of massively parallel sequencing data from chromatin immunoprecipitations (ChIP-Seq or ChIPseq). | ChIP-Seq | C++ | GPLv2 | |||
| Quip | Aggressive compression of FASTQ and SAM/BAM files. | Data compression | C | BSD (3-clause) | any | ||
| R2R | R2R is a simple to use package for very sensitive analysis of short read sequence data obtained by NextGen sequencing techniques. R2R was developed in conjunction with data obtained on the Illumina GA platforms. R2R is written as a simple Perl script and runs equally well under MS Windows, Mac OS and Linux/Unix operating systems. | SNP discovery | Alignment | Perl | |||
| R453Plus1Toolbox | Facilitates analysis of data from 454 sequencer in R/Bioconductor. | R | |||||
| RACA | Reference-Assisted Chromosome Assembly (RACA) | ||||||
| RApiD | Tools for processing restriction site associated DNA sequencing. | SNP discovery | Perl C++ |
GPLv3 | |||
| RAPSearch | Fast protein similarity search tool for short reads that utilizes a reduced amino acid alphabet and suffix array to detect seeds of flexible length. | Metagenomics | Alignment | C++ | GPLv3 | ||
| Ray | De novo genome assembly is now a challenge because of the overwhelming amount of data produced by sequencers. Ray assembles reads obtained with new sequencing technologies (Illumina, 454, SOLiD) using MPI 2.2, a message passing interface standard. | De-novo assembly | Assembly | * MPI 2.2 * ISO/IEC C++ 2003 * de Bruijn * paralleled * Illumina data | C++ | GPL | Linux POSIX |
| RazerS | RazerS allows the user to align sequencing reads of arbitrary length using either the Hamming distance or the edit distance. The tool can work either lossless or with a user-defined loss rate at higher speeds. | Mapping Read alignment |
SWIFT Filter Myers Bitvector Algorithm |
Gapped alignment paired-end mapping |
C++ | GPLv3 | UNIX Mac OS X Windows |
| RDiff | rDiff is an open source tool for accurate detection of differential RNA processing from RNA-Seq data. It implements two statistical tests to detect changes of the RNA processing between two samples. rDiff.parametric is a powerful test, which can be applied for well annotated organisms to detect changes in the relative abundance of isoforms. rDiff.nonparametric is an alternative when the annotation is incomplete or missing. | RNA-Seq Alignment Differential RNA processing regulation Alternative Splicing RNA-Seq Transcriptomics |
Statistical testing | Python Matlab |
Open Source | Linux Mac OS X |
|
| RDP Pyrosequencing Pipeline | The Ribosomal Database Project's Pyrosequencing Pipeline aims to simplify the processing of large 16s rRNA sequence libraries obtained through pyrosequencing. This site processes and converts the data to formats suitable for common ecological and statistical packages such as SPADE, EstimateS, and R. | Metagenomics | Alignment Database submission preparation Format conversion |
browser based | |||
| Readaligner | A tool for mapping (short) DNA reads into reference sequences. | Mapping | |||||
| ReadDepth | Detects copy number aberrations in deep sequencing data | Copy number estimation | R | Apache License 2.0 | |||
| REAL | REad ALigner for Next-Generation sequencing reads | Mapping | C++ | GPLv3 | Linux | ||
| Reaper | Reaper is a program for demultiplexing, trimming and filtering short read sequencing data. | Next Generation Sequencing | Filtering Adapter Removal (software) Trimming Sample Barcoding Sequencing Quality Control QC |
Memory efficient and fast. | C | GPL v3 | Linux UNIX Mac OS X |
| Reconciliator | The tool for merging assemblies | Assembly | Perl | Linux | |||
| RECOUNT | Probabilistic tag count error correction for next generation sequencing data (Solexa/Illumina). | RNA-Seq Quantitation | Expectation Maximization | C++ | GPL | Linux | |
| RefCov | WashU Reference Coverage tool for analyzing the depth, breadth, and topology of sequencing coverage | Copy number estimation | |||||
| Repitools | Toolbox of procedures to interrogate and visualize epigenomic data. Part of BioConductor | ChIP-Seq ChIP-on-chip |
Sequencing Quality Control Visualization Methylation Calling Statistical testing |
R | LGPL | ||
| Reptile | A new algorithm for short read error correction that harvests information from k-spectrum and read decomposition | Genomics | Sequencing Quality Control | C++ | GPL Boost | ||
| ReSeqSim | A simulation toolbox that will help us optimize the combination of different technologies to perform comparative genome re-sequencing, especially in reconstructing large structural variants (SVs). | Structural variation | Mapping Simulation |
||||
| RGA | Reference-guided assembler | SNP discovery | Assembly | ||||
| RiboPicker | riboPicker is a publicly available tool that is able to automatically identify and efficiently remove rRNA-like sequences from metatranscriptomic and metagenomic datasets. riboPicker is available as both standalone and web-based versions. | Metagenomics Genomics Metatranscriptomics |
Preprocessing RRNA filtering |
Perl C |
GPLv3 | UNIX Mac OS X |
|
| RMAP | Assembles 20 - 64 bp Solexa reads to a FASTA reference genome. By Andrew D. Smith and Zhenyu Xuan at CSHL. (published in BMC Bioinformatics). POSIX OS required. | DNA methylation | Mapping Bisulfite mapping |
GPLv3 | Linux Mac OS X |
||
| RNA | A randomized Numerical Aligner for Accurate alignment of NGS reads | Read alignment | Mapping Hash Table Based |
Fast Accurate |
C++ | GPL v3 | Linux UNIX Windows Mac OS X |
| RNA-MATE | A recursive mapping strategy for high-throughput RNA-sequencing data. | RNA-Seq Alignment RNA-Seq Quantitation |
Colorspace | ||||
| RNASEQR | A streamlined and accurate RNA-seq sequence analysis program | Alternative Splicing | Read mapping | ||||
| Rnnotator | Automated software pipeline that generates transcript models by de novo assembly of RNA-Seq data without the need for a reference genome | De novo transcriptome assembly | Commercial Freeware |
||||
| RobiNA | RobiNA is a Java GUI that enables the user to graphically call differentially expressed genes. For read mapping it relies on bowtie, and for the differential expression analysis it builds on an R backbone running DESeq and edgeR. | RNA-Seq | Differentially expressed gene identification | Trimming differential expression graphical display |
Java R |
GPL | Windows Linux Mac OS X |
| Rolexa | Allows fast and accurate base calling of Solexa's fluorescence intensity files and the production of informative diagnostic plots. | Sequencing | Basecaller | R | |||
| RSAT peak-motifs | A workflow combining a series of time- and memory-efficient motif analysis tools to extract motifs from full-size collections of peaks as generated by ChIP-seq, ChIP-chip or other ChIP-X technologies. | ChIP-Seq Regulatory genomics Epigenomics |
Motif discovery Motif scanning Motif comparison |
Perl CGI Python C |
Commercial Freeware |
UNIX Mac OS X Linux |
|
| RSEM | We present a generative statistical model and associated inference methods that handle read mapping uncertainty in a principled manner. Through simulations parameterized by real RNASeq data, we show that our method is more accurate than previous methods. Our improved accuracy is the result of handling read mapping uncertainty with a statistical model and the estimation of gene expression levels as the sum of isoform expression levels. | RNA-Seq Alignment RNA-Seq Quantitation |
C++ | ||||
| RSEQtools | RSEQtools includes a format specification for RNA-Seq data that provides confidentiality-aware data summaries, as well as several tools for performing common analyses: expression measurements (e.g. RPKMs), creation of signal tracks, segmentation, annotation manipulations, etc. | RNA-Seq Quantitation | C | Creative Commons - Attribution; Non-commercial 2.5 | Mac OS X UNIX Linux |
||
| Rsolid | Rsolid implements a version of the quantile normalization algorithm that transforms the intensity values before calling colors | Colorspace Basecaller |
R C |
||||
| Rsubread | Rsubread is a Bioconductor R package that provides facilities for performing read alignments using the Subread aligner. It also includes other functionality, such as the featureCounts read summarization function. | Next-generation sequencing | Read mapping Read summarization Quality assessment |
R C |
GPL v3 | Linux 64 Mac OS X; x86 64 Mac OS X |
|
| RTG Investigator | Comprehensive analysis pipelines powered with unique mapping speed and sensitivity deliver deep genomic analysis in variant detection and metagenomic applications with Illumina, Ion Torrent, Complete Genomics and Roche 454 data sets. | Exome and whole genome variant detection Metagenomics SNP discovery InDel discovery |
Mapping Alignment Translated nucleotide search K-mer analysis Species frequency estimation Contaminant filtering Read depth analysis |
Commercial Freeware |
Linux Mac OS X Windows |
||
| RUbioSeq | RUbioSeq has been developed to facilitate the primary and secondary analysis of resequencing projects by providing an integrated software suite of parallelized pipelines to detect exome variants (SNVs and CNVs) and to perform Bisulfite-seq analyses automatically. RUbioSeq's variant analysis results have been already validated and published. AVAILABILITY: http://rubioseq.sourceforge.net/ | Exome analysis Copy number estimation Bisulfite Sequencing |
Somatic variant calling | Perl | UNIX | ||
| S-MART | S-MART manages your RNA-Seq and ChIP-Seq data. | RNA-Seq ChIP-Seq |
Python Java |
Linux Mac OS X Windows |
|||
| SAMMate | GUI for processing SAM/BAM and BED files. The software allows users to accurately estimate gene expression scores using short reads originating from both exons and exon-exon junctions, to generate wiggle files for visualization in UCSC genome browser, and to generate an alignment statistics report. | RNA-Seq Quantitation | Sequence analysis | Java | GPLv3 | Windows Mac OS X |
|
| Samscope | Samscope is a lightweight SAM/BAM file viewer that makes visually exploring next generation sequencing data as intuitive as Google Maps. Samscope uses multiple layers to simultaneously (or sequentially) view SAM/BAM related features like coverage or allele frequency, or ChIP-SEQ features like polarity from as many files as you like. The paging-friendly binary file layout makes it feasible to browse data sets far larger than the system's available RAM. | ChIP-Seq RNA-Seq Genomics |
Visualization Read mapping SAMtools |
C++ | AGPL | POSIX Linux |
|
| SAMStat | SAMStat is an efficient C program for displaying statistics of large sequence files. | Sequencing Quality Control | C | GPLv3 | UNIX | ||
| SAMtools | Various utilities for processing alignments in the SAM format, including variant calling and alignment viewing. | SNP discovery | Simulation Programming Library Assembly visualization |
Integrated solution API |
C | MIT | |
| Savant Genome Browser | Savant is a genome browser which combines visualization of HTS and other genome-based data with powerful analytic tools. | Genomics | Visualization Viewer Alignment viewer |
Plugin framework Bookmarking Table View fast memory efficient |
Java | Apache License 2.0 | Windows Linux all supporting JVM Mac OS X |
| Scaffolder | Edit your genome sequence using a simple human readable syntax. Manage contig positions and add inserts all in a plain text file. | Scaffolding | Ruby | MIT | Linux Mac OS X |
||
| SCALCE | SCALCE (skeɪlz) is a fast FASTQ compression utility that uses locally consistent parsing for a better compression rate. It achieves around 2X more compression than gzip alone. | Genomics | Data compression | FASTQ file compression | C | Linux 64 Linux |
|
| SCARF | Scaffolded and Corrected Assembly of Roche 454 (SCARF) is a next-generation sequence assembly tool for evolutionary genomics that is designed especially for assembling 454 EST sequences against high-quality reference sequences from related species. | Assembly | GPLv3 | ||||
| Scripture | Tool for assembling transcriptome from paired-end Illumina RNA-Seq data | RNA-Seq Alignment | |||||
| SEAL | Read mapper and duplicate remover. | Mapping Hadoop |
Python C++ Java |
GPLv3 | |||
| SEECER | Error correction for RNA-Seq data | RNA Seq analysis | supports multicore processors | C | |||
| SEED | Tool to cluster sequence reads prior to assembly or other operations. | Metagenomics | Clustering | C++ | Mac OS X Linux Windows |
||
| Segemehl | Map short reads to known genome with tolerance for mismatches and indels using suffix arrays for high accuracy matching | Genomics | Mapping | fast precise low cost for high-error matching |
C C++ |
||
| Segtor | A software tool to annotate large sets of genomic coordinates, intervals, SNVs, indels and translocations with respect to known genes. | SNP Annotation | Annotation | SNP annotation | Perl C |
Non-commercial | Linux Mac OS X |
| Seq2HLA | seq2HLA is a computational tool to determine Human Leukocyte Antigen (HLA) directly from existing and future short RNA-Seq reads. It takes standard RNA-Seq sequence reads in fastq format as input, uses a bowtie index comprising known HLA alleles and outputs the most likely HLA class I and class II types, a p-value for each call, and the expression of each class. | Transcriptomics | Mapping Read alignment HLA typing |
Python R |
Unix Mac OS X |
||
| SeqAn | C++ template library with many sequence analysis algorithms and datastructures. | Sequence analysis Genomics Phylogenetics |
Programming Library | C++ | BSD (3-clause) | UNIX Mac OS X Windows |
|
| SeqBuster | SeqBuster is a web-based bioinformatic tool offering custom analysis of deep sequencing data at different levels, with special emphasis on the analysis of miRNA variants or isomiRs and the discovery of new small RNAs. | Small RNA transcriptome MiRNA |
Mapping Annotation |
Annotation and detection of miRNAs and other small RNAs | Java R |
Commercial Freeware |
Mac OS X Linux |
| SeqCons | SeqCons is an open source consensus computation program for Linux and Windows. The algorithm can be used for de novo and reference-guided sequence assembly. | Assembly | Linux Windows |
||||
| SeqEM | Genotype-calling algorithm that estimates parameters underlying the posterior probabilities in an adaptive way rather than arbitrarily specifying them a priori. The algorithm applies the well-known EM algorithm to an appropriate likelihood for a sample of unrelated individuals with next-generation sequence data, leveraging information from the sample to estimate genotype probabilities and the nucleotide-read error rate. | SNP discovery | Expectation Maximization | ||||
| SeqGSEA | Gene Set Enrichment Analysis (GSEA) of RNA-Seq Data: integrating differential expression and splicing | Biomedical Sciences Genomics RNA-Seq |
Statistics Functional analysis Gene set enrichment analysis |
integrative analysis | R | GPL (>= 3) | any |
| SeqMan NGen | Sequence assembly software using traditional, next-gen, and third-gen technologies. Subsequent analysis of the assembly, including SNP discovery, coverage evaluation and consensus annotation, is provided through full integration with Lasergene. | Genomics De-novo assembly De novo transcriptome assembly Whole Genome Resequencing SNP discovery InDel discovery ChIP-Seq RNA-Seq Alignment |
Mapping Assembly Alignment Paired End |
Commercial | Windows Mac OS X Linux |
||
| SeqMap | SeqMap is a tool for mapping large numbers of short sequences to the genome. | Mapping | command line parallel execution |
Mac OS X Windows Linux |
|||
| SeqMINER | seqMINER is an integrated portable ChIP-seq data interpretation platform with optimized performances for efficient handling of multiple genomewide datasets. seqMINER allows comparison and integration of multiple ChIP-seq datasets and extraction of qualitative as well as quantitative information. seqMINER can handle the biological complexity of most experimental situations and proposes supervised methods to the user in data categorization according to the analysed features. In addition, through multiple graphical representations, seqMINER allows visualisation and modelling of general as well as specific patterns in a given dataset. Moreover, seqMINER proposes a module to quantitatively analyse correlations and differences between datasets. | ChIP-Seq | Java | GPLv3 | platform-independent | ||
| SeqMonk | A tool to visualise and analyse high throughput mapped sequence data | Genomics Epigenomics |
Visualization Assembly visualization Statistical testing Alignment viewer |
Genome Viewer Data Visualisation Data Quantitation filtering and analysis |
Java | GPLv3 | Windows Mac OS X Linux |
| SeqPrep | Strips adapters and optionally merges overlapping paired-end (or paired-end contamination in mate-pair libraries) Illumina-style reads. | Genomics De-novo assembly |
Merges overlapping paired-end reads strips adapters off of reads. |
C | MIT | POSIX | |
| SeqSaw | A package for mapping of spliced reads and unbiased detection of novel splice junctions from RNA-seq data. | RNA-Seq Alternative Splicing |
Mapping Alignment |
Short Spliced Sequence Mapping Tool | C++ | GPL | Linux |
| SeqSeg | An algorithm to identify chromosomal breakpoints using massively parallel sequence data | Copy number estimation | Matlab | ||||
| SeqSite | SeqSite is an efficient and easy-to-use software tool implementing a novel method for identifying and pinpointing transcription factor binding sites. It first detects transcription factor binding regions by clustering tags and statistical hypothesis testing, and locates every binding site in detected binding regions by modeling the tag profiles. It can pinpoint closely spaced adjacent binding sites from ChIP-seq data. This software is coded in C/C++, and supports major computer platforms. | ChIP-Seq Functional Genomics Regulatory element annotation |
ChIP-Seq Peak calling Statistical Modelling Statistical testing |
stand-alone software tool can run on major computer platforms |
C C++ |
GPL | Linux Mac OS X Windows |
| SeqSolve | Simple analysis of Next Generation Sequencing data. | RNA-Seq ChIP-Seq Transcriptomics MiRNA NcRNAs SRNA Differential Expression Alternative Splicing New gene discovery Quality Control |
SAMtools Cufflinks IGV MACS Tibco Spotfire |
User-friendly Scientifically relevant Reliable Scalable |
Commercial | Windows Linux |
|
| SeqTrim | A pipeline for preprocessing sequences. | Trimming | |||||
| Sequedex | Sequedex classifies short reads for phylogeny and function at high speed | Metagenomics Phylogenetics Transcriptomics genetics |
Sequence analysis Signature methods |
Fast protein fragments identified |
Java Python |
Commercial Freeware |
Linux 64 Mac OS X |
| SequenceVariantAnalyzer | DNA sequence information underpins genetic research, enabling discoveries of important biological or medical benefit. Compared with previous discovery strategies, a whole-genome sequencing study is no longer constrained by differing patterns of linkage disequilibrium and thus, in theory, is more likely to directly identify the genetic variants contributing to biological traits or medical outcomes.
The rapidly evolving high-throughput DNA sequencing technologies have now allowed the fast generation of large amounts of sequence data for the purpose of performing such whole-genome sequencing studies, at a reasonable cost. SequenceVariantAnalyzer, or SVA, is a software tool that we have been developing to analyze the genetic variants identified from such studies. URL: http://www.svaproject.org/ |
Personal genomics Genomics Sequence analysis |
Annotation Genetic variation annotation Genome browser |
Variant annotation and analysis | Java | Linux 64 | |
| Sequencher | Desktop alignment software, now with plugins to MAQ and GSNAP for NGS sequence data | De-novo assembly SNP discovery |
Assembly Alignment |
Bisulfite sequencing consensus sequence generation and export SNP/InDel/Read Error display and search |
Commercial | Windows Mac OS X |
|
| SeqWare | SeqWare provides tools designed to support massively parallel sequencing technologies. | LIMS Workflow |
Java | GPLv3 | Linux | ||
| SeqWords | SeqWords is a featherweight object for the calculation of n-mer word occurrences in a single sequence. | K-mer analysis | Part of BioPerl | Perl | Perl artistic licence | ||
| SESAME | Genotyping of multiplexed individuals for several markers based on NGS amplicon sequencing. | Genotyping Targeted resequencing |
GPLv3 | Windows Linux |
|||
| SEWAL | Processing of deep sequencing data from in vitro selection experiments | In vitro selection | |||||
| Sff2fastq | The program 'sff2fastq' extracts read information from an SFF file, produced by the 454 genome sequencer, and outputs the sequences and quality scores in FASTQ format. | Conversion | Linux | ||||
| SGA | SGA is a de novo assembler designed to assemble large genomes from high coverage short read data. | Assembly | C++ | GPLv3 | Linux | ||
| SHARCGS | SHARCGS assembles sequence contigs de novo with high confidence from data produced by novel sequencing technologies; it is described as outperforming existing assembly algorithms in terms of speed and accuracy. | De-novo assembly | Assembly | Perl | Linux | ||
| SHE-RA | The SHE-RA software turns error-prone short reads into Sanger-quality composite reads. | Error correction | Open Source | ||||
| Sherman | Bisulfite-treated read FastQ simulator | Genomics Bisulfite Sequencing DNA methylation |
Simulation | Perl | GPLv3 | Linux Mac OS X |
|
| ShoRAH | Inference of a population from a set of short reads. The package contains programs that support mapping of reads to a reference genome, correcting sequencing errors by locally clustering reads in small windows of the alignment, reconstructing a minimal set of global haplotypes that explain the reads, and estimating the frequencies of the inferred haplotypes. | Metagenomics | Haplotype reconstruction Mapping |
GPLv3 | Linux Mac OS X |
||
| Shore | Analysis suite for short read data. | Structural variation SNP discovery |
Mapping | Linux Mac OS X POSIX |
|||