We have just released ASAP, an allele-specific alignment pipeline for Next-Gen Sequencing experiments with mixed genetic backgrounds.
Heterozygous differences in a sequencing sample can be detected by aligning reads to a reference genome and identifying allele-specific events from a SNP pileup file. This can, however, introduce a bias in favour of the reference allele (discussed previously in this thread: http://seqanswers.com/forums/showthread.php?t=6148). ASAP analyses allelic differences in NGS data sets in a less biased way by aligning sequences to two reference genomes in parallel using the short-read aligner Bowtie. It produces three output files: one for each set of genome-specific alignments, and one for alignments that both genomes have in common.
• Alignments against two genomes in a single step
• Supports single-end and paired-end read alignments of variable length
• Alignment seed length, number of mismatches, insert size etc. are adjustable
• Individual output files for genome 1-specific or genome 2-specific alignments and alignments in common
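To illustrate the idea of sorting reads into genome 1-specific, genome 2-specific and common bins, here is a minimal Python sketch. It is not ASAP's actual code. It assumes that each read comes with the mismatch count of its best alignment to each genome (None if it did not align), and that the read goes to whichever genome it matches with fewer mismatches. The function names and the "unaligned" bin are hypothetical and only there for illustration; see the manual for the rules ASAP really applies.

```python
from typing import Dict, List, Optional, Tuple


def classify_read(mm_genome1: Optional[int], mm_genome2: Optional[int]) -> str:
    """Pick an output bin from the best-hit mismatch counts of one read.

    None means the read did not align to that genome at all.
    Assumption for this sketch: fewer mismatches wins, ties are 'common'.
    """
    if mm_genome1 is None and mm_genome2 is None:
        return "unaligned"
    if mm_genome2 is None:
        return "genome1"
    if mm_genome1 is None:
        return "genome2"
    if mm_genome1 < mm_genome2:
        return "genome1"
    if mm_genome2 < mm_genome1:
        return "genome2"
    # Aligned equally well to both genomes
    return "common"


def split_reads(
    hits: Dict[str, Tuple[Optional[int], Optional[int]]]
) -> Dict[str, List[str]]:
    """Group read IDs by bin; 'hits' maps read ID -> (mismatches g1, mismatches g2)."""
    bins: Dict[str, List[str]] = {
        "genome1": [], "genome2": [], "common": [], "unaligned": []
    }
    for read_id, (mm1, mm2) in hits.items():
        bins[classify_read(mm1, mm2)].append(read_id)
    return bins


# Made-up example reads, purely illustrative
example = {
    "read_a": (0, 1),        # perfect match in genome 1, one mismatch in genome 2
    "read_b": (2, 2),        # equally good in both
    "read_c": (None, 0),     # aligns to genome 2 only
    "read_d": (None, None),  # aligns to neither
}
print(split_reads(example))
```

A read overlapping a known SNP would typically end up in one of the genome-specific bins, while reads from regions without variation fall into the common bin under these assumptions.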
ASAP can be useful to detect allelic imbalances in samples which are of different genetic origin. Obvious examples would be studying imprinted regions in ChIP-Seq or RNA-Seq experiments, or genomic interactions (e4C) in mouse strains with a mixed genetic background.
As a limitation, both genomes and their variations have to be known up front. Also, because ASAP performs ungapped alignments using Bowtie, a certain fraction of reads spanning splice junctions in RNA-Seq experiments might not get aligned.
ASAP as well as a detailed manual can be obtained from www.bioinformatics.bbsrc.ac.uk/projects/. I will also present ASAP as a poster at ISMB in Vienna next week, where I will be happy to take any comments, questions or bug reports.