Old 04-23-2015, 11:33 PM   #1
svos
Location: Germany

Join Date: Feb 2014
Posts: 16
How to Map Amplicon Reads to Reference

Hi all,

I am wondering how to map paired-end amplicon reads to the reference properly and if I really understand amplicon sequencing itself...

Correct me if I'm wrong but paired-end reads must originate from the very same amplicon, right? That means the insert size (of untrimmed reads) must be equal to the amplicon size. And it means that the first base of fwd reads must match the amplicon start position, while the last base of rev reads must match the amplicon stop position.

The question now is how to use this information. Is there a way to tell the mapper (e.g. bwa) which amplicons were used in order to improve mapping?

The problem is, I see a lot of reads starting at positions which do not correlate with the amplicon start/stop positions which tells me that they are indeed mapped to the wrong position! In the end I have a lot of false positive variants...

Thanks for any suggestions!


Last edited by svos; 04-23-2015 at 11:41 PM.
Old 04-23-2015, 11:50 PM   #2
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Aligners don't generally care which bases mismatch the reference, so I don't see a good way of forcing an aligner to make the first base much more important than others. Heng Li can correct me if I'm wrong, but I think your best bet is to post-filter the reads and remove those with mismatches in the first 5 bp, if you want to enforce a rule that (for example) the first 5 bp must match exactly. I'm not sure that's a good idea, though. Bear in mind that the first few bp of Illumina reads are lower quality than the rest, so some mismatches are expected.

Also, the first base of read 1 is the leftmost base and the first base of read 2 is the rightmost base (relative to the molecule being sequenced). So it's the first base of read 2 that should match the stop position (reverse-complemented), not the last base.
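
To make the orientation point concrete, here is a minimal Python sketch of the post-filter described above. It assumes an ungapped alignment with no soft clipping, a 0-based leftmost position, and a read sequence in the orientation SAM/BAM stores it (reverse-strand reads already reverse-complemented). The function name `first_bases_match` and its parameters are made up for illustration and are not part of any aligner or library.

```python
def first_bases_match(seq, ref, pos, is_reverse, n=5):
    """Return True if the first n *sequenced* bases match the reference.

    seq        -- read sequence as stored in SAM/BAM (forward-strand orientation)
    ref        -- reference sequence of the contig, as a string
    pos        -- 0-based leftmost reference position of the alignment
    is_reverse -- True if the read aligned to the reverse strand
    n          -- number of bases to check (hypothetical default of 5)

    Assumption: ungapped alignment, no soft or hard clipping.
    """
    aligned_ref = ref[pos:pos + len(seq)]
    if len(aligned_ref) < len(seq):
        return False  # alignment would run off the end of the reference
    if is_reverse:
        # For a reverse-strand read, the first sequenced bases sit at the
        # right-hand end of the stored sequence.
        return seq[-n:].upper() == aligned_ref[-n:].upper()
    # For a forward-strand read, they sit at the left-hand end.
    return seq[:n].upper() == aligned_ref[:n].upper()
```

For a real BAM file the same comparison would have to account for indels and clipping, for example by walking the CIGAR string, so treat this only as an illustration of which end of each read to look at.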
Old 04-24-2015, 12:12 AM   #3
svos
Location: Germany

Join Date: Feb 2014
Posts: 16

Thank you for your suggestion. The point I mentioned is not on how well the bases at 5' or 3' end match (or mismatch) the reference, it is only about WHERE they are mapped.

Of course I can filter those reads out (and I will as long as I don't have a better solution), but I will lose the information of these reads then...

In the BAM file, the last base of a rev read is the first base that was sequenced (as the read is stored reverse-complemented again).
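
A rough Python sketch of checking WHERE the 5' end of a read lands relative to the amplicon coordinates, using only the SAM fields POS, CIGAR and FLAG. The amplicon table, the function names and the `tolerance` parameter are hypothetical. It also assumes the amplicon coordinates are 1-based inclusive (like SAM POS) and ignores soft-clipped bases.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def reference_span(pos, cigar):
    """Return (start, end) of the aligned bases, 1-based inclusive."""
    # Only M, D, N, = and X consume reference positions.
    ref_len = sum(int(n) for n, op in CIGAR_RE.findall(cigar) if op in "MDN=X")
    return pos, pos + ref_len - 1

def five_prime_position(pos, cigar, flag):
    """Reference position of the first sequenced base of the read."""
    start, end = reference_span(pos, cigar)
    # FLAG bit 0x10 marks a reverse-strand read: its 5' end is the
    # rightmost aligned base.
    return end if flag & 0x10 else start

def matching_amplicons(pos, cigar, flag, amplicons, tolerance=0):
    """Names of amplicons whose start (fwd) or stop (rev) matches the 5' end."""
    five_prime = five_prime_position(pos, cigar, flag)
    hits = []
    for name, (amp_start, amp_end) in amplicons.items():
        expected = amp_end if flag & 0x10 else amp_start
        if abs(five_prime - expected) <= tolerance:
            hits.append(name)
    return hits

# Hypothetical amplicon table: name -> (start, stop), 1-based inclusive.
amplicons = {"amp1": (100, 249)}
print(matching_amplicons(100, "150M", 99, amplicons))   # forward read
print(matching_amplicons(100, "150M", 147, amplicons))  # reverse read
```

Reads for which the returned list is empty are the ones whose start does not agree with any amplicon, so they could be flagged for inspection rather than dropped outright.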
svos is offline   Reply With Quote
