SEQanswers

04-24-2015, 12:33 AM   #1
svos
Member
 
Location: Germany

Join Date: Feb 2014
Posts: 16
How to Map Amplicon Reads to Reference

Hi all,


I am wondering how to map paired-end amplicon reads to the reference properly, and whether I really understand amplicon sequencing itself...

Correct me if I'm wrong but paired-end reads must originate from the very same amplicon, right? That means the insert size (of untrimmed reads) must be equal to the amplicon size. And it means that the first base of fwd reads must match the amplicon start position, while the last base of rev reads must match the amplicon stop position.

The question now is how to use this information. Is there a way to tell the mapper (e.g. bwa) which amplicons were used in order to improve mapping?

The problem is that I see a lot of reads starting at positions that do not coincide with the amplicon start/stop positions, which tells me that they are indeed mapped to the wrong position! In the end I have a lot of false-positive variants...


Thanks for any suggestions!

Sebastian

Last edited by svos; 04-24-2015 at 12:41 AM.
04-24-2015, 12:50 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Aligners don't generally care which bases mismatch the reference, so I don't see a good way of forcing an aligner to make the first base much more important than the others. Heng Li can correct me if I'm wrong, but I think your best bet is to post-filter the reads and remove those with mismatches in the first 5 bp, if you want to enforce a rule that (for example) the first 5 bp must match exactly. I'm not sure that's a good idea, though. Bear in mind that the first few bp of Illumina reads are lower quality than the rest, so some mismatches are expected.
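
As a rough sketch of the post-filtering idea, assuming you already have each read's aligned sequence and the matching reference slice as equal-length strings (no indels or clipping in the alignment), something like the following could count mismatches in the first few sequenced bases. The function names and the `max_mismatches` parameter are made up for illustration and are not part of any aligner.

```python
def five_prime_mismatches(read_seq, ref_seq, is_reverse, n=5):
    """Count mismatches in the first n sequenced bases of an ungapped alignment.

    read_seq and ref_seq are equal-length strings in reference orientation,
    as a SAM/BAM record stores them. For a reverse-strand read the first
    sequenced bases therefore sit at the right-hand end of the string.
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("this sketch assumes an ungapped, equal-length alignment")
    if n <= 0:
        return 0
    if is_reverse:
        read_part, ref_part = read_seq[-n:], ref_seq[-n:]
    else:
        read_part, ref_part = read_seq[:n], ref_seq[:n]
    return sum(1 for a, b in zip(read_part.upper(), ref_part.upper()) if a != b)


def keep_read(read_seq, ref_seq, is_reverse, n=5, max_mismatches=0):
    """Hypothetical filter rule: keep the read if its 5' end is clean enough."""
    return five_prime_mismatches(read_seq, ref_seq, is_reverse, n) <= max_mismatches
```

Given the lower quality of the first few bases, a `max_mismatches` of 1 rather than 0 may be the more forgiving choice.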

Also, the first base of read 1 is the leftmost base and the first base of read 2 is the rightmost base (relative to the molecule being sequenced). So it's the first base of read 2 that should match the stop position (reverse-complemented), not the last base.
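
To make the orientation concrete, here is a small sketch that derives what error-free read 1 and read 2 would look like for a toy amplicon. It assumes read 1 starts at the left end of the amplicon as written; the amplicon sequence and the helper names are invented for the example.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def expected_reads(amplicon, read_len):
    """Return (read1, read2) as they would come off the sequencer.

    Assumes an error-free, untrimmed pair covering one amplicon, with
    read 1 starting at the left end of the amplicon as written.
    """
    read1 = amplicon[:read_len]
    read2 = reverse_complement(amplicon)[:read_len]
    return read1, read2


amplicon = "ACGTTGCAAG"  # toy sequence, not a real amplicon
read1, read2 = expected_reads(amplicon, 4)
# read1 begins with the first amplicon base; read2 begins with the
# complement of the last amplicon base.
```

So in FASTQ terms it is the first base of read 2 that corresponds to the amplicon stop position, on the opposite strand.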
04-24-2015, 01:12 AM   #3
svos
Member
 
Location: Germany

Join Date: Feb 2014
Posts: 16

Thank you for your suggestion. My point is not about how well the bases at the 5' or 3' end match (or mismatch) the reference; it is only about WHERE the reads are mapped.

Of course I can filter those reads out (and I will as long as I don't have a better solution), but I will lose the information of these reads then...

In the BAM file, the last base of a rev read is the first base that was sequenced (because the read is stored reverse-complemented).
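
A minimal sketch of that position check, assuming untrimmed reads, 1-based inclusive amplicon coordinates that include the primers, and alignments described by the SAM POS and CIGAR fields. The amplicon tuples, function names, and `tolerance` parameter are hypothetical. Soft-clipped bases at the 5' end are ignored here, so a clipped read will look shifted.

```python
import re

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
_REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) covered on the reference."""
    ref_len = sum(int(n) for n, op in _CIGAR_OP.findall(cigar) if op in _REF_CONSUMING)
    return pos, pos + ref_len - 1


def five_prime_position(pos, cigar, is_reverse):
    """Reference position of the first sequenced base that was aligned.

    Forward-strand reads start at the leftmost aligned position; reverse-strand
    reads are stored reverse-complemented, so their 5' end is the rightmost one.
    """
    start, end = reference_span(pos, cigar)
    return end if is_reverse else start


def matching_amplicon(pos, cigar, is_reverse, amplicons, tolerance=0):
    """Return the name of the amplicon whose boundary the 5' end hits, else None.

    amplicons is an iterable of (name, start, end) tuples, 1-based inclusive.
    """
    five_prime = five_prime_position(pos, cigar, is_reverse)
    for name, start, end in amplicons:
        expected = end if is_reverse else start
        if abs(five_prime - expected) <= tolerance:
            return name
    return None
```

Reads for which this returns `None` could be flagged for inspection rather than discarded outright, which keeps their information available.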
All times are GMT -8.

