SEQanswers

04-23-2014, 10:27 AM   #1
kcm.eid
Junior Member
 
Location: India

Join Date: Oct 2012
Posts: 2
Why do single-end reads align to the reference genome in reverse orientation?

Hi guys,
I have 76 bp single-end RNA-seq reads. I used TopHat to align these reads to the human genome sequence. After aligning the reads, I browsed the alignment file using samtools tview. As the reads are single-end, they should align to the reference genome only in the forward direction. Most of the reads are in the forward direction, but there are many lowercase reads; I suppose they are the reads aligned to the reference in the reverse direction.

I am confused about what I should do. I think these reads have aligned to the reference in the reverse direction due to sequence homology (inverted repeats). I will use the BAM file for SNP calling in my downstream analysis, and I am not confident in these alignments. What should I do? Should I discard them? Is there any tool to discard such reads?

Thanks.
04-23-2014, 11:22 AM   #2
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

Unless the library-prep was done using a stranded protocol, reads will align to both the forward and reverse strand.

However, as you mentioned, reads that are in lower case may well represent reads that are aligned to repeats.
04-23-2014, 12:55 PM   #3
dpryan
Devon Ryan
 
Location: Freiburg, Germany

Join Date: Jul 2011
Posts: 3,480

Reads will align in both the forward and reverse orientation regardless. The strandedness of the library-prep will affect the likelihood of seeing anti-sense reads. BTW, I don't think tophat ever produces "lower-case" alignments (there's only one case in BAM files and that's what tophat produces). If the sequence in your reference genome is lower case, then yes that's probably indicative of a soft-masked repeat, though there are a LOT of those so I wouldn't read too much into it.

Perhaps you can post a few example reads that you're concerned about and we can then assess if there's actually anything to worry about.
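
One way to see how the reads split between the two strands is to look at the 0x10 bit of the SAM FLAG field, which is set when a read is aligned to the reverse strand. Below is a minimal sketch, assuming SAM-formatted text lines (for example, the output of `samtools view -h`); the function name `count_strands` and the sample records are made up for illustration.

```python
from collections import Counter

REVERSE = 0x10   # SAM FLAG bit: SEQ is reverse complemented (reverse strand)
UNMAPPED = 0x4   # SAM FLAG bit: read is unmapped


def count_strands(sam_lines):
    """Count reads per category (forward, reverse, unmapped) from SAM text lines."""
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if flag & UNMAPPED:
            counts["unmapped"] += 1
        elif flag & REVERSE:
            counts["reverse"] += 1
        else:
            counts["forward"] += 1
    return counts


# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t0\tchr1\t100\t50\t4M\t*\t0\t0\tACGT\tIIII",
    "read2\t16\tchr1\t120\t50\t4M\t*\t0\t0\tTTGA\tIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCC\tIIII",
]
print(count_strands(example))
```

For an unstranded library, roughly similar forward and reverse counts would be expected. With samtools itself, something like `samtools view -c -f 16 in.bam` (where `in.bam` is a placeholder file name) should give the count of reverse-strand reads directly.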
04-24-2014, 05:33 AM   #4
Chipper
Senior Member
 
Location: Sweden

Join Date: Mar 2008
Posts: 324

samtools tview uses lowercase for reads aligned to the reverse strand. Since genes are on both strands, this is expected even for stranded sequencing.
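
The display convention can be imitated in a few lines. This is only an illustrative sketch of the convention described above, not samtools' actual code; `show_read` is a hypothetical helper, and it assumes the read sequence and its SAM FLAG value are already at hand.

```python
REVERSE = 0x10  # SAM FLAG bit: read aligned to the reverse strand


def show_read(seq, flag):
    """Return seq in upper case for forward-strand reads, lower case for reverse-strand reads."""
    return seq.lower() if flag & REVERSE else seq.upper()


print(show_read("ACGT", 0))    # forward-strand read
print(show_read("ACGT", 16))   # reverse-strand read
```

The case only encodes the strand. In SAM/BAM the stored sequence of a reverse-strand read is already reverse complemented, so it reads along the reference forward strand like any other alignment.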
All times are GMT -8.