SEQanswers

Old 01-14-2013, 01:15 AM   #1
EGrassi
Member
 
Location: Turin, Italy

Join Date: Oct 2010
Posts: 66
Tophat + htseq_count

Hello, I'm performing RNAseq analyses and I've stumbled upon some puzzling results.
I aligned some data with tophat2 (default settings) and, since the results were disappointing (only about 5% of properly paired reads), I changed the -r and --mate-std-dev parameters and got to 60% (I know, still not very high). I ran htseq_count on the resulting BAM alignments from both runs, and comparing the two results I see no differences.
Am I missing something? Does htseq_count use the information about properly paired reads or not? From these results I am inclined to say no; I will check the code...

Last edited by EGrassi; 01-14-2013 at 04:20 AM.
Old 01-15-2013, 01:08 AM   #2
EGrassi
Member
 
Location: Turin, Italy

Join Date: Oct 2010
Posts: 66

A quick check of the htseq_count code tells me that it never uses the reads' "mate_aligned" attribute and simply considers all of the paired reads. Does this seem like strange behaviour only to me? I don't see a check anywhere on whether the two reads fall at a sensible distance to be reliably considered in the counts.
Old 01-15-2013, 11:24 PM   #3
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
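
To make the distinction between the FLAG bits concrete, here is a minimal sketch that decodes the pairing-related bits of a SAM FLAG value. The bit values (0x1, 0x2, 0x4, 0x8) come from the SAM specification; the function name describe_flag and the dictionary keys are made up for this illustration and are not part of HTSeq or samtools.

```python
# SAM FLAG bits as defined in the SAM specification.
PAIRED = 0x1         # read is part of a pair
PROPER_PAIR = 0x2    # aligner considers the pair properly aligned
UNMAPPED = 0x4       # this read is unmapped
MATE_UNMAPPED = 0x8  # the mate is unmapped


def describe_flag(flag):
    """Return the pairing-related properties encoded in a SAM FLAG value.

    Hypothetical helper for illustration only.
    """
    paired = bool(flag & PAIRED)
    return {
        "paired": paired,
        # Bits 0x2 and 0x8 are only meaningful when the read is paired.
        "proper_pair": paired and bool(flag & PROPER_PAIR),
        "mapped": not flag & UNMAPPED,
        "mate_mapped": paired and not flag & MATE_UNMAPPED,
    }


# FLAG 65 (0x1 + 0x40): paired, mate has an alignment, but not a proper pair.
print(describe_flag(65))
```

A record like FLAG 65 has an aligned mate without being flagged as a proper pair, so "mate has an alignment" and "properly paired" are separate pieces of information in the file.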

htseq-count, by the way, filters by the alignment quality only if you use the -a option. I guess I should change this to be the default.
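
As a rough idea of what filtering on alignment quality means, the sketch below drops SAM records whose MAPQ (5th column) is below a cutoff. This is not htseq-count's actual code; the function names and the cutoff of 10 are assumptions chosen only for the example.

```python
def passes_mapq(sam_line, min_mapq=10):
    """True if the record's MAPQ (5th tab-separated SAM column) is >= min_mapq.

    min_mapq=10 is an arbitrary illustrative cutoff.
    """
    fields = sam_line.rstrip("\n").split("\t")
    return int(fields[4]) >= min_mapq


def filter_by_mapq(lines, min_mapq=10):
    """Yield header lines unchanged and alignment lines that pass the cutoff."""
    for line in lines:
        if line.startswith("@") or passes_mapq(line, min_mapq):
            yield line
```

It could be applied to the text output of samtools view -h before counting. Whether low-quality mates actually get a low MAPQ depends on the aligner, as discussed above.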
Old 01-15-2013, 11:30 PM   #4
EGrassi
Member
 
Location: Turin, Italy

Join Date: Oct 2010
Posts: 66

Quote:
Originally Posted by Simon Anders
The "mate_aligned" bit in the FLAG field indicates, in my reading of the SAM spec, that an alignment for the mate is given in the SAM file, not that this alignment is considered plausible. If TopHat really changes the mate_aligned field according to the distance, I'd consider this a very odd behaviour. In my opinion, it should set the alignment quality (5th field in the SAM file) to a low value to indicate that an alignment is reported but should not be trusted.
Since the percentage of properly paired reads that samtools flagstat reports for accepted_hits changed when I set the -r tophat parameter, I assumed that the reads reported as not properly paired were in the SAM file but should not be counted as aligned in the analyses.

(Filtering on quality only with an option is fine in my opinion, by the way.)
All times are GMT -8.