SEQanswers

10-28-2013, 01:53 PM   #21
adaigle
Junior Member
 
Location: Boston

Join Date: Sep 2013
Posts: 6

Oh sorry, there should be a bunch of colons in there. That's because I was fooling around with it when I was getting errors earlier about an illegal colon in the optional fields. Here is the real line:

Y71VY:00004:00106 16 chr10 62135255 0 12M * 0 0 GGGGGGGGAGGG 6666666,,'** ZP:B:f,0.0155953,0.00769459,0.0049578 ZM:B:s,266,0,242,0,0,252,0,278,506,0,0,550,262,278,0,0,226,0,260,0,26,38,42,30,104,0,0,0,764,0,0,0,70,0,5224,0,5866,0,0,0,0,0,0,0,120,124,14,122,0,46 ZF:i:28 RG:Z:Y71VY.IonXpress_014 PG:Z:tmap MD:Z:12 NM:i:0 AS:i:12 XA:Z:map4-1 XS:i:12

Does this still look screwed up?
11-13-2013, 04:55 AM   #22
bbl
Member
 
Location: London

Join Date: Jul 2010
Posts: 16

How does htseq-count count paired-end reads whose mates are mapped to different chromosomes or genes? Or should they be filtered out prior to the htseq-count run?
11-13-2013, 05:25 AM   #23
Eveleee
Junior Member
 
Location: Czech Republic

Join Date: Nov 2013
Posts: 2
Transcriptome annotation

Hello everyone,
I would like to ask for advice on which tools to use to annotate assembled RNA-seq data. We have sequenced several species on an Illumina HiSeq and now want to annotate the genes, and then look at differences in expression, since we have different species as well as hybrids.
Can anyone recommend a pipeline? I have heard about using RAST and then manually approving the results in GMOD.
I see that the majority work with blast2go, but it is not free.
11-15-2013, 03:54 AM   #24
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 991

Quote:
Originally Posted by bbl
How does HTseq-count count paired-end reads which are mapped on different chromosomes or genes? Or should they be filtered out prior htseq-count run?
They are counted either as "ambiguous" (default mode "union") or as "no feature" ("intersection" modes).
11-15-2013, 06:53 AM   #25
bbl
Member
 
Location: London

Join Date: Jul 2010
Posts: 16

Thanks Simon. So do I understand correctly that I only need to filter out the multi-mapped reads from the accepted_hits.bam produced by tophat2, and that no other filtering step is necessary prior to htseq-count?

Quote:
Originally Posted by Simon Anders
They are counted either as "ambiguous" (default mode "union") or as "no feature" ("intersection" modes).
11-15-2013, 07:17 AM   #26
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 991

If TopHat marks the multimapping reads with the "NH" tag (and, IIRC, it does), htseq-count filters these out for you, too.
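
To check whether a particular alignment file actually carries the NH tag, something along these lines might help. It is only a rough sketch: `count_nh` is a made-up helper name (not part of htseq-count or samtools), and it expects plain SAM text on standard input.

```shell
# Summarise the NH (number of reported alignments) tag in SAM text on stdin.
# Prints three tab-separated counts: unique (NH=1), multi (NH>1), untagged.
# count_nh is a hypothetical helper name used only for this illustration.
count_nh() {
  awk -F '\t' '
    /^@/ { next }                        # skip header lines
    {
      nh = ""
      for (i = 12; i <= NF; i++)         # optional fields start at column 12
        if ($i ~ /^NH:i:/) nh = substr($i, 6)
      if (nh == "")        none++
      else if (nh + 0 > 1) multi++
      else                 uniq++
    }
    END { printf "%d\t%d\t%d\n", uniq, multi, none }
  '
}
```

For a BAM file this could be fed with `samtools view accepted_hits.bam | count_nh`. A large third number would suggest the aligner did not write NH tags, in which case the automatic filtering described above cannot be relied on.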
03-02-2014, 10:05 PM   #27
kcm.eid
Junior Member
 
Location: India

Join Date: Oct 2012
Posts: 2
Sorting SAM file properly can fix the warning

Hi,
I came across the same issue while using htseq-count.
I used
Code:
sort -n 2234_accepted_hits.sam > 2234_accepted_hits_sorted.sam
to sort my accepted_hits.sam, but I still got the warning message. When I looked more closely at the warning message and at my sorted SAM file, I found that the file was not sorted properly. One read that triggers the warning appears as follows:

ERR032234.10311751 147 chr12 100208754 50 76M = 100208593 -237 GGCAGCCCAGGGCTTGGTGTTGGCAGTTTGAGTTCGTGAGATAGAAAGAGTGGGGTGTCAAGGCAGTACCCCTGAG >>[email protected]=<9;9?>A<=;A;7>5;;;;[email protected]=<[email protected]:;5AA;[email protected]@<<<;<;=8=5;@;8;<?.55>7?7> XA:i:0 MD:Z:76 NM:i:0 NH:i:1

ERR032234.1031175 145 chr1 51757180 50 76M = 51755662 -1594 CTCGGACTTGATCTGCCCAGACTTTTGGTCAGCAAGGAGAAGGTTATTGTTTGTTAAGAGGAAAATCCGAGATGTA A3=>D<D??>[email protected]?9DDD>DDBBDDDDDDDDDBDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDDD XA:i:0 MD:Z:76 NM:i:0 NH:i:1

ERR032234.10311751 99 chr12 100208593 50 76M = 100208754 237 CTAAAAGCTTACCTCCAAAACAGGATTCTTGTGTAACTAGGAATCCTGCATGAGAACCAGAAACCCTAACCTCCGA [email protected]?AC8>ACC>AA<C?8B=6B<98?=>BC<C9=AAA<CA>A=A:=BB=>>::<>@[email protected]*2:;8:7><<;7<><; XA:i:0 MD:Z:76 NM:i:0 NH:i:1


So the reads were not sorted correctly: the two records for ERR032234.10311751 are separated by a record for a different read. Following instructions I found elsewhere, I used

Code:
sort -bn -k 1.11,1 2234_accepted_hits.sam > 2234_accepted_hits_sorted.sam
My reads have IDs like this: ERR032234.1031175
This sort command sorts the file on field 1, from character 11 to the end of the field, numerically.


Now htseq-count is working fine.
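
A variant that does not depend on the character offset of the numeric part is sketched below. It rests on two assumptions: that htseq-count only needs the two mates of a pair to be adjacent rather than in any particular order, and that header lines should stay at the top of the file. `sort_sam_by_name` is a made-up name. One plausible reason the plain `sort -n` interleaved records is that the names are not numeric, so sort falls back to comparing whole lines under the current locale; forcing `LC_ALL=C` and restricting the key to field 1 avoids that.

```shell
# Name-sort a SAM file so that records sharing a read name end up adjacent.
# sort_sam_by_name is a hypothetical helper name; it takes one SAM file path.
sort_sam_by_name() {
  tab=$(printf '\t')
  # Header lines start with "@"; emit them first, in their original order.
  grep '^@' "$1" || true
  # Sort alignment lines on field 1 only, bytewise (LC_ALL=C). The -s flag
  # (stable sort, a common GNU/BSD extension) keeps ties in input order.
  grep -v '^@' "$1" | LC_ALL=C sort -s -t "$tab" -k 1,1
}
```

Usage would be `sort_sam_by_name 2234_accepted_hits.sam > 2234_accepted_hits_sorted.sam`. For BAM input, `samtools sort -n` is the usual way to get a name-sorted file.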

Last edited by kcm.eid; 03-02-2014 at 10:07 PM. Reason: correcting instructions
02-11-2015, 12:02 PM   #28
raphael123
Member
 
Location: Mc Gill -- Montreal

Join Date: Dec 2013
Posts: 37

A possible problem :

According to htseq documentation : http://www-huber.embl.de/users/ander.../counting.html

"The fact that the records describe the same fragment can be seen from the fact that they have the same read name"

So TopHat, for instance, can output read pairs whose two mates have different names (with a 1 or 2 added at the end of the name).

In that case, simply strip the last character of every read name so that both mates match:

(samtools view -H in.bam; samtools view in.bam | awk 'BEGIN{FS=OFS="\t"} {$1=substr($1,1,length($1)-1); print}') | samtools view -bSh - > in.samereadname.bam
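
Dropping the last character unconditionally would also shorten names that carry no mate suffix at all. A more cautious variant is sketched here; note that the exact suffix format ("/1" and "/2") is an assumption, since the post only says that 1 or 2 is added at the end, and `strip_mate_suffix` is a made-up name.

```shell
# Strip a trailing "/1" or "/2" from the read name (column 1) of SAM text on
# stdin; header lines and names without such a suffix pass through unchanged.
# strip_mate_suffix is a hypothetical helper name, and the "/1" and "/2"
# suffix format is an assumption that should be checked against the real file.
strip_mate_suffix() {
  awk 'BEGIN { FS = OFS = "\t" }
       /^@/ { print; next }
       { sub(/\/[12]$/, "", $1); print }'
}
```

It could be used as `samtools view -h in.bam | strip_mate_suffix | samtools view -bS - > in.samereadname.bam`.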
Tags
bowtie2, htseq-count, tophat, warning message
