Hi everyone,
I used cuffmerge to obtain a merged.gtf file, but 3700 entries in it contain '.' in the strand field instead of '+' or '-'. I checked the entries without strand information, and each of them has the class code "u", which stands for unknown.
I want to use this merged.gtf file with htseq-count to get read counts (I eventually plan to do a differential gene expression analysis). When I run htseq-count with merged.gtf, I get the following error:
Error occured when processing GFF file (line 360833 of file ./merged_asm/merged.gtf):
Feature XLOC_003190 at chr1:[1285003,1285358)/. does not have strand information but you are running htseq-count in stranded mode. Use '--stranded=no'.
[Exception type: SystemExit, raised in count.py:59]
I know the error is due to the missing strand information. I have two options here:
1) Use --stranded=no
2) Remove all the entries that lack strand information
What do you guys suggest I do? Is removing the entries without strand information better than using --stranded=no? I am worried that using --stranded=no will affect the resulting counts. Are there any other suggestions?
FYI, I have also posted this question on Biostars: http://www.biostars.org/p/87578/
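For reference, option 2 (dropping the entries without strand information) could be sketched roughly as below. This is only a minimal sketch, not something taken from cuffmerge or htseq-count: the function names `split_by_strand` and `filter_gtf` and the output file name are made up for illustration, and it assumes a standard tab-separated GTF where the strand is the 7th column.

```python
def split_by_strand(gtf_lines):
    """Split GTF lines into (kept, dropped).

    Lines whose strand column is '+' or '-' are kept; any other
    data line (for example strand '.') is dropped. Comment lines
    and blank lines are passed through unchanged in `kept`.
    """
    kept, dropped = [], []
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            kept.append(line)
            continue
        fields = line.rstrip("\n").split("\t")
        # Assumption: standard GTF layout, strand is column 7 (index 6).
        if len(fields) >= 7 and fields[6] in ("+", "-"):
            kept.append(line)
        else:
            dropped.append(line)
    return kept, dropped


def filter_gtf(in_path, out_path):
    """Write only the stranded records of in_path to out_path.

    Returns the number of dropped lines so it can be compared with
    the number of unstranded entries expected in the input.
    """
    with open(in_path) as src:
        kept, dropped = split_by_strand(src)
    with open(out_path, "w") as dst:
        dst.writelines(kept)
    return len(dropped)
```

Calling `filter_gtf("./merged_asm/merged.gtf", "merged.stranded.gtf")` (the second name is hypothetical) would write a filtered copy and leave the original untouched, so both options can still be compared afterwards.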