**dpryan** · 08-08-2013, 07:06 AM

You might use the "-o" option to try to debug things.

**jparsons** · 08-08-2013, 08:48 AM

I'd look at the various -s options. You do seem to be aware that they exist and that your run is stranded, but is "reverse" the correct orientation?

**moser** · 08-09-2013, 03:05 AM

Hi, thank you for the fast response. I already played with the -s option, which doesn't change the output, and I am pretty sure "reverse" is the correct setting for stranded TruSeq Illumina RNA-seq reads.

When using the -o option, the SAM file contains only multimapped reads like:

OBIWAN:33:D276GACXX:8:2311:4833:66136 161 bac2:383-131865 65536 3 51M = 65923 3264 AATGAGCAACCAGCAGCCACAACCCTGCATGGGGTAGTTCTTCAAGTGCCA @CCFFFFFHHHHHJJIJJJJIJJJJIIJGIJJJJ?FHGHIIIJJJJGIJIG AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:51 YT:Z:UU NH:i:2 CC:Z:= CP:i:65536 HI:i:0 XF:Z:alignment_not_unique

But when I look at the original SAM file used for the counting, I find uniquely aligned reads on the same feature (from position 65171 to 70109 of bac2). Unfortunately they don't get reported.

OBIWAN:33:D276GACXX:8:2111:9231:51465 163 bac2:383-131865 65541 50 51M = 65637 147 GCAACCAGCAGCCACAACCCTGCATGGGGTAGTTCTTCAAGTGCCAGTGAT CCCFFFFFHHHHHJJJJJJJJJJJJJJJJCGIHIJIJJJJJHIJJJJFHII AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:51 YT:Z:UU NH:i:1

By the way, do you know what the = sign in column 7 means? Should this be the strand information?

best, Michel

**dpryan** · 08-09-2013, 03:20 AM

It seems very odd that the unique alignments aren't output to the file specified with -o. Maybe try making a miniature SAM file with just a couple uniquely aligned pairs that you know should be counted in transcripts and then rerun htseq-count (optionally with a miniature annotation file). BTW, what version of htseq-count and python are you using?

Regarding the = sign, that just means that the mate is mapped to the same chromosome/contig. If the mates are mapped to different contigs, then that field will contain the name of the other contig.
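To make the column 7 rule concrete, here is a minimal Python 3 sketch that resolves the mate's reference name and reads the strand from the FLAG field (bits 0x10 and 0x20), which is where SAM actually stores strand. The function name `parse_sam_record`, the placeholder read name, and the `*` placeholders for SEQ and QUAL are assumptions made for illustration; the other field values are taken from the second record posted above.

```python
def parse_sam_record(line):
    """Pull the mate and strand information out of one SAM alignment line."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    rname, rnext = fields[2], fields[6]
    if rnext == "=":
        mate_ref = rname      # mate maps to the same chromosome/contig
    elif rnext == "*":
        mate_ref = None       # no mate information available
    else:
        mate_ref = rnext      # mate maps to a different contig
    return {
        "qname": fields[0],
        "rname": rname,
        "pos": int(fields[3]),
        "mate_ref": mate_ref,
        "mate_pos": int(fields[7]),
        "reverse_strand": bool(flag & 0x10),       # this read is on the reverse strand
        "mate_reverse_strand": bool(flag & 0x20),  # the mate is on the reverse strand
    }


# Placeholder read name; SEQ and QUAL are given as "*" to keep the line short.
record = "\t".join([
    "read1", "163", "bac2:383-131865", "65541", "50", "51M",
    "=", "65637", "147", "*", "*", "NH:i:1",
])
parsed = parse_sam_record(record)
print(parsed["mate_ref"], parsed["reverse_strand"], parsed["mate_reverse_strand"])
```

With flag 163, bit 0x10 is clear and bit 0x20 is set, so this read is on the forward strand and its mate is on the reverse strand.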

**moser** · 08-09-2013, 05:30 AM

I am using htseq-count version 0.5.4p3 and Python 2.7.3.

**dpryan** · 08-09-2013, 06:40 AM

Originally posted by moser:

I am using htseq-count version 0.5.4p3 and Python 2.7.3.

I'm unable to reproduce this issue with the same versions of htseq-count and python. You might try making a SAM file with just a few reads and an annotation with just a single gene to which the reads should align and rerun htseq-count. If the reads are missing from the file specified by -o, then perhaps just post things here.
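One way to build such a miniature SAM file is sketched below in Python 3. It assumes the aligner writes an `NH:i:1` tag on uniquely aligned reads, as in the records posted above; the function name `extract_unique_pairs` and the demo records are made up for illustration. Mates are written next to each other, so the output stays grouped by read name.

```python
import io


def extract_unique_pairs(sam_in, sam_out, max_pairs=5):
    """Copy header lines plus the first `max_pairs` uniquely aligned pairs."""
    pending = {}   # read name -> first mate seen, waiting for its partner
    written = 0
    for line in sam_in:
        line = line.rstrip("\n") + "\n"
        if line.startswith("@"):
            sam_out.write(line)          # keep the header intact
            continue
        if written >= max_pairs:
            break
        fields = line.rstrip("\n").split("\t")
        if "NH:i:1" not in fields[11:]:  # skip anything not flagged as unique
            continue
        qname = fields[0]
        if qname in pending:
            sam_out.write(pending.pop(qname))
            sam_out.write(line)
            written += 1
        else:
            pending[qname] = line
    return written


def _demo_record(qname, flag, nh):
    # Made-up alignment line; only the read name, flag and NH tag vary.
    return "\t".join([qname, str(flag), "bac2:383-131865", "65541", "50", "51M",
                      "=", "65637", "147", "*", "*", "NH:i:%d" % nh]) + "\n"


demo_in = io.StringIO(
    "@HD\tVN:1.0\n"
    + _demo_record("r1", 99, 1) + _demo_record("r1", 147, 1)
    + _demo_record("r2", 99, 2) + _demo_record("r2", 147, 2)
    + _demo_record("r3", 99, 1) + _demo_record("r3", 147, 1)
)
demo_out = io.StringIO()
n_pairs = extract_unique_pairs(demo_in, demo_out)
print(n_pairs)
print(demo_out.getvalue(), end="")
```

For real data, replace the `StringIO` objects with open file handles for the full SAM file and the miniature output file.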

**aggp11** · 08-09-2013, 08:28 AM

Sorry if this seems too simple to be the issue, but did you check that the locations of your features match in both the SAM and GTF files?
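Part of that check is whether the reference names in column 3 of the SAM file are spelled the same way as the sequence names in column 1 of the GTF. A rough Python 3 sketch follows; the function names are illustrative, and the GTF line in the demo is invented (the actual annotation was not posted), so the mismatch it reports is only an example of what to look for.

```python
def sam_reference_names(sam_lines):
    """Collect the reference names (column 3) used by the alignments."""
    names = set()
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":
            names.add(fields[2])
    return names


def gtf_sequence_names(gtf_lines):
    """Collect the sequence names (column 1) used by the annotation."""
    names = set()
    for line in gtf_lines:
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t")[0])
    return names


def compare_names(sam_lines, gtf_lines):
    """Report which sequence names are shared and which appear in one file only."""
    sam_names = sam_reference_names(sam_lines)
    gtf_names = gtf_sequence_names(gtf_lines)
    return {
        "shared": sam_names & gtf_names,
        "sam_only": sam_names - gtf_names,
        "gtf_only": gtf_names - sam_names,
    }


# Demo input: the SAM reference name is the one from this thread,
# the GTF line is hypothetical.
demo_sam = ["read1\t163\tbac2:383-131865\t65541\t50\t51M\t=\t65637\t147\t*\t*\tNH:i:1\n"]
demo_gtf = ['bac2\texample\texon\t65171\t70109\t.\t+\t.\tgene_id "geneA";\n']
report = compare_names(demo_sam, demo_gtf)
print(report)
```

Any name listed under `sam_only` has no annotation with the same spelling, so reads aligned to it cannot be assigned to a feature. If the names do match, the next thing to compare is whether the feature coordinates use the same coordinate system as the alignments.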


htseq-count only not-unique counts
