SEQanswers

08-07-2013, 11:38 PM   #1
anikng
Problem with HTSeq-count

Hi All,
I have a problem counting reads with HTSeq-count. The error is:

csg@csg-W650EH:~/Downloads$ python -m HTSeq.scripts.count Nipponbare_ref_assembly.sam chr1.gff3
Error occured in line 3 of file chr1.gff3.
Error: Feature LOC_Os01g01010.1:exon_1 does not contain a 'gene_id' attribute
[Exception type: SystemExit, raised in count.py:55]
csg@csg-W650EH:~/Downloads$

The following is a portion of the GFF3 file:
##gff-version 3
Chr1 MSU_osa1r7 mRNA 2903 10817 . + . ID=LOC_Os01g01010.1;Name=LOC_Os01g01010.1;Parent=LOC_Os01g01010
Chr1 MSU_osa1r7 exon 2903 3268 . + . ID=LOC_Os01g01010.1:exon_1;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 3354 3616 . + . ID=LOC_Os01g01010.1:exon_2;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 4357 4455 . + . ID=LOC_Os01g01010.1:exon_3;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 5457 5560 . + . ID=LOC_Os01g01010.1:exon_4;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 7136 7944 . + . ID=LOC_Os01g01010.1:exon_5;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 8028 8150 . + . ID=LOC_Os01g01010.1:exon_6;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 8232 8320 . + . ID=LOC_Os01g01010.1:exon_7;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 8408 8608 . + . ID=LOC_Os01g01010.1:exon_8;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 9210 9617 . + . ID=LOC_Os01g01010.1:exon_9;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 10104 10187 . + . ID=LOC_Os01g01010.1:exon_10;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 10274 10430 . + . ID=LOC_Os01g01010.1:exon_11;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 exon 10504 10817 . + . ID=LOC_Os01g01010.1:exon_12;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 five_prime_UTR 2903 3268 . + . ID=LOC_Os01g01010.1:utr_1;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 five_prime_UTR 3354 3448 . + . ID=LOC_Os01g01010.1:utr_2;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 3449 3616 . + . ID=LOC_Os01g01010.1:cds_1;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 4357 4455 . + . ID=LOC_Os01g01010.1:cds_2;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 5457 5560 . + . ID=LOC_Os01g01010.1:cds_3;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 7136 7944 . + . ID=LOC_Os01g01010.1:cds_4;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 8028 8150 . + . ID=LOC_Os01g01010.1:cds_5;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 8232 8320 . + . ID=LOC_Os01g01010.1:cds_6;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 8408 8608 . + . ID=LOC_Os01g01010.1:cds_7;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 9210 9617 . + . ID=LOC_Os01g01010.1:cds_8;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 10104 10187 . + . ID=LOC_Os01g01010.1:cds_9;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 CDS 10274 10297 . + . ID=LOC_Os01g01010.1:cds_10;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 three_prime_UTR 10298 10430 . + . ID=LOC_Os01g01010.1:utr_3;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 three_prime_UTR 10504 10817 . + . ID=LOC_Os01g01010.1:utr_4;Parent=LOC_Os01g01010.1
Chr1 MSU_osa1r7 mRNA 2984 10562 . + . ID=LOC_Os01g01010.2;Name=LOC_Os01g01010.2;Parent=LOC_Os01g01010
Chr1 MSU_osa1r7 exon 2984 3255 . + . ID=LOC_Os01g01010.2:exon_1;Parent=LOC_Os01g01010.2
Chr1 MSU_osa1r7 exon 3354 3616 . + . ID=LOC_Os01g01010.2:exon_2;Parent=LOC_Os01g01010.2
Chr1 MSU_osa1r7 exon 4357 4455 . + . ID=LOC_Os01g01010.2:exon_3;Parent=LOC_Os01g01010.2








Please bear in mind that I am new to RNA-seq data analysis.

Thank you,
anikng

08-08-2013, 08:50 AM   #2
jparsons

By default, htseq-count takes the feature ID from the gene_id attribute in the GFF file. Since your GFF file does not have a gene_id attribute, it has nothing to use.

If you set the -i option to ID (the attribute name your GFF file actually uses), it should be happier.
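
To see which attribute names a GFF3 file actually offers for the -i (--idattr) option, the ninth column can be inspected directly. The following is a minimal sketch and not part of HTSeq: the function names parse_gff3_attributes and attribute_keys_by_type are made up for illustration, and it assumes a tab-separated GFF3 file whose attributes are key=value pairs separated by semicolons.

```python
from collections import defaultdict


def parse_gff3_attributes(column9):
    """Split a GFF3 attribute column ('key=value;key=value') into a dict."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if not pair:
            continue  # tolerate a trailing semicolon
        key, _, value = pair.partition("=")
        attributes[key.strip()] = value.strip()
    return attributes


def attribute_keys_by_type(lines):
    """Map each feature type (column 3) to the set of attribute keys seen."""
    keys = defaultdict(set)
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) != 9:
            continue  # not a well-formed feature line
        keys[fields[2]].update(parse_gff3_attributes(fields[8]))
    return dict(keys)


# Two lines taken from the excerpt in post #1, with tabs restored.
example = [
    "##gff-version 3",
    "Chr1\tMSU_osa1r7\tmRNA\t2903\t10817\t.\t+\t.\t"
    "ID=LOC_Os01g01010.1;Name=LOC_Os01g01010.1;Parent=LOC_Os01g01010",
    "Chr1\tMSU_osa1r7\texon\t2903\t3268\t.\t+\t.\t"
    "ID=LOC_Os01g01010.1:exon_1;Parent=LOC_Os01g01010.1",
]
keys = attribute_keys_by_type(example)
print(sorted(keys["exon"]))
```

In this excerpt the exon lines carry only ID and Parent, so those are the names that could be passed to -i. Note that ID is different for every exon, so counts would be reported per exon, whereas Parent is shared by all exons of one transcript.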

08-08-2013, 05:05 PM   #3
anikng

Thank you, jparsons.
Following your suggestion, I changed the attribute name from 'ID' to 'gene_id' in the GFF file, and now it is working.

anikng

08-16-2013, 08:33 PM   #4
anikng

Hi All,

I get an error while running the HTSeq count tool: "'seq' and 'qualstr' do not have the same length." I have seen someone else report this error, and the discussion there pointed to a bug. But in my case I feel it is different.

csg@csg-W650EH:~/Downloads/bowtie-0.12.8$ python -m HTSeq.scripts.count SEEDLING_ROOT_ORIGINA.sam all.gff3 >> SEDLINGROOT_output.txt

The command above writes SEDLINGROOT_output.txt with zero reads for every transcript. As somebody mentioned in this forum, I noticed that the chromosomes are named 'Chr*' in the GFF file but something like 'gi...' in the SAM file. When I changed the 'gi...' IDs to the corresponding Chr IDs (i.e., as in the GFF file), the "'seq' and 'qualstr' do not have the same length" error was displayed.
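
One way to avoid hand-editing mistakes is to rewrite the reference names programmatically, so that only the SN: tag of @SQ header lines and the RNAME (3rd) and RNEXT (7th) columns of alignment lines change, while the read sequence and quality string are copied untouched. This is a sketch under assumptions: rename_references is a hypothetical helper, not an HTSeq or Bowtie function, and the mapping shown is only illustrative; the real correspondence between 'gi|...' identifiers and 'Chr*' names has to be checked against the reference that was used.

```python
def rename_references(lines, mapping):
    """Yield SAM lines with reference names replaced according to mapping.

    Only the SN: tag of @SQ headers and the RNAME (3rd) and RNEXT (7th)
    columns of alignment lines are touched; other fields are copied as-is.
    """
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if line.startswith("@"):
            if fields[0] == "@SQ":
                fields = [
                    "SN:" + mapping.get(f[3:], f[3:]) if f.startswith("SN:") else f
                    for f in fields
                ]
        elif len(fields) >= 11:
            fields[2] = mapping.get(fields[2], fields[2])
            fields[6] = mapping.get(fields[6], fields[6])
        yield "\t".join(fields)


# Illustrative mapping only; verify it against your own reference.
mapping = {"gi|297598437|ref|NC_008394.4|": "Chr1"}
sam = [
    "@SQ\tSN:gi|297598437|ref|NC_008394.4|\tLN:45064769",
    "read1\t0\tgi|297598437|ref|NC_008394.4|\t100\t255\t5M\t*\t0\t0\tACGTA\tIIIII",
]
renamed = list(rename_references(sam, mapping))
for line in renamed:
    print(line)
```

Because the lines are split and rejoined on tabs only, characters inside the quality string are never interpreted or altered, and the header ends up consistent with the alignment lines.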

The error is reported at line 19. Below are the first few lines of the SAM file in which I am getting the error:
@HD VN:1.0 SO:unsorted
@SQ SN:gi|297598437|ref|NC_008394.4| LN:45064769
@SQ SN:gi|297600179|ref|NC_008395.2| LN:36823111
@SQ SN:gi|297602023|ref|NC_008396.2| LN:37257345
@SQ SN:gi|297603645|ref|NC_008397.2| LN:35863200
@SQ SN:gi|297605017|ref|NC_008398.2| LN:30039014
@SQ SN:gi|297606578|ref|NC_008399.2| LN:32124789
@SQ SN:gi|297607852|ref|NC_008400.2| LN:30357780
@SQ SN:gi|297609017|ref|NC_008401.2| LN:28530027
@SQ SN:gi|297610002|ref|NC_008402.2| LN:23843360
@SQ SN:gi|297611005|ref|NC_008403.2| LN:23661561
@SQ SN:gi|297612483|ref|NC_008404.2| LN:30828668
@SQ SN:gi|297613623|ref|NC_008405.2| LN:27757321
@PG ID:Bowtie VN:0.12.7 CL:"bowtie -S /home/csg/Downloads/bowtie-0.12.8/indexes/rice /home/csg/Downloads/bowtie-0.12.8/SEDLING_ROOT.fastq"
SEDLING_ROOT.1 0 Chr1 33035177 255 35M * 0 0 GGTTGCTTTTAGAGAAACTTGGACACTTTGTTTAT IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII.I XA:i:1 MD:Z:25T9 NM:i:1
SEDLING_ROOT.2 0 Chr1 29608234 255 35M * 0 0 GGCAACGGATATCTCGGCTCTCGCATCGATGAAGA IIIIIIIIIIIIIIIIIIIIII/I&FI1I5II8'3 XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.3 0 Chr1 29608235 255 35M * 0 0 GCAACGGATATCTCGGCTCTCGCATCGATGAAGAA IIIIIIIIIIIIIIIIIIIIIII;III*.I?I51G XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.4 0 Chr1 29606240 255 35M * 0 0 GTCATATGCTTGTCTCAAAGATTAAGCCATGCATG IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIDIII XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.5 0 Chr1 29608234 255 35M * 0 0 GGCAACGGATATCTCGGCTCTCGCACCGATGAAGA IIIIIIIIIIIIIIIIIIIIHI2F4'0+=,($6@( XA:i:1 MD:Z:25T9 NM:i:1
SEDLING_ROOT.6 16 Chr1 29607225 255 35M * 0 0 ATACCGTCCTAGTCTCAACCATAAACGATGCCGAC I87I<?IIIIIIIIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.7 16 Chr1 29603552 255 35M * 0 0 AAGCTACCGTGTGCCGGATTATGACTGAACGCCTC =6+II8HII2IIIIIICIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.8 16 Chr1 29607831 255 35M * 0 0 GCGGTGACTACGTCCCTGCCCTTTGTACACACCGC **.$I@(IIII:IFIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:3T31 NM:i:1
SEDLING_ROOT.9 0 Chr1 29606233 255 35M * 0 0 GCCAGTAGTCATATGCTTGTCTCAAAGATTAAGCC IIIIIIIIIIIIIIIIII4IIIIIIIIIIIIII8I XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.10 HWI-EAS80_2_FC20AWLAAXX:8:1:882:460 length=35 4 * 0 0 * * 0 0 GTGAACTATGCCTGAGCGGGGCGAAGCCAGAGGAA IIIIIIIIIIIIIIIIIIIIIIIIIIII'7A@I)8 XM:i:0
SEDLING_ROOT.11 HWI-EAS80_2_FC20AWLAAXX:8:1:891:382 length=35 4 * 0 0 * * 0 0 GTGTTGGTCGATTAAGACAGCAGGACGGTGGTCCT IIIIIIIIIIIIIIIIIIIIIIIIFFII;IE%$-> XM:i:0
SEDLING_ROOT.12 0 Chr1 29606966 255 35M * 0 0 GTTACTTTGAAGAAATTAGAGTGCTCAAAGCAAGC IIIIIIIIIIIIIIIIIIIIIIIIIIIII4II+,C XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.13 16 Chr1 29608348 255 35M * 0 0 GCGTGCGGGCCGGGGGCACGCCTGCCTGGGCGTCA &(+&+.*1(1$AIII:IIIIIIIIIIIIIIIIIII XA:i:1 MD:Z:2C0A0T1C5A22 NM:i:5
SEDLING_ROOT.14 0 Chr1 29608240 255 35M * 0 0 GGATATCTCGGCTCTCGCATCGATGAAGAAAGTAG IIIIIIIIIIIIIIIIIIIIIIII?IIIII+-)=& XA:i:0 MD:Z:30C4 NM:i:1
SEDLING_ROOT.15 0 Chr1 29608232 255 35M * 0 0 TCGGCAACGGATATCTCGGCTCTCGCATCGATGAA IIIIIIIIIIIIIIIIIIIIII:34I+I2B$II$I XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.16 HWI-EAS80_2_FC20AWLAAXX:8:1:930:891 length=35 4 * 0 0 * * 0 0 GAACTATGCCTGAGCGGGGCGAAGCCAGAGGAAAC IIIIIIIIIIIIIIIIIII8I1II7CII&II*2?1 XM:i:0
SEDLING_ROOT.17 16 Chr1 29606373 255 35M * 0 0 TTCTAGAGCTAATACGTGCAACAAACCCCGACTCC :57BIIII=IIIIIIIIIIIIIIIIIIIIIIIIII XA:i:2 MD:Z:7A25T1 NM:i:2
SEDLING_ROOT.18 16 Chr1 29606730 255 35M * 0 0 GGAATGAGTACAATCTAAATCCCTTAACGAGGATC >;IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:35 NM:i:0
SEDLING_ROOT.19 16 Chr1 28694491 255 35M * 0 0 CAGCATGTGTAAACTATTTTGCTTATTCACTGATC 2I7GII4IIIIIIIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:35 NM:i:0


Is this because of an error introduced while editing the SAM file? Please suggest a solution.


Thank you,

anikng
Seoul
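
A check comparable to the one behind this error message can be run outside HTSeq to locate damaged lines after an edit. The following is a minimal sketch, assuming a tab-separated SAM file with SEQ in column 10 and QUAL in column 11 as laid out in the SAM specification; find_length_mismatches is a hypothetical helper, not an HTSeq function.

```python
def find_length_mismatches(lines):
    """Return (line_number, reason) for alignment lines that look damaged."""
    problems = []
    for number, line in enumerate(lines, start=1):
        if line.startswith("@") or not line.strip():
            continue  # header or blank line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            problems.append((number, "fewer than 11 tab-separated fields"))
            continue
        seq, qual = fields[9], fields[10]
        # '*' means the sequence or the quality string is not stored
        if seq != "*" and qual != "*" and len(seq) != len(qual):
            problems.append(
                (number, f"SEQ has {len(seq)} characters, QUAL has {len(qual)}")
            )
    return problems


sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tChr1\t100\t255\t5M\t*\t0\t0\tACGTA\tIIIII",  # consistent
    "read2\t0\tChr1\t200\t255\t5M\t*\t0\t0\tACGTA\tIIII",   # QUAL one short
    "read3 0 Chr1 300 255 5M * 0 0 ACGTA IIIII",             # tabs lost
]
problems = find_length_mismatches(sam)
for number, reason in problems:
    print(number, reason)
```

A line reported with fewer than 11 fields may mean that tabs were replaced by spaces during editing; a length mismatch may mean that characters in the quality string were altered or dropped.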
All times are GMT -8.