Hello,
I am trying to use htseq-count as follows:
htseq-count bin2_s_7_bwa.sam Mycobacterium_bovis_BCG_Pasteur_1173P2.gff
But I get the following error:
Error occured in line 10 of file /mnt/ScratchPool/rouamsl/TB_sequencing/ReadCount/Mycobacterium_bovis_BCG_Pasteur_1173P2.gff.
Error: The attribute string seems to contain mismatched quotes.
[Exception type: ValueError, raised in __init__.py:168]
My .gff file looks like this:
##gff-version 3
#!gff-spec-version 1.20
#!processor NCBI annotwriter
##sequence-region NC_008769.1 1 4374522
##species http://www.ncbi.nlm.nih.gov/Taxonomy....cgi?id=410289
##sequence-region NC_008769.1 1 4374522
##species http://www.ncbi.nlm.nih.gov/Taxonomy....cgi?id=410289
NC_008769.1 RefSeq region 1 4374522 . + . ID=id0;Dbxref="taxon:410289";Is_circular=true;gbkey=Src;genome=chromosome;mol_type="genomic DNA";strain="BCG Pasteur 1173P2"
NC_008769.1 RefSeq gene 1 1524 . + . ID=gene0;Name=dnaA;Dbxref="GeneID:4697358";gbkey=Gene;gene=dnaA;locus_tag=BCG_0001
NC_008769.1 RefSeq CDS 1 1524 . + 0 ID=cds0;Name=YP_976107.1;Parent=gene0;Note="binds to the dnaA-box as an ATP-bound complex at the origin of replication during the initiation of chromosomal replication%3B can also affect transcription of multiple genes including itself.";Dbxref="GeneID:4697358","GI:121635884";gbkey=CDS;product=YP_976107.1;protein_id=YP_976107.1;transl_table=11
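Line 10 of the file is the CDS line above, and it is the only one in the excerpt where an attribute has two separately quoted values joined by a comma (`Dbxref="GeneID:4697358","GI:121635884"`). That this pattern triggers the "mismatched quotes" error is an assumption, not something confirmed from the htseq-count source. A minimal Python sketch to list such lines, assuming the real file is tab-separated with nine columns; the function names `find_suspect_lines` and `merge_quoted_lists` are hypothetical helpers, not part of HTSeq:

```python
def find_suspect_lines(gff_lines):
    """Return (line_number, attributes) for feature lines whose attribute
    column contains two quoted values joined by a comma, i.e. `","`."""
    suspects = []
    for number, line in enumerate(gff_lines, start=1):
        if line.startswith("#") or not line.strip():
            continue  # skip directives, comments and blank lines
        columns = line.rstrip("\n").split("\t")
        if len(columns) != 9:
            continue  # not a standard 9-column feature line
        attributes = columns[8]
        if '","' in attributes:
            suspects.append((number, attributes))
    return suspects


def merge_quoted_lists(attributes):
    """Hypothetical workaround (an assumption, untested with htseq-count):
    rewrite "a","b" as "a,b" so each attribute value has one pair of quotes."""
    return attributes.replace('","', ',')
```

It could be run on the downloaded file like this:

```python
with open("Mycobacterium_bovis_BCG_Pasteur_1173P2.gff") as handle:
    for number, attributes in find_suspect_lines(handle):
        print(number, merge_quoted_lists(attributes))
```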
I found a similar post on the forum, but it was not solved, and it concerned GTF files rather than GFF.
I downloaded my .gff file as follows (from the NCBI website):
wget --timestamping 'ftp://ftp.ncbi.nih.gov/genomes/Bacteria/Mycobacterium_bovis_BCG_Pasteur_1173P2_uid58781/NC_008769.gff' -O Mycobacterium_bovis_BCG_Pasteur_1173P2.gff
Thank you for your help!
Regards,
S