Hi!
I'm having problems with my reference genome when using TopHat to generate FPKM values from RNA-seq data.
TopHat and Cufflinks allow me to supply a GTF/GFF reference file to link the results to my transcripts.
The problem is that I only have a GenBank-format file, and when I convert it with genbank2gff.pl and use the result with TopHat, I get this error:
"GFF Error at Dbxref (-): exon 7633-7818 (+) found on different strand; discarded." (repeated for 83 exons).
The original GenBank record for this position is:
gene 7633..7818
/gene="psbK"
CDS 7633..7818
/gene="psbK"
/codon_start=1
/transl_table=11
/product="photosystem II protein K"
/protein_id="AEB72208.1"
/db_xref="GI:329124652"
/translation="MLNTFSLIGICLNSTLYSSSFFFGKLPEAYAFLNPIVDIMPVIP
LFFFLLAFVWQAAVSFR"
The generated GFF3 file contains more than one line for this location:
"trnQ-UUG" ; product "tRNA-Gln"
JF772170 GenBank exon 7216 7287 . - . Name "trnQ-UUG" ; Parent "trnQ-UUG.r01"
JF772170 GenBank gene 7633 7818 . + . ID psbK ; Name psbK
JF772170 GenBank mRNA 7633 7818 . + . ID "psbK.t01" ; Parent psbK
JF772170 GenBank CDS 7633 7818 . + . Dbxref "GI:329124652" ; ID "psbK.p01" ; Name psbK ; Parent "psbK.t01" ; codon_start 1 ; product "photosystem II protein K" ; protein_id "AEB72208.1" ; transl_table 11 ; translation "length.61"
JF772170 GenBank exon 7633 7818 . + . Parent "psbK.t01"
JF772170 GenBank gene 8180 8290 . + . ID psbI ; Name psbI
JF772170 GenBank mRNA 8180 8290 . + . ID "psbI.t01" ; Parent psbI
JF772170 GenBank CDS 8180 8290 . + . Dbxref "GI:329124653" ; ID "psbI.p01" ; Name
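In case it helps anyone reproduce this, here is a minimal Python sketch of a check I would run on the converted file: it groups features by their ID/Parent values and reports any name that shows up on more than one strand. The function names (`parse_attributes`, `strand_conflicts`) are my own, and the parser assumes the `key value ; key "value"` attribute style shown in the listing above, not the `key=value` style of standard GFF3, so treat it as a rough diagnostic rather than a proper GFF parser.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Parse 'key value ; key "value"' pairs (the style in the listing above)."""
    attrs = {}
    for field in column9.split(";"):
        field = field.strip()
        if not field:
            continue
        # Key is everything up to the first space; the rest is the value.
        key, _, value = field.partition(" ")
        attrs[key] = value.strip().strip('"')
    return attrs


def strand_conflicts(lines):
    """Return {name: [strands]} for ID/Parent values seen on more than one strand."""
    strands = defaultdict(set)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        # Nine columns; the ninth (attributes) may itself contain spaces.
        cols = line.split(None, 8)
        if len(cols) < 9:
            continue
        strand = cols[6]
        attrs = parse_attributes(cols[8])
        for key in ("ID", "Parent"):
            if key in attrs:
                strands[attrs[key]].add(strand)
    return {name: sorted(s) for name, s in strands.items() if len(s) > 1}
```

Calling `strand_conflicts(open("converted.gff"))` (the filename is a placeholder) returns an empty dict when every ID/Parent group sits on a single strand, which would suggest the strand complaint comes from how the attribute column is being read rather than from the coordinates themselves.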
Any ideas or alternatives?
Thank you