Hi,
I've searched the forum for a problem similar to mine, and there is a thread dealing with it, but it's from two years ago and it's sometimes hard to get help on resurrected threads.
So, I have my RNAseq data back and I've aligned it with Bowtie2. I've sorted and converted the SAM files into sorted BAM files with samtools. I'm now trying to use HTSeq for counting reads, but I get a parsing error.
Specifically, I'm running this command
Code:
samtools view 11A_align_sort.bam | htseq-count -s no -i ID - ~/aedes_genomic/Aedes-aegypti-Liverpool_BASEFEATURES.gff3 > 11A_count
and get the error
Code:
Error occured in line 15 of file /home/emiliano/aedes_genomic/Aedes-aegypti-Liverpool_BASEFEATURES.gff3. Error: Failure parsing GFF attribute line [Exception type: ValueError, raised in __init__.py:171]
For reference, these are the first 25 lines from the gff3 file:
Code:
##gff-version 3
##feature-ontology so.obo
##attribute-ontology gff3_attributes.obo
#
# Dumped from database.
#
# using SO:0000704 for VectorBase Gene
# using SO:0000234 for VectorBase Transcript
# using SO:0000147 for VectorBase Exon
# using SO:0000316 for VectorBase CDS
# using SO:0000204 for VectorBase 5'UTR
# using SO:0000205 for VectorBase 3'UTR
#
##sequence-region supercontig:AaegL1:supercont1.1:1:5856339:1
supercont1.1 VectorBase contig 1 5856339 . . . ID=supercont1.1;molecule_type=dsDNA;GenBank:supercontig:AaegL1:supercont1.1:1:5856339:1;translation_table=1;topology=linear;localization=chromosomal;
supercont1.1 VectorBase gene 35414 53420 . + . ID=AAEL000064;
supercont1.1 VectorBase mRNA 35414 53420 . + . ID=AAEL000064-RA;Parent=AAEL000064;Dbxref=UniProtKB:Q8T4S1,UniProtKB:Q8T4S2,UniProtKB:Q8T4S3,UniProtKB:Q8T4S4,UniProtKB:Q8T4S5,UniProtKB:Q8T4S6,UniProtKB:Q9GSZ4,GenBank:AF288384,GenBank:AY064094,GenBank:AY064095,GenBank:AY064096,GenBank:AY064097,GenBank:AY064098,GenBank:AY064099,GenBank:CH477186,protein_id:AAG01014,protein_id:AAL85595,protein_id:AAL85596,protein_id:AAL85597,protein_id:AAL85598,protein_id:AAL85599,protein_id:AAL85600,protein_id:EAT48898,UniParc:UPI000007F997;description=hypothetical protein;
supercont1.1 VectorBase exon 35414 35644 . + . ID=E036654A;Parent=AAEL000064-RA;
supercont1.1 VectorBase exon 35699 35901 . + . ID=E036655A;Parent=AAEL000064-RA;
supercont1.1 VectorBase exon 35993 36098 . + . ID=E036656A;Parent=AAEL000064-RA;
supercont1.1 VectorBase exon 52193 52851 . + . ID=E036657A;Parent=AAEL000064-RA;
supercont1.1 VectorBase exon 52973 53420 . + . ID=E036658A;Parent=AAEL000064-RA;
supercont1.1 VectorBase five_prime_utr 35414 35523 . + . Parent=AAEL000064-RA
supercont1.1 VectorBase CDS 35524 35644 . + 0 Parent=AAEL000064-RA;
supercont1.1 VectorBase CDS 35699 35901 . + 2 Parent=AAEL000064-RA;
I'm not entirely sure what part of the line is causing the parsing to fail. It says it's on line 15, which is the first non-commented line, but it's a contig. Wouldn't HTSeq skip this and go for the lines with exons?
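One way to narrow down which part of line 15 trips the parser is to check each semicolon-separated entry in the ninth column. This is only a sketch, and it assumes the parser wants every attribute in `tag=value` form, which is what the GFF3 format describes. The function name `malformed_attributes` is made up for illustration and is not part of HTSeq. Splitting on whitespace is another assumption, used so that space-separated pastes like the one above work too; it only holds while the first eight columns contain no spaces.

```python
def malformed_attributes(gff_lines):
    """Yield (line_number, field) for attribute entries that lack 'tag=value' form."""
    for number, line in enumerate(gff_lines, start=1):
        line = line.rstrip("\n")
        # Skip blank lines, comments, and '##' directives.
        if not line or line.startswith("#"):
            continue
        # Column 9 holds the attributes; split at most 8 times so that
        # spaces inside attribute values stay in the last element.
        columns = line.split(None, 8)
        if len(columns) < 9:
            continue
        for field in columns[8].split(";"):
            field = field.strip()
            if field and "=" not in field:
                yield number, field


# Small demonstration on two feature lines copied from the excerpt above.
sample = [
    "##gff-version 3",
    "supercont1.1 VectorBase contig 1 5856339 . . . ID=supercont1.1;"
    "molecule_type=dsDNA;GenBank:supercontig:AaegL1:supercont1.1:1:5856339:1;"
    "translation_table=1;topology=linear;localization=chromosomal;",
    "supercont1.1 VectorBase gene 35414 53420 . + . ID=AAEL000064;",
]
for number, field in malformed_attributes(sample):
    print(f"line {number}: attribute without '=': {field}")
```

To run it on the real file, pass an open file handle instead of the `sample` list. On the excerpt, the only entry it flags is the `GenBank:supercontig:...` attribute on the contig line, so that attribute may be what the parser is rejecting, although I can't say for certain that this is the check HTSeq applies.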