Hello,
I am doing differential expression analysis on an insect bacterial endosymbiont. I have Illumina HiSeq reads that are paired-end and strand-specific.
I have mapped my reads to a preliminary genome assembly for the endosymbiont with bbmap, and am trying to summarize the results with featureCounts. I annotated the preliminary genome with RAST, and am having difficulty using the exported .gtf file from RAST in featureCounts.
This is a sample annotation from RAST:
tig00000001_pilon FIG CDS 734 1288 . + 2 ID=fig|707884.3.peg.1;Name=Mobile element protein
I am interested in summarizing at the MetaFeature level, since there is really no difference between genes and exons in this case. I used the below command (knowing that the Name field probably wasn't what featureCounts was looking for, since it doesn't represent a unique identifier).
featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="ID",GTF.featureType="Name",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")
I got this error:
|| WARNING no features were loaded in format GTF. ||
|| annotation format can be specified using '-F'. ||
Failed to open the annotation file /path/to/file/file.gtf, or its format is incorrect, or it contains no 'exon' features.
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file './.Rsubread_featureCounts_pid40183': No such file or directory
I then tried to edit the .gtf file to comply a bit better with the format specifications that I saw described (giving a unique identifier to 'exon', etc.):
tig00000001_pilon FIG CDS 734 1288 . + 2 gene_id 707884.3.peg.1; exon 707884.3.peg.1.1
I ran the below command, but I got the exact same error:
featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="gene_id",GTF.featureType="exon",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")
Is there something fundamentally wrong that I'm doing here?
Thanks!
Ryan
I am doing differential expression analysis on an insect bacterial endosymbiont. I have Illumina HiSeq reads that are paired-end and strand-specific.
I have mapped my reads to a preliminary genome assembly for the endosymbiont with bbmap, and am trying to summarize the results with featureCounts. I annotated the preliminary genome with RAST, and am having difficulty using the exported .gtf file from RAST in featureCounts.
This is a sample annotation from RAST:
tig00000001_pilon FIG CDS 734 1288 . + 2 ID=fig|707884.3.peg.1;Name=Mobile element protein
I am interested in summarizing at the MetaFeature level, since there is really no difference between genes and exons in this case. I used the below command (knowing that the Name field probably wasn't what featureCounts was looking for, since it doesn't represent a unique identifier).
featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="ID",GTF.featureType="Name",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")
I got this error:
|| WARNING no features were loaded in format GTF. ||
|| annotation format can be specified using '-F'. ||
Failed to open the annotation file /path/to/file/file.gtf, or its format is incorrect, or it contains no 'exon' features.
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open file './.Rsubread_featureCounts_pid40183': No such file or directory
I then tried to edit the .gtf file to comply a bit better with the format specifications that I saw described (giving a unique identifier to 'exon', etc.):
tig00000001_pilon FIG CDS 734 1288 . + 2 gene_id 707884.3.peg.1; exon 707884.3.peg.1.1
I ran the below command, but I got the exact same error:
featureCounts(files=c("file1.bam","file2.bam","file3.bam","file4.bam","file5.bam","file6.bam","file7.bam","file8.bam"),strandSpecific=2,isPairedEnd=TRUE,requireBothEndsMapped=TRUE,isGTFAnnotationFile=TRUE,GTF.attrType="gene_id",GTF.featureType="exon",useMetaFeatures=TRUE,annot.ext="/path/to/file/file.gtf")
Is there something fundamentally wrong that I'm doing here?
Thanks!
Ryan
Comment