Hi all,
I am trying to analyze differences in gene expression levels using RNA-seq. So far, I have aligned the reads to a bacterial reference genome using bowtie2. To count reads per gene, I am running HTSeq-count with the aligned reads in SAM format and an annotation file in GFF3 format from NCBI as input. However, HTSeq-count has trouble with this GFF3 file: it complains that gene_id could not be found. After writing a script to convert the GFF3 file to GTF, I get a different error: "Warning: Skipping read 'SNPSTER7_0752:6:120:19747:20850#0/1', because chromosome 'NC_011898.1', to which it has been aligned, did not appear in the GFF file." This happens for every read in my SAM file. Is there another tool that performs a similar task to HTSeq-count?
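In case it helps with diagnosing the second error, here is a rough sketch of a check that compares the reference names in the SAM file (column 3) with the sequence names in the annotation file (column 1). The function names and the sample lines are made up for illustration and are not taken from my actual files; it only assumes that both formats are tab-separated.

```python
def sam_reference_names(lines):
    """Collect reference names (column 3) from SAM alignment lines."""
    names = set()
    for line in lines:
        if line.startswith("@"):  # header lines, not alignments
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) > 2 and fields[2] != "*":  # "*" means unaligned
            names.add(fields[2])
    return names


def annotation_seqids(lines):
    """Collect sequence names (column 1) from GFF3/GTF feature lines."""
    names = set()
    for line in lines:
        if line.startswith("#") or not line.strip():  # comments and blanks
            continue
        names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def missing_from_annotation(sam_lines, gff_lines):
    """Reference names used by alignments but absent from the annotation."""
    return sam_reference_names(sam_lines) - annotation_seqids(gff_lines)


# Hypothetical sample lines, only to show how the check behaves.
sam_example = [
    "@SQ\tSN:NC_011898.1\tLN:100\n",
    "r1\t0\tNC_011898.1\t1\t42\t4M\t*\t0\t0\tACGT\tIIII\n",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII\n",
]
gff_example = [
    "##gff-version 3\n",
    "some_other_name\tRefSeq\tgene\t1\t50\t.\t+\t.\tID=gene0\n",
]
missing = missing_from_annotation(sam_example, gff_example)
```

With real files, the two helpers can be given open file handles instead of lists. If the result is not empty, the names in the two files do not match exactly; a conversion script that writes spaces instead of tabs would also show up here, because the whole line would then be read as the sequence name.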
Thank you!