Hi all,
Until now, I have been using HTSeq-count to get gene expression levels for my RNA-seq data, and it has worked well for me.
However, in several cases I noticed that some genes have a read count of 0 even though IGV shows lots of reads mapped to them. I know this could be due to mis-annotation in the GTF file.
I am planning to write my own Perl script to count gene expression levels from a SAM file, and to compare the results between my counts and HTSeq-count.
However, I am having trouble determining the end mapping position of a read in the SAM file.
This information is in the CIGAR column, right?
So, end = start (POS) + (total length of M) + (total length of D) + (total length of N). Am I right?
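For illustration, here is a rough sketch of that calculation, written in Python rather than Perl. It assumes the SAM specification's rules: the CIGAR operations that consume reference bases are M, D, N, = and X, and POS is 1-based, so the inclusive end is POS plus the summed lengths of those operations, minus 1. The function name `alignment_end` is hypothetical, not part of any library.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
# I, S, H and P do not advance the position on the reference.
REF_CONSUMING = set("MDN=X")
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def alignment_end(pos, cigar):
    """Return the 1-based, inclusive end coordinate of an alignment.

    pos   -- 1-based leftmost mapping position (SAM column 4)
    cigar -- CIGAR string (SAM column 6); "*" means it is unavailable
    """
    if cigar == "*":
        raise ValueError("CIGAR string is unavailable for this record")
    ref_len = sum(
        int(length)
        for length, op in CIGAR_RE.findall(cigar)
        if op in REF_CONSUMING
    )
    # POS itself is the first aligned reference base, hence the -1.
    return pos + ref_len - 1


if __name__ == "__main__":
    print(alignment_end(100, "10M"))                     # 109
    print(alignment_end(100, "5S20M1000N30M2I10M3D5M"))  # 1167
```

In the second example, the soft clip (5S) and the insertion (2I) are skipped, so the reference span is 20 + 1000 + 30 + 10 + 3 + 5 = 1068 bases. The same logic should carry over to Perl with a global regex match over the CIGAR column.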
Many thanks.
Best,
Jerry