Hi All,
I am trying to generate count data from BAM files using HTSeq. Our data were produced with an Illumina stranded library and sequenced on a HiSeq 3000. After mapping with HISAT2, I know that there are about 37 million aligned read pairs. However, when I generate count data with HTSeq, it reports that it has processed ~78 million reads, and the total count across all features comes to around 64 million, which I feel is not correct. If we have about 37 million aligned read pairs, how can we get nearly double that number of counts? I used the -s reverse option because the library was prepared with the Illumina stranded protocol. I cannot work out where things are going wrong. It would be great if anyone could suggest a probable reason for this.
Best regards, Amit