Dear community,
I have a Mus musculus paired-end RNA-Seq disease sample with 75 bp reads. I mapped it to the Mus musculus genome (NCBI Build 37.2) using TopHat and found that the percentage of reads with at least one mapping is extremely low: 25%.
By contrast, the control sample has a mapping rate of 80%.
I have checked for possible contamination by mapping the samples against the human, mycoplasma, and E. coli genomes, and I am fairly sure contamination can be ruled out, because fewer than 1% of reads map to any of these genomes.
My question: what are the common causes of an extremely low mapping percentage? Erroneous base-calling? A high error rate in the sequencing steps? Is TopHat not good enough? Or can I treat this as a new finding, subject to further validation?
Tell me if you need more information. =)
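For clarity on what I mean by "percentage of reads with at least one mapping", here is a minimal sketch of that calculation. It assumes the alignments are available as SAM text and that the total number of input reads is known separately (for example, counted from the FASTQ files). The function name `mapped_fraction` is hypothetical and not part of TopHat; the flag bits used (0x4 unmapped, 0x40 first mate, 0x80 second mate) are from the SAM format.

```python
def mapped_fraction(sam_lines, total_reads):
    """Return the fraction of input reads with at least one alignment.

    sam_lines   -- iterable of SAM text lines (header lines are skipped)
    total_reads -- total number of input reads; for paired-end data each
                   mate counts as one read (assumption of this sketch)
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")

    mapped = set()
    for line in sam_lines:
        # Skip blank lines and SAM header lines, which start with '@'.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        # Bit 0x4 marks a record for an unmapped read.
        if flag & 0x4:
            continue
        # Tell the two mates of a pair apart; 0 means single-end.
        mate = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        # A set makes multi-mapped reads count only once.
        mapped.add((name, mate))

    return len(mapped) / total_reads


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.0\tSO:coordinate",
        "r1\t99\tchr1\t100\t50\t75M\t=\t300\t275\t*\t*",
        "r1\t147\tchr1\t300\t50\t75M\t=\t100\t-275\t*\t*",
        "r1\t355\tchr2\t500\t1\t75M\t=\t700\t275\t*\t*",  # secondary hit of mate 1
        "r2\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",              # unmapped
    ]
    # Two pairs = four input reads; both mates of r1 are mapped.
    print(f"{mapped_fraction(example, total_reads=4):.0%}")
```

Counting distinct (read name, mate) pairs rather than alignment records matters here, because a multi-mapped read produces several records and would otherwise inflate the percentage.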