Hello everyone,
I am using the STAR aligner, which reports discordant pair alignments in a separate BAM file. In my workflow, I want to identify differentially expressed genes, transcripts, and exons.
My question:
- Should I calculate feature counts using only the concordant pairs or include the discordant pairs for counting?
Note that I am examining samples from patients with leukaemia, where gene fusions are common events and have been detected in my samples using other methods. Between 100,000 and 500,000 pairs per sample (out of ~30 million) are reported as discordantly aligned.
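To put those numbers in perspective, here is a quick sketch of the share of pairs affected. The function name is only for illustration, and the total of 30 million is my approximate figure, not an exact count:

```python
def discordant_fraction(discordant_pairs: int, total_pairs: int) -> float:
    """Return the fraction of read pairs that were reported as discordant."""
    if total_pairs <= 0:
        raise ValueError("total_pairs must be positive")
    return discordant_pairs / total_pairs


# Approximate library size quoted above (an estimate, not an exact count).
TOTAL_PAIRS = 30_000_000

# Lower and upper bounds of the discordant pair counts quoted above.
for n_discordant in (100_000, 500_000):
    share = discordant_fraction(n_discordant, TOTAL_PAIRS)
    print(f"{n_discordant:,} discordant pairs = {share:.2%} of all pairs")
```

So the discordant pairs make up roughly 0.33% to 1.67% of each library. This is a library-wide figure only; it says nothing about how those pairs are distributed across individual genes.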
I am afraid that including discordant pairs will introduce a bias in calculating gene expression. Excluding them, on the other hand, could also introduce a bias. Any and all ideas are welcome! Thanks!