Dear all,
I would like to have your opinion on how to deal with technical replicates in an mRNA-seq experiment for differential gene expression (DGE) analysis. I have seen multiple posts on this topic, but few of them discuss the biases that may be induced.

Here is my biological problem: I have to analyze paired-end mRNA-seq libraries in a case-control study and want to identify differentially expressed genes. In this experiment we have multiple biological replicates (these are fine for me), but for some samples (not all) we also have technical replicates. By technical, I mean that the libraries were prepared once but sequenced several times, either on multiple lanes of the same flowcell or on multiple flowcells. Moreover, all libraries were multiplexed.

Up to now, it has been decided to combine all technical replicates from the same sample into a single FASTQ file (actually one per read end) and then to map these files to the genome. But I suspect it would be good to check the sequencing depth of each technical replicate prior to merging, and maybe to discard the replicates with abnormally low depth. Indeed, I think that merging, for instance, one replicate of 5,000,000 reads with one replicate of 20,000,000 reads could lead to a biased composition in the final library (increasing the proportion of highly abundant transcripts), and that this may cause issues in the DGE analysis. Or should we keep all technical replicates separate until the counting step, or even after?
I would be very happy to have your feedback.
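To make the question concrete, here is a minimal Python sketch of the kind of per-replicate check described above. Everything in it is an assumption made for illustration: the gene names, the counts, the replicate labels (`lane1`, `lane2`), the helper functions, and the "half of the median depth" cutoff are all hypothetical, and the toy counts were chosen so that both replicates have the same composition. With real count tables, the last function would show how far each replicate's gene proportions sit from the merged ones.

```python
from statistics import median

# Hypothetical per-gene read counts for two technical replicates
# of the same library (toy numbers, not real data).
replicates = {
    "lane1": {"geneA": 500, "geneB": 3000, "geneC": 1500},
    "lane2": {"geneA": 2000, "geneB": 12000, "geneC": 6000},
}


def depth(counts):
    """Total number of counted reads in one replicate."""
    return sum(counts.values())


def proportions(counts):
    """Fraction of the replicate's reads assigned to each gene."""
    total = depth(counts)
    return {gene: n / total for gene, n in counts.items()}


def pool(reps):
    """Sum counts gene by gene across replicates (the merged library)."""
    pooled = {}
    for counts in reps.values():
        for gene, n in counts.items():
            pooled[gene] = pooled.get(gene, 0) + n
    return pooled


def flag_low_depth(reps, min_fraction=0.5):
    """Replicates whose depth is below min_fraction of the median depth.

    The 0.5 default is an arbitrary illustrative cutoff, not a recommendation.
    """
    depths = {name: depth(counts) for name, counts in reps.items()}
    cutoff = min_fraction * median(depths.values())
    return [name for name, d in depths.items() if d < cutoff]


def max_shift_vs_pooled(reps):
    """Largest absolute difference in gene proportion, replicate vs. merged."""
    pooled_prop = proportions(pool(reps))
    shifts = {}
    for name, counts in reps.items():
        prop = proportions(counts)
        shifts[name] = max(
            abs(prop.get(gene, 0.0) - p) for gene, p in pooled_prop.items()
        )
    return shifts


if __name__ == "__main__":
    print("depths:", {name: depth(c) for name, c in replicates.items()})
    print("low-depth replicates:", flag_low_depth(replicates))
    print("max proportion shift:", max_shift_vs_pooled(replicates))
```

The same comparison could be run on the output of any counting tool, as long as the technical replicates are counted separately before being summed.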
Thank you very much,
Claudia