Hello,
I'm working with whole genome resequencing data for a population genomic study (300bp short-insert library and PE 150bp sequencing on a HiSeq X). I will map the reads to the reference genome of the species. I'm using Trimmomatic to remove adapters and low quality bases.
I haven't found any recommendations on the minimum length below which trimmed reads should be discarded. I was thinking of discarding reads that are shorter than either one third (50 bp) or one half (75 bp) of the expected read length (150 bp).
Do you have any advice on this?
In one example, I lose an additional 2% of the reads when keeping only reads of at least 75 bp, compared with keeping reads of at least 50 bp.
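For anyone who wants to reproduce this kind of comparison, here is a minimal Python sketch of how the fraction of reads surviving each cut-off could be counted from an already-trimmed FASTQ file. The function names (`read_lengths`, `fraction_kept`) and the file name in the usage note are hypothetical. It assumes a standard 4-line-per-record FASTQ and is not how Trimmomatic applies its own length filter.

```python
import gzip
from typing import Dict, Iterable, Iterator, Sequence


def read_lengths(path: str) -> Iterator[int]:
    """Yield the sequence length of each record in a 4-line-per-record FASTQ file."""
    # Assumption: files ending in .gz are gzip-compressed, anything else is plain text.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # the second line of each record holds the bases
                yield len(line.strip())


def fraction_kept(lengths: Iterable[int], cutoffs: Sequence[int]) -> Dict[int, float]:
    """Return, for each cut-off, the fraction of reads with length >= that cut-off."""
    total = 0
    kept = {c: 0 for c in cutoffs}
    for n in lengths:
        total += 1
        for c in cutoffs:
            if n >= c:
                kept[c] += 1
    if total == 0:
        return {c: 0.0 for c in cutoffs}
    return {c: kept[c] / total for c in cutoffs}
```

Calling `fraction_kept(read_lengths("sample_R1.trimmed.fastq.gz"), [50, 75])` on a file trimmed without a length filter would give the two retained fractions, and their difference is the extra loss from the stricter cut-off. Note that this counts single reads only. With paired-end data, a pair can also be lost when just one mate falls below the cut-off.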
I have attached an example of the FastQC results after running Trimmomatic with the two cut-offs.
Thanks,
Marie