I have heard that it is important for downstream analyses to retain unmapped reads. I am interested to know the reason for this recommendation.
Specifically, I am using BWA + GATK to call SNPs from Illumina data. It is not clear to me if the GATK SNP calling pipeline ever utilizes unmapped reads. We expect a large proportion of unmapped reads, so we could save a lot of disk space by getting rid of them.