Removing non-overlapping paired-end reads is the default option in MG-RAST. Has anyone compared the results of (1) removing and (2) retaining such reads?
For one of our metagenomes, removing the non-overlapping paired-end reads gives a 4.5 GB file (average read length 169.321 bp), whereas retaining them gives a 28.2 GB file (average read length 108.322 bp). Does such a large difference point to a problem with the sequencing? In other words, why do so few of the paired-end reads overlap?
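To put a rough number on "so few", here is a back-of-the-envelope sketch in Python that turns the two file sizes and average read lengths above into an approximate share of pairs that overlapped. It rests on several assumptions of mine that I have not verified against MG-RAST: that both files are FASTQ, that 1 GB means 10^9 bytes, that each header line is about 60 bytes (the `header_bytes` default is a guess), and that every non-overlapping pair is kept as two separate reads. The function names are made up for illustration.

```python
def estimate_read_count(file_bytes, mean_read_len, header_bytes=60):
    """Rough number of FASTQ records in a file of the given size.

    Assumes one record = header line + sequence + "+" line + quality
    string, plus four newlines. header_bytes is a guess, not a measured value.
    """
    bytes_per_record = header_bytes + 2 * mean_read_len + 1 + 4
    return file_bytes / bytes_per_record


def merged_pair_fraction(merged_reads, all_reads):
    """Share of original pairs that overlapped and were joined.

    Assumes each non-overlapping pair contributes two separate reads
    to the 'retained' file, and each overlapping pair contributes one.
    """
    unmerged_pairs = (all_reads - merged_reads) / 2
    return merged_reads / (merged_reads + unmerged_pairs)


# Figures from the post; file sizes converted assuming 1 GB = 1e9 bytes.
reads_merged_only = estimate_read_count(4.5e9, 169.321)
reads_all_retained = estimate_read_count(28.2e9, 108.322)
fraction = merged_pair_fraction(reads_merged_only, reads_all_retained)

print(f"approx. joined reads:      {reads_merged_only:,.0f}")
print(f"approx. reads if retained: {reads_all_retained:,.0f}")
print(f"approx. share of pairs that overlapped: {fraction:.1%}")
```

The result moves with the header-length guess and with how the retained reads are actually written out, so I would treat it only as an order-of-magnitude check. Counting records in the two files directly would be more reliable.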