Dear All,
These are three different pipelines I've used to call SNPs on the same samples:
1) Raw FASTQ file --> Direct mapping to reference --> 1,200 SNPs discovered
2) Raw FASTQ file --> De novo assembly --> Extract paired contigs --> Map paired contigs to reference --> 1,089 SNPs discovered
3) Raw FASTQ file --> De novo assembly --> Extract all contigs --> Map contigs to reference --> 1,383 SNPs discovered
I know from lab data that the consensus sequence produced by method 2 is the closest to the true consensus sequence of the bacterium that was sequenced. Does this make method 2 (and its associated mapped reads) more reliable for SNP calling?
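The totals alone don't show whether the three pipelines are calling the same sites, so it may help to compare the call sets position by position. Below is a minimal Python sketch of that comparison. It assumes each pipeline's calls can be exported as VCF-style tab-separated lines (CHROM, POS, ID, REF, ALT, ...); the function names and pipeline labels are hypothetical and are not part of the pipelines above.

```python
from typing import Dict, Iterable, Set, Tuple

# A SNP is identified by (chromosome, position, reference base, alternate base).
Snp = Tuple[str, int, str, str]


def parse_vcf_lines(lines: Iterable[str]) -> Set[Snp]:
    """Collect single-base substitutions from VCF-style tab-separated lines.

    Assumes the standard column order CHROM, POS, ID, REF, ALT.
    Header lines (starting with '#') and non-SNP records are skipped.
    """
    snps: Set[Snp] = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        chrom, pos, ref, alt = fields[0], int(fields[1]), fields[3], fields[4]
        # Keep only simple one-base substitutions (no indels, no multi-allelic sites).
        if len(ref) == 1 and len(alt) == 1:
            snps.add((chrom, pos, ref, alt))
    return snps


def compare_call_sets(call_sets: Dict[str, Set[Snp]]) -> Dict[str, Set[Snp]]:
    """Return the SNPs shared by every pipeline and those unique to each one."""
    result: Dict[str, Set[Snp]] = {
        "shared_by_all": set.intersection(*call_sets.values())
    }
    for name, snps in call_sets.items():
        # Union of every other pipeline's calls.
        others = set().union(*(s for n, s in call_sets.items() if n != name))
        result[f"only_{name}"] = snps - others
    return result
```

Calling `compare_call_sets({"direct": ..., "paired_contigs": ..., "all_contigs": ...})` with the three parsed sets would show how many of the 1,089 method 2 SNPs are also found by methods 1 and 3, and which calls are specific to one pipeline and so perhaps worth checking by hand.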
I'm very keen to hear as many views as possible on this, along with your reasons why or why not.
Many thanks in advance
lg36