Hi,
I am using mpileup (samtools v0.1.16 (r963:234)) with the -r option to restrict the pileup to a specific region. In doing so, I noticed that the number of reads reported at a position of interest changes as the specified region gets larger.
For example, for the bases and qualities at position 119121582 on chromosome 12, I get the following for
a single position:
> 8007 reads covering that site
a region of 10 bases:
> 7584 reads covering that site
a region of 1000 bases:
> 6661 reads covering that site
This also means that when I run mpileup on the whole BAM file without the -r option, 6661 reads are reported at that site, although 8007 reads seem to cover it.
I would have assumed that, whatever the size of the region I choose around a site, the pileup would always report the same number of reads covering that site. Could anyone explain to me why this is not the case?
These are the commands I ran for each region size, for
a single position:
Code:
samtools mpileup -f reference-genome.fa -r chr12:119121582-119121582 input.bam | grep 119121582 | awk '{print $4}'
a region of 10 bases:
Code:
samtools mpileup -f reference-genome.fa -r chr12:119121580-119121590 input.bam | grep 119121582 | awk '{print $4}'
a region of 1000 bases:
Code:
samtools mpileup -f reference-genome.fa -r chr12:119121000-119122000 input.bam | grep 119121582 | awk '{print $4}'
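In case it helps anyone repeat the comparison, the three commands can be wrapped in a small shell sketch. The function names depth_at and depth_in_region are made up for illustration and are not part of samtools. The sketch assumes samtools is on the PATH, and it matches the position column exactly with awk rather than using grep, so a stray match elsewhere on a line cannot be picked up by accident.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are hypothetical, not samtools features.

# Read pileup text on stdin and print column 4 (read depth) for the
# line whose position (column 2) equals exactly the value given in $1.
depth_at() {
    awk -F'\t' -v pos="$1" '$2 == pos { print $4 }'
}

# Run mpileup on chrom:start-end and print the depth reported at pos.
# Arguments: reference FASTA, BAM file, chromosome, start, end, position.
depth_in_region() {
    samtools mpileup -f "$1" -r "$3:$4-$5" "$2" | depth_at "$6"
}
```

For the 1000-base case this would be called as: depth_in_region reference-genome.fa input.bam chr12 119121000 119122000 119121582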
Thanks a lot in advance