Hi all,
I ran samtools mpileup on 7 human exome-seq samples whose BAM files were generated with BWA. Since I used a for loop to process the samples, the outputs should be similar. However, for one of the samples the mpileup file contains only ~1000 lines, with a few lines per chromosome. The other samples' mpileup files look fine, with many more lines.
I checked the BAM file with samtools flagstat and found that very few reads (only 34) were properly paired. I wonder whether this is what causes the mpileup problem. More importantly, does it indicate that something went wrong with the library preparation in the sequencing experiment?
95045628 + 0 in total (QC-passed reads + QC-failed reads)
31755322 + 0 duplicates
95045628 + 0 mapped (100.00%:nan%)
95045628 + 0 paired in sequencing
47522831 + 0 read1
47522797 + 0 read2
34 + 0 properly paired (0.00%:nan%)
..
In contrast, almost all reads are properly paired in the other samples, e.g.:
120529538 + 0 in total (QC-passed reads + QC-failed reads)
27401894 + 0 duplicates
120529538 + 0 mapped (100.00%:nan%)
120529538 + 0 paired in sequencing
60469618 + 0 read1
60059920 + 0 read2
119251912 + 0 properly paired (98.94%:nan%)
(I removed unmapped reads from all BAM files, so do not be surprised that the mapping rate is 100%.)
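To make the comparison between samples concrete, here is a minimal sketch of how the properly paired fraction could be computed from flagstat text. It assumes the old-style flagstat layout shown above ("N + M label (…)"); the function names `parse_flagstat` and `properly_paired_fraction` are made up for this illustration and are not part of samtools.

```python
import re

def parse_flagstat(text):
    """Map each flagstat category label to its QC-passed read count."""
    counts = {}
    for line in text.splitlines():
        # Lines look like: "34 + 0 properly paired (0.00%:nan%)"
        m = re.match(r"\s*(\d+) \+ (\d+) (.+)", line)
        if not m:
            continue  # skip anything that is not a count line
        # Drop the trailing parenthetical (percentages or explanation)
        label = re.sub(r"\s*\(.*$", "", m.group(3)).strip()
        counts[label] = int(m.group(1))
    return counts

def properly_paired_fraction(text):
    """Fraction of paired reads that flagstat reports as properly paired."""
    counts = parse_flagstat(text)
    return counts["properly paired"] / counts["paired in sequencing"]

# Numbers copied from the two flagstat reports in this post
BAD_SAMPLE = """95045628 + 0 in total (QC-passed reads + QC-failed reads)
95045628 + 0 paired in sequencing
34 + 0 properly paired (0.00%:nan%)"""

GOOD_SAMPLE = """120529538 + 0 in total (QC-passed reads + QC-failed reads)
120529538 + 0 paired in sequencing
119251912 + 0 properly paired (98.94%:nan%)"""

if __name__ == "__main__":
    for name, report in [("problem sample", BAD_SAMPLE), ("other sample", GOOD_SAMPLE)]:
        print(f"{name}: {properly_paired_fraction(report):.6%} properly paired")
```

Running this over the flagstat output of all 7 samples would show at a glance whether the problem is limited to the one BAM file.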