Hello,
I'm having some issues with mpileup and am hoping someone can shed some light on what I'm doing wrong. I used BWA to map paired-end Illumina reads to a FASTA reference containing about 20 individual transcripts from a closely related species, and produced the SAM, BAM, and sorted BAM files without trouble. Using samtools tview, I can look at the mapped reads for every gene in my reference file, and I can clearly see SNPs (with fairly good coverage).
However, when I run:
samtools mpileup ref.fasta BWAout.sorted.bam > BWAout.mpileup
the resulting mpileup file has only 45 lines, all of which reference a single gene in the reference. The output is below. I find this very confusing because, as mentioned, reads mapped to each of my reference genes and the SNPs are easy to see on visual inspection.
Any ideas? Thanks in advance.
gi|70778973|ref|NM_001025313.1| 1990 G 1 ^]. @
gi|70778973|ref|NM_001025313.1| 1991 T 2 .^], @@
gi|70778973|ref|NM_001025313.1| 1992 A 2 ., <C
gi|70778973|ref|NM_001025313.1| 1993 G 2 ., DE
gi|70778973|ref|NM_001025313.1| 1994 A 2 ., DC
gi|70778973|ref|NM_001025313.1| 1995 C 2 ., DC
gi|70778973|ref|NM_001025313.1| 1996 C 2 ., :A
gi|70778973|ref|NM_001025313.1| 1997 A 2 ., BC
gi|70778973|ref|NM_001025313.1| 1998 G 2 ., ?@
gi|70778973|ref|NM_001025313.1| 1999 G 2 ., D5
gi|70778973|ref|NM_001025313.1| 2000 C 2 ., D8
gi|70778973|ref|NM_001025313.1| 2001 T 2 ., B(
gi|70778973|ref|NM_001025313.1| 2002 G 2 ., A;
gi|70778973|ref|NM_001025313.1| 2003 G 2 ., 8:
gi|70778973|ref|NM_001025313.1| 2004 C 2 ., 1D
gi|70778973|ref|NM_001025313.1| 2005 C 2 ., A3
gi|70778973|ref|NM_001025313.1| 2006 T 2 ., 3B
gi|70778973|ref|NM_001025313.1| 2007 C 2 ., C?
gi|70778973|ref|NM_001025313.1| 2008 G 2 ., <D
gi|70778973|ref|NM_001025313.1| 2009 A 2 ., E:
gi|70778973|ref|NM_001025313.1| 2010 A 2 ., B*
gi|70778973|ref|NM_001025313.1| 2011 C 2 ., ED
gi|70778973|ref|NM_001025313.1| 2012 T 2 ., G9
gi|70778973|ref|NM_001025313.1| 2013 C 2 ., GC
gi|70778973|ref|NM_001025313.1| 2014 A 2 ., 9C
gi|70778973|ref|NM_001025313.1| 2015 G 2 ., 9D
gi|70778973|ref|NM_001025313.1| 2016 A 2 ., 9<
gi|70778973|ref|NM_001025313.1| 2017 A 2 ., :C
gi|70778973|ref|NM_001025313.1| 2018 A 2 ., C0
gi|70778973|ref|NM_001025313.1| 2019 T 2 ., F:
gi|70778973|ref|NM_001025313.1| 2020 C 2 ., DE
gi|70778973|ref|NM_001025313.1| 2021 C 2 ., BE
gi|70778973|ref|NM_001025313.1| 2022 G 2 ., E<
gi|70778973|ref|NM_001025313.1| 2023 C 2 ., ?C
gi|70778973|ref|NM_001025313.1| 2024 C 2 ., AD
gi|70778973|ref|NM_001025313.1| 2025 T 2 ., FD
gi|70778973|ref|NM_001025313.1| 2026 G 2 ., G?
gi|70778973|ref|NM_001025313.1| 2027 C 2 ., 9D
gi|70778973|ref|NM_001025313.1| 2028 C 2 ., DD
gi|70778973|ref|NM_001025313.1| 2029 T 2 .$, >>
gi|70778973|ref|NM_001025313.1| 2030 C 1 , D
gi|70778973|ref|NM_001025313.1| 2031 T 1 , D
gi|70778973|ref|NM_001025313.1| 2032 G 1 , ?
gi|70778973|ref|NM_001025313.1| 2033 C 1 , ?
gi|70778973|ref|NM_001025313.1| 2034 C 1 ,$ ?
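A quick way to confirm which reference sequences actually appear in a pileup like the one above is to tally the lines per sequence name. The following is only a minimal sketch, assuming the usual whitespace-separated pileup layout (sequence name, position, reference base, depth, read bases, qualities); the function name `summarize_pileup` is hypothetical and not part of samtools.

```python
from collections import Counter


def summarize_pileup(lines):
    """Return {sequence_name: (line_count, max_depth)} for pileup text lines.

    Assumes the first four whitespace-separated fields are
    name, position, reference base, and depth.
    """
    counts = Counter()
    max_depth = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            # Skip blank or truncated lines rather than guessing their content.
            continue
        name, depth = fields[0], int(fields[3])
        counts[name] += 1
        max_depth[name] = max(max_depth.get(name, 0), depth)
    return {name: (counts[name], max_depth[name]) for name in counts}


# Three lines copied from the output in this post, used as sample input.
sample = [
    "gi|70778973|ref|NM_001025313.1| 1990 G 1 ^]. @",
    "gi|70778973|ref|NM_001025313.1| 1991 T 2 .^], @@",
    "gi|70778973|ref|NM_001025313.1| 2034 C 1 ,$ ?",
]

for name, (n_lines, depth) in summarize_pileup(sample).items():
    print(name, n_lines, depth)
```

To run it on the full file, the same function could be given an open file handle, for example `summarize_pileup(open("BWAout.mpileup"))`. If only one sequence name comes back, that matches what is seen by eye here.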