I am having trouble with the samtools pileup command and would appreciate any advice. I'm probably doing something wrong, but I have read the documentation and can't figure out what. I originally started with a SAM file produced by BWA, but that wasn't working, so I switched to a SAM file generated by running bowtie2sam.pl on a bowtie output file.
I used the following command to create the unsorted bam file:
samtools view -buSt chrM.fa -o file.bam file.sam
And then the following to sort the file:
samtools sort file.bam file.bam.sorted
Then I tried a variety of things to produce the pileup file.
samtools pileup -f chrM.fa file.bam.sorted.bam
--> no output, no error
samtools pileup -t chrM.fa.fai file.bam.sorted.bam
samtools pileup -f chrM.fa -t chrM.fa.fai file.bam.sorted.bam
-->
[sam_header_read2] 1 sequences loaded.
[sam_read1] reference '$(((((((((((((((((((((((((NMCX1CMDZ7N0N27' is recognized as '*'.
[sam_read1] reference '(((((((((((((((((((((((((((((NMCX1CMDZ7N0N27' is recognized as '*'.
Parse error at line 2: invalid CIGAR character
Abort trap
samtools pileup -f chrM.fa -S file.sam
-->
[samopen] no @SQ lines in the header.
[sam_read1] missing header? Abort!
samtools pileup -f chrM.fa file.sam
-->
[bam_pileup] fail to read the header: non-exisiting file or wrong format.
The only way I could get pileup output was as follows:
samtools pileup -t chrM.fa.fai file.sam
samtools pileup -t chrM.fa.fai -S file.sam
--> note there is no -S flag in the first one and yet it produced correct output
I'd like to be able to run pileup on the BAM file, because I assume it will be much faster on large files. Can I do that?
I was also told that it's better to use -f than -t. Is that true, or does it not matter?
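In case it helps anyone reproduce this, here is a rough sketch of how I would check whether the SAM file has any @SQ header lines, since that is what the "[samopen] no @SQ lines in the header" message complains about. It only uses grep, not samtools, and the function name count_sq_lines is something I made up for illustration.

```shell
# Hypothetical helper (not part of samtools): count the @SQ header lines
# in a SAM file passed as the first argument.
count_sq_lines() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # the count is 0, so '|| true' keeps that from looking like a failure.
    grep -c '^@SQ' "$1" || true
}

# Example call, assuming file.sam is in the current directory:
# count_sq_lines file.sam
```

If this prints 0, the file has no header, which I think would explain why only the runs that supplied the reference list with -t chrM.fa.fai produced output, but that is my guess rather than something I have confirmed.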