![]() |
|
![]() |
||||
Thread | Thread Starter | Forum | Replies | Last Post |
Calculate depth of every base position | Lv Ray | Bioinformatics | 5 | 10-17-2014 11:04 PM |
base coverage per position on scaffolds | boetsie | Bioinformatics | 5 | 01-23-2014 05:44 AM |
base position | cmccabe | Bioinformatics | 1 | 01-15-2014 06:08 AM |
How to find out the base on a certain position | shuang | Bioinformatics | 8 | 09-09-2011 06:45 AM |
MAQ: finding the stop position of an alignment | scirocco | Bioinformatics | 0 | 12-28-2008 11:33 AM |
![]() |
|
Thread Tools |
![]() |
#1 |
Senior Member
Location: oceania Join Date: Feb 2014
Posts: 115
|
![]()
Hi All,
I have seen some other threads on this subject, sorry I don't have the links. But none of them seem to resolve the problem. I am trying to do something very simple but no tools for bam files seem to have the facility. I have tried samtools and GATK to no avail. All I need is to get the consensus read base at particular positions of the reference in a bam file. I am trying to check the state of SNPs from a particular bacterial SNP typing scheme in my NGS data. Not all of the positions will actually be a SNP, in which cae they have the same base as the reference, and I want that reported, so .vcf is no good to me. Greatly appreciate any suggestions. I suppose I can just use samtools to make the vcf and then if a position is not listed then assume it is the reference base. I do find samtools almost impenetrable.. would this line be appropriate? samtools mpileup -I -uBf ref.fasta 1_sorted.bam | bcftools view -bcg - > 1.bcf && bcftools view 1.bcf > 1.vcf but I don't understand the output vcf file because it seems to have a '.' for the reference base and then the proper A T C or G for the 'alt' base(s). Thanks for any help, S. |
![]() |
![]() |
![]() |
#2 |
Registered Vendor
Location: Eugene, OR Join Date: May 2013
Posts: 521
|
![]()
I do think the mpileup command is what you want. I am not sure what your pipeline is making, but it sounds like the pileup format (. or , for the reference base with ATCG with alternates).
You can output a vcf with genotype compared to the reference for each position: samtools mpileup -gu -f ref.fasta sorted.bam | bcftools call -c - > all.vcf The genotype will be 0/0 for homozygous reference, 0/1 for heterozygous with the alternate listed allele, and 1/1 for homozygous alternate. bcftools call -cv would output the variants only. You might want to use bcftools consensus to create a modified fasta file with your variants: http://www.htslib.org/doc/bcftools.html#consensus
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com |
![]() |
![]() |
![]() |
#3 |
Senior Member
Location: oceania Join Date: Feb 2014
Posts: 115
|
![]()
Thanks SNPsaurus, I will try that. But I've now found that the published SNP locations I have been using have mistakes so have to go back a step..
S. |
![]() |
![]() |
![]() |
#4 |
Senior Member
Location: San Diego Join Date: May 2008
Posts: 912
|
![]()
samtools mpileup should also take as a commend line entry a position, or a bed file of positions, so you don't need to generate a line for every base in the genome.
|
![]() |
![]() |
![]() |
Thread Tools | |
|
|