SEQanswers

Old 07-01-2013, 07:01 AM   #1
volavii
Member
 
Location: GERMANY

Join Date: Apr 2013
Posts: 16
How do I find a specific sequence in a BAM file?

I want to extract specific sequences from my BAM file (reads reference-mapped with BWA).

Normally I do that with BLAST or BLAT, but this time I have a BAM file, not a ready-to-use genome...
Do I have to assemble the mapped reads into a consensus sequence beforehand, or is it possible to (1) identify the respective scaffold in the reference genome with BLAT, (2) assemble the reads that mapped to this scaffold, and then (3) align my sequence to that assembled scaffold? Or is this idea totally stupid? :/

I have never done an assembly so far. I am really unsure what the right way is here...

Which tool would you suggest for assembling a BAM file when dealing with genomes of >2 GB? And how should I handle heterozygous positions?


Many thanks in advance,
I hope someone can give me some help here.
Old 07-01-2013, 07:33 AM   #2
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

It is hard for me to understand what you need. It appears that you already have reference-mapped data, so de novo assembly does not seem to be required. If instead you are asking either (a) how to extract reads from a certain region or (b) how to call SNPs/indels, then you should look at the samtools 'mpileup' command. See http://samtools.sourceforge.net/mpileup.shtml for a starting place on mpileup.
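
To make option (a) concrete, here is a minimal Python sketch of what "reads of a certain region" means at the SAM level. It is only an illustration: the function names, the scaffold name `scaffold_7` and the coordinates are made up for the example. On an indexed BAM file, `samtools view in.bam scaffold:start-end` should give the same kind of selection directly.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def reference_span(pos, cigar):
    """Return the 1-based inclusive (start, end) a read covers on the reference."""
    # Only M, D, N, = and X consume reference bases (SAM specification).
    length = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
    return pos, pos + length - 1

def reads_in_region(sam_lines, scaffold, start, end):
    """Yield SAM records (as field lists) overlapping scaffold:start-end."""
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos, cigar = int(fields[1]), fields[2], int(fields[3]), fields[5]
        # Skip unmapped reads (flag 0x4), other scaffolds, and missing CIGARs.
        if flag & 0x4 or rname != scaffold or cigar == "*":
            continue
        read_start, read_end = reference_span(pos, cigar)
        if read_start <= end and read_end >= start:
            yield fields

# Hypothetical records, for illustration only.
example = [
    "@SQ\tSN:scaffold_7\tLN:5000",
    "r1\t0\tscaffold_7\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r2\t0\tscaffold_7\t300\t60\t5M2D5M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
    "r3\t16\tscaffold_9\t100\t60\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]
hits = [f[0] for f in reads_in_region(example, "scaffold_7", 105, 200)]
```

Here only `r1` overlaps the requested window; `r2` lies further downstream and `r3` is on a different scaffold.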
Old 07-01-2013, 07:44 AM   #3
amarth
Member
 
Location: Mexico City

Join Date: Dec 2012
Posts: 14

Have you tried converting your BAM file to SAM or FASTA?

On the command line, use:

Quote:
$ samtools view -h -o out.sam in.bam
The SAM file lists the reads mapped to each scaffold. From it you can retrieve the reads for the scaffold you are interested in and then assemble them.
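
As a rough sketch of the "retrieve your reads" step, the Python below pulls the reads of one scaffold out of SAM text and writes them as FASTA, ready for an assembler. The function name and the example records are assumptions made for illustration. One detail matters here: SAM stores reads that mapped to the reverse strand (flag 0x10) reverse-complemented, so the sketch turns them back into their original orientation.

```python
# Translation table for complementing DNA; N stays N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")

def sam_to_fasta(sam_lines, scaffold=None):
    """Return FASTA text for the reads in sam_lines.

    If scaffold is given, keep only reads whose RNAME equals it.
    """
    records = []
    for line in sam_lines:
        if line.startswith("@"):              # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, rname, seq = fields[0], int(fields[1]), fields[2], fields[9]
        if scaffold is not None and rname != scaffold:
            continue
        if seq == "*":                        # sequence not stored
            continue
        if flag & 0x10:                       # read mapped to the reverse strand
            seq = seq.translate(COMPLEMENT)[::-1]
        records.append(f">{name}\n{seq}\n")
    return "".join(records)

# Hypothetical records, for illustration only.
example = [
    "@SQ\tSN:scaffold_7\tLN:5000",
    "r1\t0\tscaffold_7\t100\t60\t4M\t*\t0\t0\tAACG\tIIII",
    "r2\t16\tscaffold_7\t120\t60\t4M\t*\t0\t0\tAACG\tIIII",
    "r3\t0\tscaffold_9\t50\t60\t4M\t*\t0\t0\tTTTT\tIIII",
]
fasta = sam_to_fasta(example, scaffold="scaffold_7")
```

With paired-end data both mates share one read name, so you may want to add a suffix based on flags 0x40 and 0x80 before assembling.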
Old 07-02-2013, 12:52 AM   #4
volavii
Member
 
Location: GERMANY

Join Date: Apr 2013
Posts: 16

First of all, thanks for answering!

Sorry for my imprecise question.

What I have done so far is a reference mapping of 5 genomes. I used BWA for that. Everything worked well.

Now I want to get the sequences of ~25 genes from these 5 individuals, and I am not really sure how to do it.
I have no experience of how much time it takes to assemble the reads of these 5 individuals into a consensus sequence... so I thought: maybe it is enough to just assemble the parts of each individual's genome in which the genes lie...

@amarth: which program do you suggest for the assembly? And how should I handle heterozygous sites? Should I use ambiguity codes in the final gene sequence, or "Ns"?
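
To show what the two options would look like in practice, here is a small Python sketch that turns per-site base counts into a consensus character: an IUPAC ambiguity code where two alleles are both well supported, and "N" where coverage is too low or the site is unclear. The function name and the thresholds `min_depth` and `min_fraction` are arbitrary assumptions for illustration, not recommended values, and this is not how any particular variant caller works.

```python
# IUPAC codes for single bases and for two-allele (heterozygous) sites.
IUPAC = {
    frozenset("A"): "A", frozenset("C"): "C",
    frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
}

def call_site(counts, min_depth=4, min_fraction=0.25):
    """Return a consensus character for one site.

    counts maps a base ("A", "C", "G", "T") to the number of reads showing it.
    min_depth and min_fraction are illustrative assumptions only.
    """
    depth = sum(counts.values())
    if depth < min_depth:                     # too little coverage to decide
        return "N"
    # Keep every base seen in at least min_fraction of the reads.
    alleles = frozenset(b for b, n in counts.items() if n / depth >= min_fraction)
    # One allele: plain base. Two: ambiguity code. Anything else: N.
    return IUPAC.get(alleles, "N")

het = call_site({"A": 10, "G": 9})            # two well-supported alleles
hom = call_site({"C": 20, "T": 1})            # the single T is treated as noise
low = call_site({"A": 1, "G": 1})             # below min_depth
```

Ambiguity codes keep the information that a site is heterozygous, while "N" throws it away; which is better depends on what the downstream analysis can handle.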

Thanks!

Tags
assembly, bam, heterozygote, mapping, phasing
