SEQanswers

Old 06-19-2012, 10:30 AM   #1
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default Alignment tool for long reads?

Hi,

I'm just starting to work with some long reads from a PacBio sequencer (>1 kbp), and I see that my usual alignment tools such as MEGA, DNASTAR, bowtie, and bwa are all restricted to shorter sequences (<500 bp). Does anybody have good experience with alignment tools that can handle reads longer than 1 kbp, up to about 2.5 kbp?

TiA, Nash
Old 06-19-2012, 03:21 PM   #2
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default long reads

Maybe BWA-SW.
Old 06-19-2012, 03:57 PM   #3
SeekAnswers
Member
 
Location: USA

Join Date: Mar 2012
Posts: 21
Default

I used BLAT for reference-based scaffolding by aligning contigs to scaffolds, so I assume it should also work with long reads.
Old 06-19-2012, 05:56 PM   #4
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default Blat

Agreed, BLAT is another good option.
Old 06-19-2012, 06:33 PM   #5
pacbio
Member
 
Location: Menlo Park, CA

Join Date: Sep 2011
Posts: 82
Default

Hi All

We have actually developed a fast and accurate aligner named BLASR (Basic Local Alignment with Successive Refinement) - http://www.smrtcommunity.com/SMRT-An...gorithms/BLASR to align our long reads. The source code for this as well as the full analysis software suite is freely available at the same PacBio DevNet site. A publication on this algorithm is also currently in review, so stay tuned.
Old 06-19-2012, 07:57 PM   #6
adaptivegenome
Super Moderator
 
Location: US

Join Date: Nov 2009
Posts: 437
Default

Thanks for posting. I look forward to the publication as I am also trying to map PacBio reads.
Old 06-20-2012, 10:49 AM   #7
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default Re: Alignment tool for long reads

Thank you all for suggesting blat and bwa sw for aligning long reads. I am looking at the docs for bwa sw and it looks like in the command:

bwa bwasw database.fasta long_read.fastq >aln.sam

the parameter "database.fasta" is the reference and the parameter "long_read.fastq" is the sequence being aligned. Right?

So does it absolutely need a FASTQ file, or can it work without the quality data, i.e., with just a *.fasta file? Also, how about the CCS-based output from PacBio? Has anybody tried the PacBio CCS outputs? I'm trying to get the "blasr" tool from the PacBio pipeline installed here, but I am not there yet...

Thanks in advance,

Nash
Old 06-20-2012, 10:58 AM   #8
mchaisso
Member
 
Location: Seattle, WA

Join Date: Apr 2008
Posts: 84
Default

Hi Nash,
You can install blasr on your own using github (https://github.com/PacificBiosciences/blasr).

If you have the hdf files, there are options (-useccsdenovo) to align the ccs sequences instead of the raw subreads.

HTH,
-mark


Old 06-20-2012, 11:07 AM   #9
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default

Thank you, Mark. I don't have access to the hdf5 files yet; they are with the core sequencing facility, and I am not sure they will give me those right now. But I am working with them to gradually get some of the pipeline tools installed locally on my new Ubuntu machine, which still needs memory upgrades before I can run your pipeline tools...

Yeah, I hope to run blasr soon but, in the meantime, I am trying to learn some of these long read tools that I haven't worked with before. Do you know if you have to have the fastq files for bwa sw?

Nash
Old 06-20-2012, 11:50 AM   #10
mchaisso
Member
 
Location: Seattle, WA

Join Date: Apr 2008
Posts: 84
Default

bwa sw aligns fasta sequences.

You will want the bas.h5 files since they have additional information about subread coordinates.
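
Since bwasw accepts plain FASTA, the quality values in a FASTQ file can simply be dropped before alignment. A minimal sketch, assuming strict four-line FASTQ records (no wrapped sequence lines); the function name fastq_to_fasta and the sample read are hypothetical, and the bwa commands are left as comments because they need bwa and a real reference:

```shell
#!/bin/sh
# Hypothetical helper: convert FASTQ to FASTA by dropping the quality lines.
# Assumes every record is exactly 4 lines: @name, sequence, +, qualities.
fastq_to_fasta() {
    awk 'NR % 4 == 1 { print ">" substr($0, 2) }   # "@name" header becomes ">name"
         NR % 4 == 2 { print }                      # sequence line is kept unchanged
        ' "$1"
}

# Tiny made-up read, only to show the conversion (file names are placeholders).
printf '@read1\nACGTACGT\n+\nIIIIIIII\n' > long_read.fastq
fastq_to_fasta long_read.fastq > long_read.fasta

# With bwa installed and a reference available, the alignment would then be:
#   bwa index database.fasta
#   bwa bwasw database.fasta long_read.fasta > aln.sam
```

The reference has to be indexed once with bwa index before bwasw can use it; the reads file is the second argument, as in the command quoted earlier in the thread.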
Old 06-21-2012, 07:23 AM   #11
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default blasr compilation

Mark,

I am trying to compile blasr on my machine, and some header files are missing from the tar file distribution. Can you please point me to someone who can help, or provide the *.h files needed? Thanks much,

Nash
Old 06-21-2012, 10:42 AM   #12
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381
Default

A quicker, refined flavour of BLAT is BFAST: http://www.plosone.org/article/info:...l.pone.0007767
Old 06-21-2012, 12:35 PM   #13
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default PacBio "blasr" questions....

Perhaps I should really start a new thread, but does anybody on this forum have good experience with blasr alignments and could discuss the various run options and the several output formats that are available? I have just started playing with some blasr runs, and I have some pointed questions to ask. I would also welcome pointers to detailed docs that explain all the options and outputs.

Any help available in this forum?

Thanks much in advance for any pointers,

Nash
Old 06-21-2012, 01:48 PM   #14
mchaisso
Member
 
Location: Seattle, WA

Join Date: Apr 2008
Posts: 84
Default

You could say I'm pretty familiar with blasr output (I'm the author).

Most of the help may be found by running blasr -h, or blasr -help for detailed help. There are many output formats including tabular ones for which you can get column labels with the -header option, human readable output (-m 0), and sam (specified by -sam).

-mark
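
Once alignments are written in SAM format (the -sam option mentioned above), a quick sanity check is to count mapped versus unmapped records. This is only a sketch, assuming the output follows the standard SAM layout (header lines start with "@", column 2 is the FLAG, and bit 0x4 marks an unmapped read); the function name sam_mapped_counts and the sample records are made up for illustration:

```shell
#!/bin/sh
# Hypothetical helper: count mapped and unmapped records in a SAM file.
sam_mapped_counts() {
    awk -F '\t' '
        /^@/ { next }                                   # skip header lines
        { if (int($2 / 4) % 2 == 1) unmapped++          # FLAG bit 0x4 set: unmapped
          else mapped++ }
        END { printf "mapped=%d unmapped=%d\n", mapped, unmapped }
    ' "$1"
}

# Made-up SAM content: one header line, two mapped reads, one unmapped read.
printf '@SQ\tSN:ref\tLN:100\n' > example.sam
printf 'r1\t0\tref\t1\t60\t4M\t*\t0\t0\tACGT\t*\n' >> example.sam
printf 'r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\t*\n' >> example.sam
printf 'r3\t16\tref\t5\t60\t4M\t*\t0\t0\tACGT\t*\n' >> example.sam

sam_mapped_counts example.sam
```

The integer arithmetic on the FLAG column avoids awk's bitwise functions, which are not available in every awk implementation.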
Old 06-22-2012, 05:47 AM   #15
naragam
Member
 
Location: Durham, NC

Join Date: Apr 2012
Posts: 21
Default blasr output

Mark,

That's great to know. I have printed out the help pages, but there are still unanswered questions for me. Would you like to take this discussion offline, or do you want me to post the questions right here? If there's a special PacBio support site for blasr, I can reach you through that. Please let me know what is convenient for you. Thanks much,

Nash
Old 04-05-2013, 05:53 PM   #16
shi
Wei Shi
 
Location: Australia

Join Date: Feb 2010
Posts: 235
Default

Dear Nash,

You may try out the Subread aligner (http://subread.sourceforge.net), which can align reads as long as 1200bp.

Cheers,
Wei
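
Because aligners differ in the read lengths they accept (1200 bp is the limit quoted above, and the original question concerns reads of 1 to 2.5 kbp), it can help to check the read lengths first. A minimal sketch; the function name fasta_lengths, the file name, and the toy sequences are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print "name<TAB>length" for each record in a FASTA file.
# Sequences wrapped over several lines are summed per record.
fasta_lengths() {
    awk '
        /^>/ { if (name != "") print name "\t" len      # flush the previous record
               name = substr($1, 2); len = 0; next }
        { len += length($0) }                           # accumulate sequence length
        END { if (name != "") print name "\t" len }
    ' "$1"
}

# Toy input, only to show the output format.
printf '>readA\nACGT\nAC\n>readB\nA\n' > reads.fasta
fasta_lengths reads.fasta

# Reads longer than a given limit (here 1200 bp) could then be listed with:
#   fasta_lengths reads.fasta | awk '$2 > 1200'
```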
All times are GMT -8.

