SEQanswers

Old 07-18-2010, 07:42 AM   #1
litali
Member
 
Location: us

Join Date: Jul 2010
Posts: 78
Default maq for 454 data?

Hi,
Is it possible to use Maq for 454 data? Which input files are needed? If not, is there anything similar for 454?
Old 07-22-2010, 06:34 PM   #2
Naujv
Junior Member
 
Location: USA

Join Date: Jul 2010
Posts: 6
Default

Hi litali,

I'm not sure it's the right thing to do; the MAQ website FAQ section actually says: "Maq maps short reads to the reference and calls the genotypes from the alignment. It is specifically designed for Illumina-Solexa/AB-SOLiD reads, not for 454 or capillary ones." Personally, I'd like to use it too, since I'm not too hot on the Roche software.

If you want to run it and see for yourself, you can convert your sff files into Sanger FASTQ (two methods below), make a reference FASTA file, and follow the commands shown at the MAQ website (there's an easyrun).

Join qual and fna file into fastq:
(a) http://seqanswers.com/forums/showthread.php?t=2775
Convert sff into fastq:
(b) There's also sff2fastq at github.

The files came out the same with either method.
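
For the fna + qual route, the join can be sketched in a few lines of Python. This is a minimal sketch, not the script from the linked thread: the function names `read_records` and `fna_qual_to_fastq` are made up for illustration, and it assumes both files list the same reads in the same order, with space-separated integer scores in the .qual file.

```python
import io


def read_records(handle):
    """Yield (name, body_lines) for each '>' record in a FASTA-style file."""
    name, lines = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, lines
            # Keep only the read ID, dropping anything after the first space.
            name, lines = line[1:].split()[0], []
        else:
            lines.append(line)
    if name is not None:
        yield name, lines


def fna_qual_to_fastq(fna, qual, out):
    """Write Sanger FASTQ (Phred+33) from paired .fna and .qual handles.

    Assumption: records appear in the same order in both inputs. If one
    input has extra records at the end, they are silently ignored.
    """
    for (seq_name, seq_lines), (qual_name, qual_lines) in zip(
        read_records(fna), read_records(qual)
    ):
        if seq_name != qual_name:
            raise ValueError(f"record order differs: {seq_name} vs {qual_name}")
        seq = "".join(seq_lines)
        scores = [int(q) for q in " ".join(qual_lines).split()]
        if len(seq) != len(scores):
            raise ValueError(f"{seq_name}: {len(seq)} bases, {len(scores)} scores")
        # Sanger FASTQ stores each score as the character with code score + 33.
        encoded = "".join(chr(q + 33) for q in scores)
        out.write(f"@{seq_name}\n{seq}\n+\n{encoded}\n")


# Tiny in-memory example; with real data, pass open file handles instead.
fna = io.StringIO(">read1 length=4\nACGT\n")
qual = io.StringIO(">read1 length=4\n40 30 20 0\n")
out = io.StringIO()
fna_qual_to_fastq(fna, qual, out)
print(out.getvalue(), end="")
```

The length check is worth keeping: a mismatch between bases and scores is the most common sign that the two files do not belong together.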

Last edited by Naujv; 07-23-2010 at 12:39 AM.
Old 07-23-2010, 10:09 AM   #3
Naujv
Junior Member
 
Location: USA

Join Date: Jul 2010
Posts: 6
Default

Hi litali,

I just tried using maq on my 454 data. What I'm seeing in the alignment (maq mapview all.map > $someoutputfile) is that my sequences are being cut off at 34 nts.
Old 07-23-2010, 12:45 PM   #4
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747
Default

Try bwasw (a mode of bwa)
Old 07-23-2010, 01:28 PM   #5
jgibbons1
Senior Member
 
Location: Worcester, MA

Join Date: Oct 2009
Posts: 130
Default

I'm pretty sure MAQ can only map reads of 63 bp or shorter.
Old 07-23-2010, 02:59 PM   #6
Naujv
Junior Member
 
Location: USA

Join Date: Jul 2010
Posts: 6
Default

krobison, thanks! I appreciate your input.

Took your advice and tried bwasw, but sort of ran into a problem with my alignments. My CIGAR strings have "S" in them. Found another post where I guess there's a problem with the CIGAR??

If you have time, I would like your thoughts (and others') regarding using bwasw with a set of reference sequences (not a whole genome and not whole chromosomes). Mine is made up of 100 non-overlapping sequences in FASTA format.
Old 07-23-2010, 04:34 PM   #7
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

The "S" character indicates soft-clipping, which is described in the SAM specification. If you still think it is a bug, could you post the SAM record in question?
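
To make the soft-clipping point concrete, here is a small Python sketch that splits a CIGAR string into its operations and counts how many read bases are soft-clipped. The helper names (`parse_cigar`, `summarize`) and the CIGAR value "5S30M2S" are made up for illustration; the sets of operations that consume read or reference bases follow the SAM specification.

```python
import re

# One CIGAR operation: a length followed by one of the SAM operation letters.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_cigar(cigar):
    """Split a CIGAR string into a list of (length, operation) tuples."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Rebuild the string to make sure nothing was skipped by the regex.
    if not ops or "".join(f"{n}{op}" for n, op in ops) != cigar:
        raise ValueError(f"not a valid CIGAR string: {cigar!r}")
    return ops


def summarize(cigar):
    """Count read bases, reference bases, and soft-clipped bases."""
    ops = parse_cigar(cigar)
    return {
        # M, I, S, = and X consume bases of the read.
        "read_length": sum(n for n, op in ops if op in "MIS=X"),
        # M, D, N, = and X consume bases of the reference.
        "reference_span": sum(n for n, op in ops if op in "MDN=X"),
        # S bases stay in the SEQ field but are not part of the alignment.
        "soft_clipped": sum(n for n, op in ops if op == "S"),
    }


print(summarize("5S30M2S"))
```

A read with large "S" counts on both ends is only locally aligned, which is what a Smith-Waterman style mode such as bwasw is expected to report for long reads.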
Old 07-23-2010, 06:14 PM   #8
Naujv
Junior Member
 
Location: USA

Join Date: Jul 2010
Posts: 6
Default

nils, thanks for the help! i'm new (as in today new) to bwa. it may not be an error/bug, though when i tried to look for where the alignment is, i couldn't find it. the sequence looks like one big ugly repeat, so maybe this is spurious. maybe you can help me understand what the 40 in the line is? mapping quality? where do good and bad lie?

GKTESVC03GKDWH 16 ref|NG_023054.1|:5000-113024 77792 40 46S48M143S * 0 0 ATTCCATTCCATTCCATTCGGTTTNAACGGTATTCCAATCGATTCCATTCCATTCCATTCCATTCCATTCCATTCCATTCCTTTCCATTCCATTACGGATGATTCCATTCCATTGCATTCCATTCCATTCCATTCCCCTGTACTCGGGTTGATTCCATTCCATTCCATTCCAATCCATGCCATTCCACTCGTGTTGATTCCATTCTTTCCATTCCATTCAAGTTGATTCCATTCCAT .199;:992131111:.,.--,,,!--.--17995566999:=BBABBBBBDDDAAA????DAAAAADBBBAA>=<900000..22:;9;;<62444444<<==>=>>>>>AB===A?????DDDDFFDDFFFF;;99<[email protected]@<<44488ABBBBDDDFFF[email protected]A????FCCDDHF
AS:i:44 XS:i:0 XF:i:2 XE:i:6 XN:i:0
Old 07-23-2010, 11:23 PM   #9
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Have a really close read of the SAM specification. You will be going back to it quite a bit. The 5th column is the PHRED-scaled mapping quality. Looking at the CIGAR field (6th column), "46S48M143S", there seem to be 48 bases aligned to your reference (M covers both matches and mismatches), with the first 46 and last 143 soft-clipped.
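
As a rough illustration of what "PHRED-scaled" means for the 5th column, the Python sketch below pulls the first six columns out of the record and converts the mapping quality into an error probability. The function name `mapq_to_error_prob` is made up for this example. Real SAM files are tab-delimited; the record is split on whitespace here only because that is how it appears in the post.

```python
def mapq_to_error_prob(mapq):
    """Convert a PHRED-scaled mapping quality to P(mapping position is wrong)."""
    if mapq == 255:
        # The SAM specification reserves 255 for "mapping quality not available".
        raise ValueError("MAPQ 255 means the mapping quality is not available")
    return 10 ** (-mapq / 10)


# First six columns of the posted record: QNAME FLAG RNAME POS MAPQ CIGAR.
record = "GKTESVC03GKDWH 16 ref|NG_023054.1|:5000-113024 77792 40 46S48M143S"
qname, flag, rname, pos, mapq, cigar = record.split()
flag, pos, mapq = int(flag), int(pos), int(mapq)

# Bit 0x10 of FLAG is set when the read aligned to the reverse strand.
is_reverse = bool(flag & 0x10)
error_prob = mapq_to_error_prob(mapq)

print(f"{qname}: position {pos} on {rname}, reverse strand: {is_reverse}")
print(f"MAPQ {mapq} -> estimated probability of a wrong position: {error_prob:g}")
```

Higher is better: a MAPQ of 40 corresponds to an estimated 1 in 10,000 chance that the reported position is wrong, while a MAPQ of 0 gives no confidence in the position at all. How well that estimate holds for a repetitive sequence like this one depends on the aligner's model.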
Old 07-24-2010, 01:20 AM   #10
Naujv
Junior Member
 
Location: USA

Join Date: Jul 2010
Posts: 6
Default

Thank you! Going through the 2008 MAQ paper now.
Old 07-26-2010, 12:02 AM   #11
geschickten
Member
 
Location: India

Join Date: Jul 2009
Posts: 31
Default

Hi All,

We have a version of MAQ that works with 125 bp Illumina reads, and we also have a version that works with 454 data. It's not in the open domain. If anybody is interested, please send me a request at [email protected]. Thanks.