#1 |
Member
Location: UK Join Date: Jan 2009
Posts: 15
|
Hi, can anybody suggest good assembly programs, other than Newbler and MIRA, that can take .sff files directly as input rather than FASTA?
Also, I have generated an .ace file from Newbler that is not fully compatible with consed (I can open the file in consed, but for some reason the contig numbers look different). Could anybody suggest good programs that I can use to finish a 454-generated genome? Something that will allow me to view the scaffolds and join or break them where needed. I've tried consed and Staden; any others would be greatly appreciated! Thanks in advance! Raj
#2 |
Member
Location: UK Join Date: Jan 2009
Posts: 15
|
...I was informed yesterday that the new version of consed (v18) should now be fully compatible with 454 data.
Also, the proposed release of Gap5 should resolve the incompatibility issues that many programs seem to have when trying to finish 454-generated data. MIRA and Newbler seem to be the best methods for assembling 454 data, since they allow the paired-end data to be taken full advantage of. Finishing is still the bottleneck, which I hope the new versions of Consed and Gap can resolve...
#3 |
Member
Location: Eesti Join Date: Jan 2009
Posts: 37
|
Yes, consed 18 has been out for a few weeks; you need to update phrap as well.
I did not have any problems with installation (32-bit Fedora 10). Anyway, it does not perform de novo assembly of 454 reads, right? However, it reads Newbler .sff files and allows you to assemble 454 reads against a reference sequence. Please correct me if I am wrong...
#4 |
Senior Member
Location: Berlin, DE Join Date: May 2008
Posts: 628
|
...and it can directly read Newbler-created ace files. So if you like Newbler, no problem.
Maybe it's a good starting point for finishing a (shotgun) project if there is no Sanger backbone. A good alternative might be MIRA, which writes a CAF file (which can be easily converted to gap4). But gap4 might slow down if you have a huge dataset... For larger assemblies you might want to have a look at Celera Assembler, which in our hands does a good job with Sanger/454 (FLX) hybrid assemblies in the bacterial genome size range. Just my 2p, Sven
#5 |
Junior Member
Location: Wisconsin Join Date: Jan 2009
Posts: 6
|
Has anyone assembled 454 data with consed package version 19? I'm having some issues with reading the .sff files and am wondering if anyone has completed an assembly of 454 data (not using .ace files produced by the Roche software). I'm using "add454Reads.perl reference.ace sff.fof reference.fa", where the fof lists the locations of the sff files to assemble. Although the script runs, I get the error "doesn't existile /shared/BNFinal/mapping/consed/sff_dir/FPDLD6P02.sff", and the 454 reads are not brought into the assembly; it basically assembles with only the reference sequence. Someone mentioned needing to update phrap, which I will look into, but any other thoughts on this?
Thanks, Liz
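When a tool refuses to read an .sff file, one quick way to rule out a truncated or mislabeled file is to look at the SFF common header directly. The sketch below is only an illustration and is not part of consed: it assumes the standard SFF layout (big-endian fields starting with the magic number `.sff`), and the function name `read_sff_header` and the keys of the returned dictionary are made up for this example.

```python
import struct

# Assumption: standard SFF common header, big-endian:
# magic (4) | version (4) | index_offset (8) | index_length (4) |
# number_of_reads (4) | header_length (2) | key_length (2) |
# number_of_flows_per_read (2) | flowgram_format_code (1)
SFF_MAGIC = 0x2E736666  # the ASCII bytes ".sff"
_FIXED = struct.Struct(">IIQIIHHHB")


def read_sff_header(data: bytes) -> dict:
    """Parse the SFF common header from the first bytes of a file.

    Hypothetical helper; raises ValueError if the bytes do not look
    like the start of an SFF file.
    """
    if len(data) < _FIXED.size:
        raise ValueError("too short to contain an SFF header")
    (magic, version, index_offset, index_length, n_reads,
     header_len, key_len, n_flows, flow_format) = _FIXED.unpack_from(data)
    if magic != SFF_MAGIC:
        raise ValueError("missing '.sff' magic number")
    start = _FIXED.size
    end = start + n_flows + key_len
    if len(data) < end:
        raise ValueError("header is truncated; pass more bytes")
    # Flow characters come first, then the key sequence (e.g. TCAG).
    flow_chars = data[start:start + n_flows].decode("ascii")
    key_sequence = data[start + n_flows:end].decode("ascii")
    return {
        "version": version,
        "number_of_reads": n_reads,
        "header_length": header_len,
        "flow_chars": flow_chars,
        "key_sequence": key_sequence,
        "flowgram_format_code": flow_format,
    }
```

To use it, read the first few kilobytes of the file in binary mode and pass them in, for example `read_sff_header(open(path, "rb").read(4096))`. A `ValueError` here suggests the file itself is the problem; a clean header with a plausible read count suggests the problem is in how the file is located or named.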
#6 | |
Senior Member
Location: Berlin, DE Join Date: May 2008
Posts: 628
|
Hi Liz,
Quote:
You should update to the current version of phrap, as cross_match is updated as well. Phrap is not involved in the task of aligning 454 reads against your refseq; cross_match is used for that. Cheers, Sven
|
#7 |
Junior Member
Location: Wisconsin Join Date: Jan 2009
Posts: 6
|
Hi Sven. Thanks for the post. I checked that a few times to make sure I'm not going crazy, and yes, the sff file is where I specified in the fof. Here are the steps I'm following. Any help much appreciated:
1. Ran gsMapper (through the UI) using the option to create a complete consed folder.
2. Deleted the .consedrc file that Newbler created in edit_dir (per v19 instructions).
3. Deleted the phd.ball link in edit_dir (per v19 instructions).
4. Checked that the current version of sff2scf is the one to be used. Typing "sff2scf -v" gives "080714".
5. Created an .ace file from the appropriate FASTA-format reference sequence: fasta2Ace.perl reference.fa
6. Created a sff.fof containing the name of the appropriate sff files; I used a single .sff file. The sff.fof therefore contains only the name of the .sff file, "FMAAUWB12.sff"; no path etc. The sff.fof file is located in edit_dir, and from there the FMAAUWB12.sff file is in ../sff_dir.
7. To add the reads, ran from the edit_dir directory: add454Reads.perl reference.ace sff.fof reference.fa
8. Got the following output:

doesn't existile FMAAUWB12.sff
0.0 minutes to until done with alignments
now using alignments to add reads to ace file
executing: /usr/local/genome/bin/consed -ace reference.ace -addReads alignmentFiles090603_134426.fof -chem 454
-addReads will be run.
no ~/.consedrc file so no user resources will be used--that's ok
no ./.consedrc file so no project-specific resources--that's ok
couldn't open readOrder.txt--that's ok
50% done. 1 reads read so far...
Now setting quality values
opening ../phdball_dir/phd.ball.1
read phd files in ../phdball_dir/phd.ball.1 found: 1 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 2 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 3 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 4 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 5 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 6 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 7 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 8 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 9 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 1000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 2000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 3000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 4000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 5000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 6000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 7000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 8000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 9000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 10,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 20,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 30,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 40,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 50,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 60,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 70,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 80,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 90,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 100,000 totals: used: 0 need: 1
Number of phd blocks used from ../phdball_dir/phd.ball.1: 0
exception thrown: RatReninRegion has no phd file
ace file: RatReninRegion.ace
Version 19.0 (090206)
RatReninRegion has no phd file
Version 19.0 (090206)
ace file: RatReninRegion.ace
Number of individual phd files read: 0
Total reads in assembly: 1
Finished setting quality values in 3 seconds
total errors on consed startup: 1
now saving assembly... 3
writing ./RatReninRegion.ace.1
See new ace file RatReninRegion.ace.1
done 0
0.0 minutes cross_match and fasta time
0.1 minutes consed time
0.1 minutes total time

Again, any assistance much appreciated, Liz
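Since the file named in the fof is reported as missing even though it is present, one thing worth ruling out is that the entry in sff.fof is not exactly what it appears to be. The scrambled wording of the error might hint at an invisible character such as a carriage return in the name, but that is only a guess. The sketch below is a hedged illustration, not part of consed: `check_fof` and its `search_dirs` parameter are hypothetical, and which directories add454Reads.perl actually searches is an assumption to verify against the consed documentation.

```python
import os


def check_fof(fof_path, search_dirs=(".",)):
    """Report hidden characters and file existence for each fof entry.

    Hypothetical helper. `search_dirs` lists directories in which a
    bare file name might be looked up (an assumption, e.g. "../sff_dir").
    """
    results = []
    # newline="" keeps any carriage returns instead of translating them.
    with open(fof_path, "r", newline="") as handle:
        for raw in handle:
            entry = raw.rstrip("\n")   # drop only the Unix line ending
            name = entry.strip()       # what the entry probably means
            if not name:
                continue               # skip blank lines
            candidates = [os.path.join(d, name) for d in search_dirs]
            results.append({
                "raw": repr(entry),            # repr() exposes \r, tabs, spaces
                "clean": name,
                "has_hidden_chars": entry != name,
                "found_at": [p for p in candidates if os.path.isfile(p)],
            })
    return results
```

Run from edit_dir, a call such as `check_fof("sff.fof", search_dirs=(".", "../sff_dir"))` shows each entry as a `repr()` string, so a trailing `\r` or stray space becomes visible. An empty `found_at` list for a clean name would instead suggest that the entry needs a path rather than just the file name.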
Tags |
454, assembly, bioinformatics, finishing, genome |