SEQanswers

Old 01-08-2009, 12:51 AM   #1
Raj
Member
 
Location: UK

Join Date: Jan 2009
Posts: 15
Assembling .sff files from 454 and finishing

Hi, can anybody suggest good assembly programs, other than Newbler and MIRA, that can take .sff files directly as input rather than FASTA?

Also, I have generated an .ace file from Newbler which is not fully compatible with consed (I can open the file in consed, but for some reason the contig numbers look different). Could anybody suggest good programs that I can use to finish a 454-generated genome? Something that will allow me to view the scaffolds and join or break them where needed.
I've tried consed and Staden; any others would be greatly appreciated!


Thanks in advance!

Raj
Old 01-16-2009, 02:21 AM   #2
Raj
Member
 
Location: UK

Join Date: Jan 2009
Posts: 15

...I was informed yesterday that the new version of consed (v18) should now be fully compatible with 454 data.
Also, the proposed release of Gap5 should resolve the incompatibility issues that many programs seem to have when trying to finish 454-generated data.

MIRA and Newbler seem to be the best options for assembling 454 data, so that the paired-end data can be fully taken advantage of.

Finishing is still the bottleneck, which I hope the new versions of consed and Gap can resolve...
Old 01-20-2009, 05:11 AM   #3
v_kisand
Member
 
Location: Eesti

Join Date: Jan 2009
Posts: 37

Yes, consed 18 has been out for a few weeks; you need the update for phrap as well.
I did not have any problems with installation (32-bit Fedora 10).

Anyway, it does not perform de novo assembly of 454 reads, right? However, it reads Newbler .sff files and allows you to assemble 454 reads against a reference sequence.

Please correct me if I am wrong...
Old 01-20-2009, 10:10 AM   #4
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

.. and it can directly read Newbler-created ace files. So if you like Newbler, no problem.
Maybe it's a good starting point for finishing a (shotgun) project if there is no Sanger
backbone.

A good alternative might be MIRA, which writes a CAF file (which can easily be converted
to gap4). But gap4 might slow down if you have a huge dataset ...

For larger assemblies you might want to have a look at Celera Assembler, which in our
hands does a good job with Sanger/454 (FLX) hybrid assemblies in the bacterial genome
size range.

Just my 2p,
Sven
Old 06-03-2009, 06:03 AM   #5
mjleaks
Junior Member
 
Location: Wisconsin

Join Date: Jan 2009
Posts: 6
assembly issues

Has anyone assembled 454 data with consed package version 19? I'm having some issues with reading the .sff files and am wondering if anyone has completed an assembly of 454 data (not using .ace files produced by the Roche software). I'm using "add454Reads.perl reference.ace sff.fof reference.fa", where the fof specifies the location of the sff files to assemble. Although the script runs, I get the error "doesn't existile /shared/BNFinal/mapping/consed/sff_dir/FPDLD6P02.sff", and the 454 reads are not brought into the assembly; it basically assembles with only the reference sequence. Someone mentioned needing to update phrap, which I will look into, but any other thoughts on this?
Thanks,
Liz
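
Before digging into the add454Reads.perl step, it may help to confirm that the file itself is a readable SFF. The sketch below is only an illustration, not part of consed or the Roche tools: it reads the fixed 31-byte part of the SFF common header (magic number ".sff", big-endian fields) and reports the number of reads. The function names parse_sff_common_header and read_sff_header are made up for this example.

```python
import struct

SFF_MAGIC = b".sff"

# Fixed part of the SFF common header, big-endian, 31 bytes in total:
# magic(4) version(4) index_offset(8) index_length(4) number_of_reads(4)
# header_length(2) key_length(2) number_of_flows_per_read(2) flowgram_format_code(1)
COMMON_HEADER = struct.Struct(">4s4sQIIHHHB")


def parse_sff_common_header(data: bytes) -> dict:
    """Decode the fixed part of an SFF common header from raw bytes."""
    if len(data) < COMMON_HEADER.size:
        raise ValueError("too short to be an SFF file")
    (magic, version, index_offset, index_length, number_of_reads,
     header_length, key_length, flows_per_read, flowgram_format) = (
        COMMON_HEADER.unpack_from(data))
    if magic != SFF_MAGIC:
        raise ValueError(f"bad magic number {magic!r}; not an SFF file")
    return {
        "version": version,
        "index_offset": index_offset,
        "index_length": index_length,
        "number_of_reads": number_of_reads,
        "header_length": header_length,
        "key_length": key_length,
        "number_of_flows_per_read": flows_per_read,
        "flowgram_format_code": flowgram_format,
    }


def read_sff_header(path: str) -> dict:
    """Read only the first 31 bytes of the file at path and decode them."""
    with open(path, "rb") as handle:
        return parse_sff_common_header(handle.read(COMMON_HEADER.size))
```

Calling read_sff_header on the .sff path from the fof should either return a dictionary with a plausible number_of_reads or raise ValueError, which would separate "the file is missing or damaged" from "the file is fine but the script cannot find it".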
Old 06-03-2009, 08:50 AM   #6
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628

Hi Liz,

Quote:
Originally Posted by mjleaks View Post
Has anyone assembled 454 data with consed package version 19? I'm having some issues with reading the .sff files and am wondering if anyone has completed an assembly of 454 data (not using .ace files produced by the Roche software). I'm using "add454Reads.perl reference.ace sff.fof reference.fa", where the fof specifies the location of the sff files to assemble. Although the script runs, I get the error "doesn't existile /shared/BNFinal/mapping/consed/sff_dir/FPDLD6P02.sff", and the 454 reads are not brought into the assembly; it basically assembles with only the reference sequence. Someone mentioned needing to update phrap, which I will look into, but any other thoughts on this?
Thanks,
Liz
Well, it seems that there is no /shared/BNFinal/mapping/consed/sff_dir/FPDLD6P02.sff ... have you checked the location of your SFF file(s)?

You should update to the current version of phrap, as cross_match is updated as well. Phrap is not involved in the task of aligning 454 reads against your refseq; cross_match is used for that.
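
A quick way to run that location check over every entry in the fof is sketched below. It is a hypothetical helper, not part of the consed package. Besides testing whether each listed file exists, it flags hidden characters: the mangled wording of the error ("doesn't existile ...") is what a terminal would print if the file name ended in a carriage return, for example from a fof saved with Windows line endings. That is a guess, not something confirmed in this thread.

```python
import os


def inspect_fof_bytes(raw: bytes):
    """Return (entry, problems) pairs for each non-empty line of a fof file."""
    results = []
    for line in raw.split(b"\n"):
        if not line.strip():
            continue  # skip blank lines
        problems = []
        if line.endswith(b"\r"):
            problems.append("trailing carriage return (Windows line ending)")
        without_cr = line.rstrip(b"\r")
        if without_cr != without_cr.strip():
            problems.append("leading or trailing whitespace")
        entry = without_cr.strip().decode("utf-8", errors="replace")
        results.append((entry, problems))
    return results


def check_fof(fof_path: str, base_dir: str = "."):
    """Print, for each fof entry, whether the file exists and any hidden characters."""
    with open(fof_path, "rb") as handle:
        raw = handle.read()
    for entry, problems in inspect_fof_bytes(raw):
        # Relative entries are resolved against base_dir (an assumption of this sketch).
        full_path = entry if os.path.isabs(entry) else os.path.join(base_dir, entry)
        status = "found" if os.path.isfile(full_path) else "MISSING"
        note = "; ".join(problems) if problems else "no hidden characters"
        print(f"{status}: {full_path} ({note})")
```

For example, check_fof("sff.fof", "../sff_dir") run from edit_dir would report each entry as found or MISSING relative to the sff directory.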

cheers,
Sven
Old 06-03-2009, 10:55 AM   #7
mjleaks
Junior Member
 
Location: Wisconsin

Join Date: Jan 2009
Posts: 6

Hi Sven. Thanks for the post. I checked that a few times to make sure I'm not going crazy, and yes, the sff file is where I specified in the fof. Here are the steps I'm following. Any help much appreciated:

1. Ran gsMapper (through the UI) using the option to create a complete consed folder.

2. Deleted the .consedrc file that Newbler created in edit_dir (per v19 instructions).

3. Deleted the phd.ball link in edit_dir (per v19 instructions).

4. Checked that the current version of sff2scf is the one being used: typing "sff2scf -v" gives "080714".

5. Created an .ace file from the FASTA-format reference sequence: fasta2Ace.perl reference.fa

6. Created an sff.fof containing the names of the appropriate sff files. I used a single .sff file, so sff.fof contains only the file name FMAAUWB12.sff, with no path. The sff.fof file is located in edit_dir, and from there the FMAAUWB12.sff file is in ../sff_dir.

7. Added reads from the edit_dir directory by running: add454Reads.perl reference.ace sff.fof reference.fa

8. Got:
doesn't existile FMAAUWB12.sff
0.0 minutes to until done with alignments
now using alignments to add reads to ace file
executing: /usr/local/genome/bin/consed -ace reference.ace -addReads alignmentFiles090603_134426.fof -chem 454
-addReads will be run.
no ~/.consedrc file so no user resources will be used--that's ok
no ./.consedrc file so no project-specific resources--that's ok
couldn't open readOrder.txt--that's ok
50% done. 1 reads read so far...
Now setting quality values
opening ../phdball_dir/phd.ball.1
read phd files in ../phdball_dir/phd.ball.1 found: 1 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 2 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 3 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 4 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 5 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 6 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 7 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 8 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 9 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 1000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 2000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 3000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 4000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 5000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 6000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 7000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 8000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 9000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 10,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 20,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 30,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 40,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 50,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 60,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 70,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 80,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 90,000 totals: used: 0 need: 1
read phd files in ../phdball_dir/phd.ball.1 found: 100,000 totals: used: 0 need: 1
Number of phd blocks used from ../phdball_dir/phd.ball.1: 0
exception thrown: RatReninRegion has no phd file

ace file: RatReninRegion.ace
Version 19.0 (090206)
RatReninRegion has no phd file

Version 19.0 (090206)
ace file: RatReninRegion.ace
Number of individual phd files read: 0
Total reads in assembly: 1
Finished setting quality values in 3 seconds
total errors on consed startup: 1
now saving assembly... 3
writing ./RatReninRegion.ace.1
See new ace file RatReninRegion.ace.1
done 0
0.0 minutes cross_match and fasta time
0.1 minutes consed time
0.1 minutes total time

Again, any assistance much appreciated,
Liz
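
One thing that could be tried with the setup described in step 6 is to regenerate sff.fof with absolute paths and plain Unix line endings. The thread does not settle whether add454Reads.perl expects bare file names or full paths, so treat this as an experiment. The helper names build_fof_text and write_fof are invented for this sketch and are not part of consed.

```python
import os


def build_fof_text(sff_dir: str, names=None) -> str:
    """Return fof contents: one absolute .sff path per line, "\n" endings only."""
    sff_dir = os.path.abspath(sff_dir)
    if names is None:
        # Default to every .sff file found in the directory, in sorted order.
        names = sorted(n for n in os.listdir(sff_dir) if n.endswith(".sff"))
    return "".join(os.path.join(sff_dir, name) + "\n" for name in names)


def write_fof(fof_path: str, sff_dir: str) -> str:
    """Write the fof file and return the text that was written."""
    text = build_fof_text(sff_dir)
    # newline="\n" prevents line-ending translation on platforms that use "\r\n".
    with open(fof_path, "w", newline="\n") as handle:
        handle.write(text)
    return text
```

Run from edit_dir, write_fof("sff.fof", "../sff_dir") would list every .sff file in the sff directory by its full path.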

Tags
454, assembly, bioinformatics, finishing, genome

All times are GMT -8.