Seqanswers Leaderboard Ad

**kmcarr** · 02-21-2010, 02:04 PM

According to its documentation, sff_extract (http://bioinf.comav.upv.es/sff_extract/index.html) can do this. I have used sff_extract, but not on paired-end data, so I can't offer any first-hand information.

**forevermark4** · 02-21-2010, 02:46 PM

Hi everyone

I have just started working with next-generation sequencing data and have the following query. Could you help me with this kind of simulation and reassembly problem? How can I generate reads from a sequence? I have a FASTA file. I think the maq tool could be used for the simulation, but I have not been able to work it out.

The aim is to set up a simulation of reassembling a sequence from NGS data. It will build up from reassembling a simple 1 Mb sequence with no repeats in the haploid state to including genetic variation and polyploidy.
- Simulate an NGS run from a 1 Mb segment of human sequence with few or no repeats. Average fragment size 500 bp, normally distributed. Paired-end with 75 bp reads. Assume perfect sequencing. Check out other simulation methods.
- Align the reads back to the 1 Mb sequence. How much variation in coverage is there?
- Reassemble the reads WITHOUT using the reference sequence.

Thanks
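The first two steps in the list above (simulating error-free paired-end reads and checking coverage) can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not maq's simulator: the function names, the 50 bp standard deviation, and the 20x target coverage are all made up for the example, since the post gives only the 500 bp mean and 75 bp read length. Because sequencing is assumed perfect, read positions are known, so coverage is computed directly here rather than by alignment.

```python
import random

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def read_fasta(path):
    """Return the sequence of the first record in a FASTA file."""
    chunks = []
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                if chunks:
                    break  # a second record starts here; ignore the rest
                continue
            chunks.append(line.strip())
    return "".join(chunks)

def simulate_pairs(ref, coverage=20, read_len=75, frag_mean=500, frag_sd=50, seed=0):
    """Yield (start, frag_len, read1, read2) for error-free paired-end reads.

    coverage, frag_sd and seed are assumed values chosen for illustration.
    """
    if len(ref) < read_len:
        raise ValueError("reference is shorter than one read")
    rng = random.Random(seed)
    n_pairs = int(coverage * len(ref) / (2 * read_len))
    for _ in range(n_pairs):
        # Draw a normally distributed fragment length; redraw if it cannot
        # hold one read or does not fit inside the reference.
        while True:
            frag_len = int(round(rng.gauss(frag_mean, frag_sd)))
            if read_len <= frag_len <= len(ref):
                break
        start = rng.randint(0, len(ref) - frag_len)
        fragment = ref[start:start + frag_len]
        read1 = fragment[:read_len]                       # forward mate
        read2 = reverse_complement(fragment[-read_len:])  # reverse mate
        yield start, frag_len, read1, read2

def coverage_from_pairs(ref_len, pairs, read_len=75):
    """Per-base read depth computed from the known read positions."""
    depth = [0] * ref_len
    for start, frag_len, _, _ in pairs:
        for pos in range(start, start + read_len):
            depth[pos] += 1
        for pos in range(start + frag_len - read_len, start + frag_len):
            depth[pos] += 1
    return depth

if __name__ == "__main__":
    rng = random.Random(1)
    # Toy random reference standing in for the 1 Mb segment;
    # use read_fasta("your.fasta") for real data.
    ref = "".join(rng.choice("ACGT") for _ in range(10_000))
    pairs = list(simulate_pairs(ref))
    depth = coverage_from_pairs(len(ref), pairs)
    print(len(pairs), min(depth), max(depth), sum(depth) / len(depth))
```

Depth drops toward both ends of the reference because few fragments can start or end there, which is one source of the coverage variation asked about. The third step, de novo reassembly, needs a real assembler and is not covered by this sketch.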

**themerlin** · 02-21-2010, 05:18 PM

I have had good luck with sff_extract. All you need is the linker sequence, insert length and insert length standard deviation. Then you run:

sff_extract -l linker.fasta yoursff.sff -i "insert_size:XXXX, insert_stdev:XXX" -o prefix

-Jason
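For anyone wondering why the linker sequence is needed: a 454 paired-end read contains both mates joined by the linker, so extracting the pair amounts to finding the linker and cutting the read there. The sketch below shows only that idea, using exact string matching. It is not how sff_extract is implemented; a real tool has to tolerate sequencing errors in the linker, which exact matching does not. The function name, the `min_len` parameter, and the linker string are hypothetical placeholders.

```python
def split_at_linker(read, linker, min_len=20):
    """Split a read into (left, right) mates around the first exact linker match.

    Returns None if the linker is not found, or if either mate is shorter
    than min_len bases; such reads would be treated as unpaired.
    """
    idx = read.find(linker)
    if idx == -1:
        return None
    left = read[:idx]
    right = read[idx + len(linker):]
    if len(left) < min_len or len(right) < min_len:
        return None
    return left, right

if __name__ == "__main__":
    toy_linker = "GGTTCCAACC"  # placeholder, not a real 454 linker
    read = "ACGTACGTAC" * 3 + toy_linker + "TTGACCATGA" * 3
    print(split_at_linker(read, toy_linker))
```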

**maven** · 02-22-2010, 07:53 AM

This can be done with 454 software too, although there are bound to be differences in the result based on the specifics of the linker-recognition algorithms.

runAssembly -tr -noa -no myfile.sff

It's not the friendliest of output in that it generates an assembly directory and a few extra files that are unneeded for this use case, but it gets the job done. I've done this with version 2.3, I don't know about earlier versions.

**pmiguel** · 02-22-2010, 08:50 AM

Originally posted by maven:

This can be done with 454 software too, although there are bound to be differences in the result based on the specifics of the linker-recognition algorithms.

runAssembly -tr -noa -no myfile.sff

It's not the friendliest of output in that it generates an assembly directory and a few extra files that are unneeded for this use case, but it gets the job done. I've done this with version 2.3, I don't know about earlier versions.

That looks like just what I want. Alas:

runAssembly -tr -noa -no GB71BC401.sff

gives me:

Error: Invalid option: -noa.
Usage: runAssembly [-o projdir] [-nrm] [-p (sfffile | [regionlist:]analysisDir)]... (sfffile | [regionlist:]analysisDir)...

I am running v. 2.3

--
Phillip

**westerman** · 02-22-2010, 09:05 AM

The closest option to '-noa' is '-noace' which skips the output of ACE files, etc.

**maven** · 02-22-2010, 09:09 AM

-noa is supposed to tell it to not actually bother doing the assembly itself. The -no option turns off most output generation, since the goal here is to just generate the split fasta (and qual) file. Both options are .... optional ... in the sense that once it gets past the first stage of the assembly you can manually kill it if you don't want to sit around waiting for an assembly to complete. The fasta file should still be there, as it's generated prior to actually starting the assembly.

**pmiguel** · 02-22-2010, 09:17 AM

Originally posted by maven:

-noa is supposed to tell it to not actually bother doing the assembly itself. The -no option turns off most output generation, since the goal here is to just generate the split fasta (and qual) file. Both options are .... optional ... in the sense that once it gets past the first stage of the assembly you can manually kill it if you don't want to sit around waiting for an assembly to complete. The fasta file should still be there, as it's generated prior to actually starting the assembly.

Alright! Leaving out the -noa worked. It did create a new assembly directory and do the assembly, but that didn't take long.

Thanks!
--
Phillip


How to extract paired-end reads from .sff 454?
