#1 |
Junior Member
Location: netherlands Join Date: Apr 2009
Posts: 1
|
Being a newbie, I have a very simple question.
I am trying to convert SFF-formatted 454 data into FASTA, FASTA quality and XML files using sff_extract under Python 2.6, with the following command:
sff_extract -s proj_in.454.fasta -q proj_in.454.fasta.qual -x proj_traceinfo_in.454.xml FX3UAMY01.sff
This results in a syntax error. Does anyone have a suggestion as to what's wrong?
Thanks,
Veronique
#2 |
Junior Member
Location: Australia Join Date: Sep 2008
Posts: 8
|
Hi,
AFAIK you only use the -s, -q or -x options if you want just one of the files. To get all three, you only need the -o option to set the output name (if you want one), followed by the names of the SFF files. You have to rename the XML file manually afterwards, however. For example,
sff_extract -o proj_in.454 FX3UAMY01.sff
should output proj_in.454.fasta, proj_in.454.fasta.qual and proj_in.454.xml.
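
To make the naming and the manual rename concrete, here is a minimal Python sketch. It only encodes what this post says the -o option writes; the helper names (`expected_outputs`, `traceinfo_name`, `rename_xml`) are hypothetical, and the target name simply mirrors the `proj_traceinfo_in.454.xml` file from the first post.

```python
from pathlib import Path


def expected_outputs(prefix):
    """File names that, according to the post, `sff_extract -o PREFIX` writes."""
    return {
        "fasta": f"{prefix}.fasta",
        "qual": f"{prefix}.fasta.qual",
        "xml": f"{prefix}.xml",
    }


def traceinfo_name(prefix):
    """Assumed target name for the XML file, mirroring the first post:
    'proj_in.454' -> 'proj_traceinfo_in.454.xml'."""
    project, sep, rest = prefix.partition("_")
    if not sep:
        raise ValueError("expected a prefix like 'proj_in.454'")
    return f"{project}_traceinfo_{rest}.xml"


def rename_xml(prefix, directory="."):
    """Rename PREFIX.xml to the traceinfo-style name and return the new path."""
    src = Path(directory) / expected_outputs(prefix)["xml"]
    dst = Path(directory) / traceinfo_name(prefix)
    src.rename(dst)
    return dst
```

Calling `rename_xml("proj_in.454")` after the extraction would then take care of the rename step, assuming the files were written to the current directory.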
#3 |
Senior Member
Location: The University of Melbourne, AUSTRALIA Join Date: Apr 2008
Posts: 275
|
I just use the "sffinfo" tool that 454 distribute.
sffinfo -s file.sff > file.fasta
sffinfo -q file.sff > file.qual
sffinfo -m file.sff > file.manifest.xml

Here is the usage:

Usage: sffinfo [options...] [- | sfffile] [accno...]
Options:
  -s or -seq      Output just the sequences
  -q or -qual     Output just the quality scores
  -f or -flow     Output just the flowgrams
  -t or -tab      Output the seq/qual/flow as tab-delimited lines
  -n or -notrim   Output the untrimmed sequence or quality scores
  -m or -mft      Output the manifest text
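
If you have many SFF files, the three calls above can be wrapped in a small script. This is only a sketch: it assumes `sffinfo` is on your PATH and uses just the -s, -q and -m flags from the usage text; the function names are made up for illustration.

```python
import subprocess

# (flag, output suffix) pairs taken from the three commands in this post.
SFFINFO_OUTPUTS = (("-s", ".fasta"), ("-q", ".qual"), ("-m", ".manifest.xml"))


def sffinfo_jobs(sff_path):
    """Return (argv, output file name) pairs for one SFF file."""
    stem = sff_path[:-4] if sff_path.endswith(".sff") else sff_path
    return [(["sffinfo", flag, sff_path], stem + suffix)
            for flag, suffix in SFFINFO_OUTPUTS]


def run_sffinfo(sff_path):
    """Run each job, redirecting sffinfo's standard output into the file."""
    for argv, out_name in sffinfo_jobs(sff_path):
        with open(out_name, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

`run_sffinfo("file.sff")` would then do the same as the three shell commands with redirection.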
#4 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
What is the error message?
|
#5 |
Member
Location: South East Asia Join Date: Nov 2008
Posts: 44
|
Is there a way I can download "sffinfo"? I searched Google, but to no avail.
|
#6 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
I think sffinfo is not free software.
|
#7 |
Junior Member
Location: Australia Join Date: Sep 2008
Posts: 8
|
I think you can only obtain sffinfo as part of the sfftools package that is supplied by Roche to groups that have the 454 sequencing machines. I'm pretty sure that this was the basis behind the production of sff_extract as many people cannot easily get access to the 454 software to extract the 454 reads from sff files.
|
#8 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
|
However, Roche are generally relaxed about giving end users access; see:
http://seqanswers.com/forums/showthread.php?t=114
Note that the Newbler tools (sffinfo, plus the assembler and read mapper) are for Linux only.
#9 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
I am new to this. I want to convert an Ion Torrent SFF file to FASTQ and XML for assembly with MIRA.
I used the command:
sff_extract -s sample_in.iontor.fastq -x sample_traceinfo_in.iontor.xml sample.sff
and every time I get this error message:
usage: sff_extract [-h] [-o OUTPUT] [-c] [--min_left_clip MIN_LEFT_CLIP] [--max_percentage MAX_PERCENT] [--version] [input [input ...]]
sff_extract: error: unrecognized arguments: -s -x sample_traceinfo_in.iontor.xml sample.sff
Any help?
#10 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
It seems that you are using the new sff_extract included in the seq_crumbs package. This new sff_extract has a new interface, and its behaviour has changed a little. I recommend using the new version; just change the parameters accordingly.
Support for the XML file has been removed because MIRA no longer requires it. If you are sure that you need it for the latest MIRA versions, just let me know and we'll do something.
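
To see why the old flags are rejected, the usage line from the error message can be mimicked with Python's argparse. This is a reconstruction from the quoted usage text only, not the seq_crumbs source: the destinations, the `store_true` action for -c and the version string are assumptions.

```python
import argparse


def build_parser():
    """Parser that accepts the options listed in the quoted usage line."""
    parser = argparse.ArgumentParser(prog="sff_extract")
    parser.add_argument("-o", dest="output")
    parser.add_argument("-c", action="store_true")      # assumed to be a switch
    parser.add_argument("--min_left_clip")
    parser.add_argument("--max_percentage", dest="max_percent")
    parser.add_argument("--version", action="version", version="unknown")
    parser.add_argument("input", nargs="*")
    return parser


def split_arguments(argv):
    """Return (recognised namespace, list of unrecognised arguments)."""
    return build_parser().parse_known_args(argv)


old_style = ["-s", "sample_in.iontor.fastq",
             "-x", "sample_traceinfo_in.iontor.xml", "sample.sff"]
known, unknown = split_arguments(old_style)
print("unrecognised:", unknown)   # -s and -x are not defined in this parser
```

With such a parser, a call of the form `-o OUTPUT input.sff` is accepted, whereas -s and -x end up among the unrecognised arguments, which matches the error shown above.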
#11 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
Thank you for your suggestion, but I am afraid that the latest version of MIRA (mira-3.4.1.1) does need both the XML and FASTQ files for de novo assembly, so I have to use sff_extract to get the output in XML format as well. Can you help me?
|
#12 | |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
From the Mira mailing list:
Quote:
|
|
#13 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
OK, thanks for the info, I did not know this. So the FASTQ file alone is enough to run MIRA for a de novo assembly? If so, should I use "--notraceinfo"?
|
#14 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
You would have to trim the reads before feeding them to MIRA. I guess that the latest version does not require notraceinfo, but I recommend that you ask on the MIRA mailing list, because I haven't used MIRA for a while.
|
#15 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
Dear Blanca,
I asked on the MIRA mailing list, and Bastien recommends that I use the XML files as well for MIRA. He also pointed out that the mate-splitting tools are missing from the latest version of the seq_crumbs package. Additionally, yes, MIRA can use clipped data, but for a better and more accurate result the XML is also required, especially when working with Ion Torrent data. Here's why:

"E.g.: assume a sequence without left clips, but with two right clips. The quality clip at 15, the adaptor clip at 52 (visualised with spaces in the next line)

GTACGATCGAAAAA aaaaaaaaaaattttttttttttttttttttaaaaaa gtgtgtgtgt

Now, Ion has sometimes interesting homopolymer artefacts (not homopolymer errors, but artefacts like above) and MIRA makes sure they get completely clipped. I.e., for MIRA the above sequence becomes

GTACGATCG aaaaaaaaaaaaaaaattttttttttttttttttttaaaaaa gtgtgtgtgt

Note that one of the clips advanced by 5 bases to the left, clipping it another couple of bases and completely hiding the Ion artefact.

Now, if people use clipped sequence only, all MIRA would see is:

GTACGATCGAAAAA

Note how the homopolymer artefact, which was not entirely clipped in the SFF, now still contributes with 5 A to the sequence, and here MIRA is absolutely unable to see the true nature of those (most of the time) totally wrong A bases."

This was illustrated to me by Bastien.
Thanks,
Chayan
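
The effect Bastien describes can be illustrated with a toy Python sketch. This is not MIRA's actual algorithm, only a simple heuristic that moves a right clip leftwards when it cuts through a homopolymer run. The function name is hypothetical, and the clip here is a 0-based exclusive index, which is an assumption that differs from the clip positions given in the quote.

```python
def extend_right_clip(seq, right_clip):
    """Move an exclusive right-clip index left until it no longer
    cuts through a homopolymer run (toy heuristic, not MIRA's code)."""
    bases = seq.upper()
    if 0 < right_clip < len(bases):
        run_base = bases[right_clip]          # first clipped base
        while right_clip > 0 and bases[right_clip - 1] == run_base:
            right_clip -= 1
    return right_clip


# Unclipped read from the quote: kept part, quality-clipped part, adaptor.
kept = "GTACGATCGAAAAA"
raw = kept + "aaaaaaaaaaattttttttttttttttttttaaaaaa" + "gtgtgtgtgt"

# With the full read, the clipped bases show that the trailing A's belong
# to a longer run, so the clip can be pulled back.
full_read_clip = extend_right_clip(raw, len(kept))
print(raw[:full_read_clip])

# With the pre-clipped sequence only, there is nothing after the clip to
# compare against, so the trailing A's stay in.
clipped_only_clip = extend_right_clip(kept, len(kept))
print(kept[:clipped_only_clip])
```

The second call is the point of the example: once the clipped bases are thrown away, the information needed to recognise the artefact is gone.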
#16 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
Don't worry, I'm on that thread too; we'll talk there. In the meantime you can use the old sff_extract.
Regards,
Jose Blanca
#17 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
Dear Blanca,
I am very new to this. Every time I try to download the old sff_extract version, the whole page opens and I don't know what to do with it. What should I do?
#18 |
Member
Location: Valencia, Spain Join Date: Aug 2009
Posts: 70
|
It's a Python script; just run it with:
python sff_extract
#19 |
Member
Location: USA Join Date: Nov 2012
Posts: 51
|
Hi all,
I don't know what to do with this error message. I always get it when trying to run the non-seq_crumbs version of sff_extract:

Traceback (most recent call last):
  File "/home/lab/Desktop/sff_extract_0_3_0", line 39, in <module>
    import tempfile
  File "/home/lab/qiime_software/python-2.7.1-release/lib/python2.7/tempfile.py", line 34, in <module>
    from random import Random as _Random
  File "/home/lab/qiime_software/python-2.7.1-release/lib/python2.7/random.py", line 49, in <module>
    import hashlib as _hashlib
  File "/home/lab/qiime_software/python-2.7.1-release/lib/python2.7/hashlib.py", line 136, in <module>
    globals()[__func_name] = __get_hash(__func_name)
  File "/home/lab/qiime_software/python-2.7.1-release/lib/python2.7/hashlib.py", line 74, in __get_builtin_constructor
    import _sha256
ImportError: No module named _sha256

Any help?
Regards,
Chayan
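
The traceback fails inside the interpreter's own hashlib, before sff_extract does any work, so it may help to test that Python installation on its own. A minimal sketch follows; `check_modules` is a made-up helper name, and the private module names are the ones from the traceback, which may differ in other Python versions.

```python
import importlib


def check_modules(names):
    """Try to import each module and report whether the import succeeded."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    # Run this with the same interpreter that is used to run sff_extract.
    for name, ok in check_modules(["hashlib", "_hashlib", "_sha256"]).items():
        print(name, "OK" if ok else "missing")
```

If `hashlib` itself is reported as missing here, the problem lies in how that Python was built or installed rather than in sff_extract.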
#20 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,091
|
You may be missing the "libsasl2-dev" library. Depending on the flavor of Linux you are using, you will need to find and install it.
|