SEQanswers

Old 02-02-2014, 05:40 PM   #1
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default 454 sequencing file formats

I am trying to do an assembly, and my file is in .qual format. I tried Geneious and Velvet, but they don't support .qual, so I converted the file to FASTA and then ran the assembly. Geneious and Velvet both accept the file now, but Geneious reports "NO CONTIGS IS FOUND" and Velvet's contig file is empty. I don't know what I am doing wrong. Please reply...

Last edited by paa6; 02-03-2014 at 04:49 PM.
Old 02-02-2014, 05:51 PM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,127
Default

Is this data from Illumina sequencing? Illumina data has been in FASTQ format (with some changes to the way the quality values were encoded) over the years.

Can you post a small example from a file you are referring to as being of .qual format?
Old 02-02-2014, 06:29 PM   #3
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

Actually, my file is in .qual format... I guess in this file format the sequence is stored as numbers. This is the file I got from my PI. There is also another file in .sff format, but he wants me to assemble the .qual file first.

One thing I have understood while using Geneious is that you need forward and reverse reads to perform an assembly, but in my case I have only one read file, so I don't know how to assemble it...
Old 02-02-2014, 06:42 PM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,127
Default

You appear to have data from a SOLiD sequencer. I have no experience with data from a SOLiD instrument. Hopefully someone else on the forum will step in to help.

Meanwhile see some past threads related to this topic:
http://seqanswers.com/forums/showthread.php?t=3391
http://seqanswers.com/forums/showthread.php?t=15422

Last edited by GenoMax; 02-02-2014 at 06:46 PM.
Old 02-02-2014, 08:54 PM   #5
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

Thank you so much... but does this mean that my data is not from Illumina sequencing? Then what kind of sequence data is this? I mean, what should I call it? I am looking for a name.
Old 02-03-2014, 12:08 AM   #6
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Default

Well, it seems that you have a Roche 454 dataset. The "qual" file is only the quality portion of the dataset (missing the fna = sequence file). The "original" or "raw" file, containing all sequence information is the SFF file you have mentioned.

So either ask your PI to also provide the corresponding FASTA (fna) sequence file, or create the sequence files yourself from the SFF file using publicly available tools (sff_extract, SFF Tools from Roche).

But AFAIR "geneious" can directly import SFF files, which should get the most out of the data (and is probably the safest way).
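
To illustrate why the .qual file alone cannot be assembled, here is a minimal Python sketch that pairs a .fna file with its .qual file to build FASTQ records. It assumes the usual FASTA-style layout for both files (a ">" header line followed by bases in the .fna, or by space-separated Phred scores in the .qual). The function names and the two tiny records are invented for illustration; this is not how sff_extract or the Roche tools work internally.

```python
def parse_fasta_like(text):
    """Parse FASTA-style text into {read_id: [payload lines]}."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].split()[0]  # read ID is the first word of the header
            records[current] = []
        elif current is not None:
            records[current].append(line)
    return records


def merge_to_fastq(fna_text, qual_text):
    """Pair bases (.fna) with Phred scores (.qual) and return FASTQ text."""
    seqs = {k: "".join(v) for k, v in parse_fasta_like(fna_text).items()}
    quals = {k: [int(q) for q in " ".join(v).split()]
             for k, v in parse_fasta_like(qual_text).items()}
    out = []
    for read_id, seq in seqs.items():
        scores = quals.get(read_id)
        if scores is None or len(scores) != len(seq):
            raise ValueError(f"missing or mismatched quality record for {read_id}")
        # Sanger-style FASTQ encodes a Phred score q as the character chr(q + 33)
        encoded = "".join(chr(q + 33) for q in scores)
        out.append(f"@{read_id}\n{seq}\n+\n{encoded}\n")
    return "".join(out)


# Hypothetical two-read example (IDs, bases and scores are invented)
fna = ">read1 length=4\nACGT\n>read2 length=3\nTTG\n"
qual = ">read1 length=4\n36 35 35 37\n>read2 length=3\n40 39 20\n"
fastq = merge_to_fastq(fna, qual)
print(fastq)
```

A .qual file on its own carries no bases, only scores, so converting it to FASTA leaves the assembler with nothing to assemble. That would explain the empty contig output.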
Old 02-03-2014, 04:29 AM   #7
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,127
Default

I guess this could also be a 454 data set (though why paa6 was not given the *.fna files is a mystery).

@paa6: In any case you should get additional information about this data set from your supervisor. At the least start with the SFF files and see if Geneious can import them.
Old 02-03-2014, 05:14 AM   #8
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

I have only two types of files, .sff and .qual. After reading so much, I now also think that these are 454 sequencing files... I tried to import the .sff into Geneious, but it shows an error message.
Old 02-03-2014, 05:23 AM   #9
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

This is the message I am getting...
Attached Images
File Type: png import failed.png (52.5 KB, 9 views)
Old 02-03-2014, 05:33 AM   #10
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 666
Default

Try getting the sequences from the sff file using sff_extract.


http://bioinf.comav.upv.es/sff_extract/index.html

You can also convert the sff files to a text format, .sff.txt, so that you can look at the first few lines and see whether the data looks like what you expected.

I'm not sure if sff_extract or seq_crumbs will do that, though, or if you need the Roche software.
Old 02-03-2014, 05:33 AM   #11
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Default

Your dataset seems to be incomplete (missing fna, basic stats) .. ask your PI for information about the data and the complete dataset (fna/qual and sff file(s)).

How did you receive your data? USB stick/disk, download?
If it is a big SFF file you might want to check MD5 checksums (in case you have downloaded the data).
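
A minimal Python sketch of the checksum comparison, assuming whoever supplied the file can tell you the MD5 of their copy. The function name `md5_of_file` and the small demo file are illustrative only.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demo on a small temporary file; replace tmp.name with the path to your SFF file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
demo_md5 = md5_of_file(tmp.name)
os.remove(tmp.name)
print(demo_md5)
```

If the digest of your copy differs from the one reported for the original, the file was damaged in transfer and should be copied again before trying any import.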

Concerning the error you get during import into geneious you should contact their support team.
Old 02-03-2014, 05:41 AM   #12
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

@mastal I did try to convert it to .txt, but sadly it is not responding... it does not open... so I will try your link... thanks...
Old 02-03-2014, 05:42 AM   #13
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

@sklages I got my data on a USB stick, which my supervisor gave me...
Old 02-03-2014, 05:46 AM   #14
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

@sklages Can you point me to a research paper stating that you need all these formats to assemble 454 data, so that I can show him proof? He told me that .qual is an Illumina sequencing file and .sff is a 454 sequencing file. So if I try to explain to him that both files come from the same 454 run and that I need additional files too, I will need to show him a research paper...
Old 02-03-2014, 05:58 AM   #15
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Default

Quote:
Originally Posted by paa6 View Post
@mastal I did try to convert it to .txt, but sadly it is not responding... it does not open... so I will try your link... thanks...
Can you try to explain a bit more in detail? What did you do and what did not respond?
Old 02-03-2014, 06:02 AM   #16
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

@sklages I have two types of read files: one in .sff format and the other in .qual format. First I converted the .qual to .fasta and ran the assembly with both Geneious and Velvet, but no contigs were generated.

After discussing my problem in this forum, I came to the conclusion that .qual is not an Illumina sequencer file (as my supervisor had said it was). So now I have tried to import the .sff file into Geneious to perform the assembly, but Geneious shows an error message.
Old 02-03-2014, 06:03 AM   #17
sklages
Senior Member
 
Location: Berlin, DE

Join Date: May 2008
Posts: 628
Default

Quote:
Originally Posted by paa6 View Post
@sklages Can you point me to a research paper stating that you need all these formats to assemble 454 data, so that I can show him proof? He told me that .qual is an Illumina sequencing file and .sff is a 454 sequencing file. So if I try to explain to him that both files come from the same 454 run and that I need additional files too, I will need to show him a research paper...
Can you paste the first few lines of the *qual* file so that we can see what format you have?

What is the output of:

Code:
strings yourFile.sff | grep run_type
strings yourFile.sff | grep instrument_model
Both should help us to determine what kind of files you have.
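
Where the strings command is not available, a rough Python equivalent of the `strings | grep` pipeline above might look like the sketch below. It extracts runs of printable ASCII from the binary data and keeps those containing a keyword. The function name and the demo bytes are invented for illustration, and the demo is not real SFF content.

```python
import re


def find_strings(data, keyword, min_len=4):
    """Return printable-ASCII runs of at least min_len bytes that contain keyword."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    needle = keyword.encode("ascii")
    return [match.group().decode("ascii")
            for match in pattern.finditer(data)
            if needle in match.group()]


# Invented bytes standing in for a binary file with embedded text
demo = b"\x00\x01<run_type>454</run_type>\x00\x02ab\x00"
hits = find_strings(demo, "run_type")
print(hits)

# For a real file (this reads the whole file into memory):
# with open("yourFile.sff", "rb") as handle:
#     print(find_strings(handle.read(), "run_type"))
```

Because this loads the entire file at once, the strings command is still the better choice for a large SFF file.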
Old 02-03-2014, 06:24 AM   #18
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 666
Default

Quote:
Originally Posted by paa6 View Post
@sklages Can you point me to a research paper stating that you need all these formats to assemble 454 data, so that I can show him proof? He told me that .qual is an Illumina sequencing file and .sff is a 454 sequencing file. So if I try to explain to him that both files come from the same 454 run and that I need additional files too, I will need to show him a research paper...
See Lex Nederbragt's blog, starting with this page about sff files:

http://contig.wordpress.com/2010/10/...file/#more-207
Old 02-03-2014, 06:45 AM   #19
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

@sklages

These are the first few lines of the .sff file, opened with the less command on Linux:

.sff^@^@^@^A^@^@^@^@.<E9><E8><A0>^@I<9E><A4>^@^C<AE>;^CH^@^D^C ^ATACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGTACGT

These are the first few lines of the .qual file, also opened with less:

>F0ZETIM04H8105 length=79 xy=3266_2039 region=4 run=R_2009_08_18_04_59_32_
36 35 35 37 37 37 37 37 37 35 35 40 39 39 39 40 40 40 40 40 39 39 39 40 40 39 39 39 37 37 37 38 38 37 31 23 23 21 20 19 27 30 30
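
The header bytes in the .sff dump above can be decoded with a few lines of Python. This is a sketch that assumes the standard SFF common-header layout (big-endian: magic number, version, index offset, index length, number of reads, header length, key length, flows per read, flowgram format code) and that less displayed the bytes in its usual caret and hex notation. The variable names are illustrative.

```python
import struct

# First 31 bytes of the pasted .sff dump, transcribed from less's notation
# (^@ = 0x00, ^A = 0x01, ^C = 0x03, ^D = 0x04, <E9> = 0xE9, and so on).
header = (b".sff" b"\x00\x00\x00\x01"
          b"\x00\x00\x00\x00\x2e\xe9\xe8\xa0"
          b"\x00\x49\x9e\xa4"
          b"\x00\x03\xae\x3b"
          b"\x03\x48" b"\x00\x04" b"\x03\x20" b"\x01")

# Assumed SFF common-header layout, big-endian, 31 bytes in total
FIELDS = ">4s4sQIIHHHB"
(magic, version, index_offset, index_length, n_reads,
 header_len, key_len, n_flows, flow_format) = struct.unpack(FIELDS, header)

print("magic:", magic, "version:", version)
print("reads:", n_reads, "flows per read:", n_flows, "key length:", key_len)
print("header length:", header_len, "flowgram format:", flow_format)
```

Under those assumptions the bytes decode to the ".sff" magic number, version 1, 241,211 reads, 800 flows per read and a 4-base key, followed by the repeating TACG flow order. The header length of 840 bytes also matches 31 + 800 + 4 rounded up to a multiple of 8, so the start of the file looks like an intact Roche 454 SFF header.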
Old 02-03-2014, 06:46 AM   #20
paa6
Member
 
Location: south korea

Join Date: Feb 2014
Posts: 68
Default

Quote:
Originally Posted by mastal View Post
see Lex Nederbragt's blog, starting with this page about sff files:

http://contig.wordpress.com/2010/10/...file/#more-207
Thanks, I will go through this blog...

All times are GMT -8.