Thread | Thread Starter | Forum | Replies | Last Post |
Converting FASTA/qual file pair from 454 to FASTQ | oiiio | Bioinformatics | 9 | 01-01-2016 04:55 PM |
How to convert sra-lite format to fastq? | tbusch0000 | Bioinformatics | 23 | 08-21-2013 09:53 PM |
How convert multiple .sra files into .fastq in one go? | TuA | Bioinformatics | 5 | 05-27-2011 09:32 AM |
Split fastq to fasta and qual file? | ewilbanks | Bioinformatics | 8 | 01-07-2011 03:02 AM |
format problem:convert fastq to seq/qual file | anyone1985 | Bioinformatics | 1 | 04-10-2009 09:27 AM |
#1 |
Member
Location: Alabama Join Date: Jun 2009
Posts: 48
Hi all,
I have downloaded some 454 and Illumina data from the NCBI SRA that is in .fastq format. Example 454 data: Code:
@SRR000072.1 ERBRDQF01EGP9U length=67
TAATGTGCTTTTCTATAGACAGTCCATTTTCAGGGATATTTTCCAAACTGTCTGGACTGTCTATAGA
+SRR000072.1 ERBRDQF01EGP9U length=67
<?:<<<<;>=2"<<<<<<<:<;<;5<??7+<<<:';<<>=3#=7<:(<;<<<<@;;<;<<<:;<<<<
Thank you!
Kevin
#2 |
Member
Location: Umeå, Sweden Join Date: Apr 2009
Posts: 27
For an explanation of the differences between FASTQ variants, take a look at this thread: http://seqanswers.com/forums/showthread.php?t=3271
I would recommend starting with the fq_all2std.pl script in MAQ. Personally, I prefer to use Python for this kind of task. For more info, see this short article: O|B|F News: Working with FASTQ files in Biopython when speed matters
#3 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Any FASTQ file from the NCBI SRA seems to already be in the standard Sanger FASTQ format (even if it originally came from a Solexa/Illumina machine, it has been converted). See:
http://dx.doi.org/10.1093/nar/gkp1137
As Andreas has suggested, you could use Biopython to do FASTQ -> QUAL and FASTQ -> FASTA. These can be done with trivial two-line scripts using Biopython 1.52 or later:
http://biopython.org/DIST/docs/tutor...ual-conversion
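To make concrete what the FASTQ -> FASTA and FASTQ -> QUAL conversions involve, here is a minimal pure-Python sketch. It is not Biopython's implementation. It assumes standard Sanger FASTQ (PHRED scores stored as ASCII characters with an offset of 33) and strict four-line records with no line wrapping. The function names `read_fastq` and `fastq_to_fasta_qual` and the demo record are made up for illustration.

```python
import io

SANGER_OFFSET = 33  # assumption: Sanger FASTQ stores PHRED score q as chr(q + 33)


def read_fastq(handle):
    """Yield (title, sequence, quality_string) from strict 4-line FASTQ records."""
    while True:
        title = handle.readline().rstrip("\n")
        if not title:
            return  # end of file (or a trailing blank line)
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("Not a 4-line FASTQ record: %r" % title)
        if len(seq) != len(qual):
            raise ValueError("Sequence/quality length mismatch in %r" % title)
        yield title[1:], seq, qual


def fastq_to_fasta_qual(fastq_handle, fasta_handle, qual_handle):
    """Write FASTA and QUAL records for every FASTQ record; return the count."""
    count = 0
    for title, seq, qual in read_fastq(fastq_handle):
        # Decode each quality character back to its integer PHRED score.
        scores = [ord(ch) - SANGER_OFFSET for ch in qual]
        fasta_handle.write(">%s\n%s\n" % (title, seq))
        qual_handle.write(">%s\n%s\n" % (title, " ".join(map(str, scores))))
        count += 1
    return count


# Tiny hypothetical record (not taken from any real data set).
example = "@read1 demo\nACGT\n+\n!+5I\n"
fasta_out, qual_out = io.StringIO(), io.StringIO()
n_records = fastq_to_fasta_qual(io.StringIO(example), fasta_out, qual_out)
print(fasta_out.getvalue(), end="")
print(qual_out.getvalue(), end="")
```

For real files you would pass open file handles instead of `io.StringIO` objects. Unlike a full parser, this sketch does not wrap long lines in the output and does not handle multi-line FASTQ records.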
#4 | |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Quote:
http://seqanswers.com/forums/showpos...99&postcount=9
#5 |
Member
Location: Alabama Join Date: Jun 2009
Posts: 48
Thanks all! I will look into the biopython scripts.
#6 |
Member
Location: Alabama Join Date: Jun 2009
Posts: 48
I have BioPython 1.49 and I found the utilities I need but I'm having trouble. I'm using Ubuntu Linux. I started python from a terminal while in the same directory as my desired input fastq file. I tried the tests recommended in the manual but I'm not sure if they are working correctly. import Bio returned nothing (normal, right?) but print Bio.__version__ returned a syntax error (as did print Bio.149, print Bio.1.49, etc). It wasn't clear to me what I should actually type. Is my version too old?
Here's the code I used and messages printed to standard output: Code:
>>> from Bio import SeqIO
>>> SeqIO.convert("output.fasta", "fasta", "Biomphalaria_glabrata_454.fastq", "fastq")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'module' object has no attribute 'convert'
Thanks,
Kevin
#7 |
Peter (Biopython etc)
Location: Dundee, Scotland, UK Join Date: Jul 2009
Posts: 1,543
Yes, Biopython 1.49 is too old. You need at least Biopython 1.51 for FASTQ support, and at least Biopython 1.52 for the Bio.SeqIO.convert function:
http://news.open-bio.org/news/2009/0...vert-function/
Which version of Ubuntu are you using? I'm guessing jaunty from this listing:
http://packages.ubuntu.com/search?ke...thon-biopython
I install Biopython from source on Ubuntu (I currently use Karmic, but used to use Dapper before that, which is really old now). You need to install the build dependencies, for example the python-dev package, which includes header files like Python.h that you are currently missing. As described on http://biopython.org/wiki/Download#Ubuntu_or_Debian, try this first:
sudo apt-get build-dep python-biopython
P.S. Once you have this installed, the Bio.SeqIO.convert function takes the input file and format, then the output file and format. Your attempted example seems to have this the wrong way round.
Last edited by maubp; 01-23-2010 at 06:43 AM.
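The version requirement and argument order described in this post can be sketched roughly as follows. The helper names `parse_version`, `supports_convert`, and `fastq_to_fasta` are hypothetical. `Bio.__version__` and `Bio.SeqIO.convert(in_file, in_format, out_file, out_format)` are real Biopython names, but whether `__version__` is defined in very old releases is an assumption I have not checked, so that lookup is guarded.

```python
import re

MIN_CONVERT_VERSION = (1, 52)  # per the post above: SeqIO.convert needs >= 1.52


def parse_version(version):
    """Turn a version string such as '1.52' into a tuple of ints, e.g. (1, 52)."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError("Unrecognised version string: %r" % version)
    return tuple(int(part) for part in match.groups())


def supports_convert(version):
    """Return True if this Biopython version should provide Bio.SeqIO.convert."""
    return parse_version(version) >= MIN_CONVERT_VERSION


def fastq_to_fasta(fastq_path, fasta_path):
    """Convert FASTQ to FASTA: input file and format first, then output file and format."""
    # Imported here so the helpers above can be used without Biopython installed.
    from Bio import SeqIO
    return SeqIO.convert(fastq_path, "fastq", fasta_path, "fasta")


if __name__ == "__main__":
    try:
        import Bio
    except ImportError:
        print("Biopython is not installed")
    else:
        # Guarded lookup: assumption that old releases may lack __version__.
        version = getattr(Bio, "__version__", None)
        if version is None:
            print("Installed Biopython does not report a version")
        else:
            print("Biopython %s, SeqIO.convert available: %s"
                  % (version, supports_convert(version)))
```

With a new enough Biopython, the call from the earlier post would then become `fastq_to_fasta("Biomphalaria_glabrata_454.fastq", "output.fasta")`.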
#8 |
Member
Location: Behind you. Join Date: Feb 2012
Posts: 12
Here is a script that you can place in your bin/ directory:
Code:
#!/usr/bin/env python
"""
Convert single FASTQ files to FASTA + QUAL file pairs
http://seqanswers.com/forums/showthread.php?t=3730

You can use this script from the shell like this::

    $ ./fastaq_to_fasta reads.fastq reads.fna reads.qual
"""

# The libraries we need #
import sys, os
from Bio import SeqIO

# Get the shell arguments #
fq_path = sys.argv[1]
fa_path = sys.argv[2]
qa_path = sys.argv[3]

# Check that the input FASTQ path is valid #
if not os.path.exists(fq_path):
    raise Exception("No file at %s." % fq_path)

# Do it: input file and format first, then output file and format #
SeqIO.convert(fq_path, "fastq", qa_path, "qual")
SeqIO.convert(fq_path, "fastq", fa_path, "fasta")