SEQanswers

Old 05-15-2009, 01:35 PM   #1
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default File format

Hi,

I have a sequencing file that looks like this:

4_1_932_784 GGACAGTTTTTTCCAATTATGGAACGCCTGTTCCTG
4_1_829_103 GTCACTATCTCAGTCAAAATTTAAGAAAATTGACAT
4_1_450_206 GTGCTATATCCCTATATAACCTACCCATCCACCTTT
4_1_495_275 GTTGTGGGAAATTGGAGCGATAAGCGTGCTTCTTCC

It is different from the standard fastq format. Does anyone know what format this is called?

Old 05-15-2009, 03:40 PM   #2
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358
Default

Check this thread:

http://seqanswers.com/forums/showthread.php?t=418
Old 05-19-2009, 05:10 PM   #3
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default Hey thanks for the reply but..

But I only have the .seq file and not the .prb file.

Does anyone know how to handle only the .seq file?

thanks
Old 05-19-2009, 07:50 PM   #4
ECO
--Site Admin--
 
Location: SF Bay Area, CA, USA

Join Date: Oct 2007
Posts: 1,358
Default

What do you want to do with the .seq file? Convert to fasta? fastq?
Old 05-20-2009, 01:22 PM   #5
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default

Hi sorry,

I want to convert it to a fastq file.
Old 05-20-2009, 03:36 PM   #6
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by kylle345
Hi sorry,

I want to convert it to a fastq file.

From what I have seen, the *seq.txt files do not contain quality values, so you will have to make up dummy ones. For single-end data, you could do something like:

Code:
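# Expects five whitespace-separated columns; $5 is the read sequence.
awk '{printf("@%d:%d:%d:%d\n%s\n+\n", $1, $2, $3, $4, $5);  # header, read, separator
for(i=0;i<length($5);i++) { printf("I"); };                 # one dummy quality per base
printf("\n")}' s_1_0001_seq.txt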
For paired-end data, the two reads are concatenated, so it is a little more complicated with awk, but the above should get you started.
Old 05-20-2009, 04:58 PM   #7
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default so that will help me create a .prb file?

Hey, thanks for the quick replies. So having a .seq file is not enough to make a fastq file, and that awk line helps me create a .prb file from the .seq?

then the combination of .seq and .prb can create a fastq?

thanks
Old 05-20-2009, 05:35 PM   #8
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by kylle345
Hey, thanks for the quick replies. So having a .seq file is not enough to make a fastq file, and that awk line helps me create a .prb file from the .seq?

then the combination of .seq and .prb can create a fastq?

thanks

The .seq file does not store qualities, so the dummy qualities will not have any meaning. The above awk command outputs FASTQ directly, so you do not need to worry about .seq and .prb files.

If you have .qseq files (or .seq and .prb which you seem to be missing), then you can make a meaningful fastq file.
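For readers who do have a .prb file, the following Python sketch shows roughly how its scores could become a FASTQ quality string. The thread does not describe the .prb layout, so everything here is an assumption: each line is taken to hold four integer Solexa-scaled scores (A, C, G, T) per base, the largest of the four is taken as the score of the called base, and the helper names `solexa_to_phred` and `prb_line_to_quality` are hypothetical.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to the Phred scale."""
    return 10 * math.log10(10 ** (q_solexa / 10) + 1)


def prb_line_to_quality(prb_line, offset=33):
    """Build a FASTQ quality string from one line of a prb-style file.

    Assumption: four integer scores per base, in A, C, G, T order, and the
    highest of the four belongs to the base that was called.
    """
    values = [int(v) for v in prb_line.split()]
    if len(values) % 4 != 0:
        raise ValueError("expected four scores per base")
    quals = []
    for i in range(0, len(values), 4):
        best = max(values[i:i + 4])            # score of the called base
        phred = round(solexa_to_phred(best))   # rescale to Phred
        quals.append(chr(phred + offset))      # ASCII-encode with the offset
    return "".join(quals)
```

The resulting string would replace the run of `I` characters on the fourth line of each record. An offset of 33 gives Sanger-style FASTQ; check the layout of your own files before relying on this.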
Old 07-18-2009, 12:07 PM   #9
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default Hi I tried the awk line but it does not place the sequences in the new file.

I tried awk '{printf("@%d:%d:%d:%d\n%s\n+\n", $1, $2, $3, $4, $5); for(i=0 file1.txt > file2.txt

the output file only contains

@1:0:0:0

+

@1:0:0:0

+

@1:0:0:0


It's missing the sequence in between the lines.... is there something missing?
Old 07-18-2009, 12:20 PM   #10
nilshomer
Nils Homer
 
Location: Boston, MA, USA

Join Date: Nov 2008
Posts: 1,285
Default

Quote:
Originally Posted by kylle345
I tried awk '{printf("@%d:%d:%d:%d\n%s\n+\n", $1, $2, $3, $4, $5); for(i=0 file1.txt > file2.txt

the output file only contains

@1:0:0:0

+

@1:0:0:0

+

@1:0:0:0


It's missing the sequence in between the lines.... is there something missing?

I must admit I am an author of the alignment program BFAST (free for academic use), which does have a "qseq2fastq.pl" Perl script. It may be easier to rely on such a script.
Old 07-18-2009, 12:35 PM   #11
kylle345
Member
 
Location: Toronto

Join Date: Apr 2009
Posts: 10
Default thanks

Hey,

I will check it out

Kyle
All times are GMT -8.