**nilshomer** · 08-20-2009, 12:51 PM

Originally posted by samt:

Is there a way to convert a fastq file back to the original csfasta and qual files?

Here's my overly complicated Perl script. Note that I assume the FASTQ qualities are in Sanger format and that the sequence is in color space (i.e., adaptor + color calls).

Code:

#!/usr/bin/perl

use strict;
use warnings;

my $csfastq = shift;
die unless defined($csfastq);
my $csfasta = $csfastq; $csfasta =~ s/csfastq$/csfasta/;
die unless !($csfastq eq $csfasta);
my $qual = $csfastq; $qual =~ s/\.csfastq$/_QV.qual/;
die unless !($csfastq eq $qual);

open(FHcsfastq, "$csfastq") || die;
open(FHcsfasta, ">$csfasta") || die;
open(FHqual, ">$qual") || die;
my $state = 0;
my ($n, $r, $q) = ("", "", "");
while(defined(my $line = <FHcsfastq>)) {
    chomp($line);
    if(0 == $state) {
        &print_out(\*FHcsfasta, \*FHqual, $n, $r, $q);
        $n = $line;
        $n =~ s/^\@/>/;
    }
    elsif(1 == $state) {
        $r = $line;
    }
    elsif(3 == $state) {
        $q = $line;
        # convert back from SANGER phred
        my $tmp_q = "";
        for(my $i=0;$i<length($q);$i++) {
            my $Q = ord(substr($q, $i, 1)) - 33;
            die unless (0 < $Q);
            if(0 < $i) {
                $tmp_q .= " ";
            }
            $tmp_q .= "$Q";
        }
        $q = $tmp_q;
    }
    $state = ($state+1)%4;
}
&print_out(\*FHcsfasta, \*FHqual, $n, $r, $q);
close(FHcsfasta);
close(FHcsfastq);
close(FHqual);

sub print_out {
    my ($FHcsfasta, $FHqual, $n, $r, $q) = @_;

    if(0 < length($n)) {
        print $FHcsfasta "$n\n$r\n";
        print $FHqual "$n\n$q\n";
    }
}

**samt** · 08-20-2009, 01:05 PM

Thanks Nils, what are the arguments this script takes?

**nilshomer** · 08-20-2009, 01:27 PM

Originally posted by samt:

Thanks Nils, what are the arguments this script takes?

Forgot to mention that. It takes in the *fastq filename as input. It automatically creates the output *csfasta and *_QV.qual files.

Nils

**samt** · 08-20-2009, 01:30 PM

I thought so from reading the code, but I executed it and got the error:

Died at fastqtocs.pl line 9.

ran the command:
perl fastqtocs.pl SRR015251.fastq

**nilshomer** · 08-20-2009, 01:31 PM

Originally posted by samt:

I thought so from reading the code, but I executed it and got the error:

Died at fastqtocs.pl line 9.

ran the command:
perl fastqtocs.pl SRR015251.fastq

Rename your file to *csfastq.

Nils
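For anyone wondering why the rename is needed: the script derives its output names by substituting on a trailing "csfastq", and it dies at line 9 when the substitution leaves the name unchanged. A minimal sketch of that check follows; `output_names` is a hypothetical helper written for illustration, not part of the posted script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the posted script): derive both output
# names from the input name, or return an empty list when the input
# does not end in ".csfastq".
sub output_names {
    my ($in) = @_;
    (my $csfasta = $in) =~ s/csfastq$/csfasta/;
    (my $qual    = $in) =~ s/\.csfastq$/_QV.qual/;
    # An unchanged name means the suffix did not match
    return () if $csfasta eq $in || $qual eq $in;
    return ($csfasta, $qual);
}

for my $name ("SRR015251.fastq", "SRR015251.csfastq") {
    my @out = output_names($name);
    print "$name: ", (@out ? "@out" : "no match, the script would die"), "\n";
}
```

With the .csfastq name, this yields SRR015251.csfasta and SRR015251_QV.qual.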

**samt** · 08-20-2009, 01:42 PM

Sorry to keep asking, I do appreciate your help... it crashed at:
Died at fastqtocs.pl line 34, <FHcsfastq> line 4.
for(my $i=0;$i<length($q);$i++) {
my $Q = ord(substr($q, $i, 1)) - 33;
--> die unless (0 < $Q);

From another post I read, is this a problem of negative qualities?

**nilshomer** · 08-20-2009, 02:03 PM

Originally posted by samt:

Sorry to keep asking, I do appreciate your help... it crashed at:
Died at fastqtocs.pl line 34, <FHcsfastq> line 4.
for(my $i=0;$i<length($q);$i++) {
my $Q = ord(substr($q, $i, 1)) - 33;
--> die unless (0 < $Q);

From another post I read, is this a problem of negative qualities?

It is, you could just replace the "die unless (0 < $Q);" with "if($Q < 0) { $Q = -1; }"...

I don't allow negative qualities, though I guess they could be "missing".
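A side note on that check: with Sanger encoding (offset 33), the character "!" decodes to a quality of exactly 0, so the test "0 < $Q" also stops on zero qualities, not only on negative ones. A rough sketch of the decoding step with the suggested replacement applied is below; `sanger_to_qual` is a hypothetical name, and mapping anything below "!" to -1 is an assumption rather than something the .qual format requires.

```perl
use strict;
use warnings;

# Hypothetical helper: decode a Sanger (offset 33) quality string into
# the space-separated integers used in a .qual file.
sub sanger_to_qual {
    my ($q) = @_;
    my @vals;
    for my $c (split //, $q) {
        my $Q = ord($c) - 33;
        # Assumption: a character below '!' is reported as missing (-1)
        $Q = -1 if $Q < 0;
        push @vals, $Q;
    }
    return join(" ", @vals);
}

print sanger_to_qual("!5+"), "\n";   # prints "0 20 10"
```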

**ACTGangster** · 09-11-2009, 11:59 AM

what about doing it backwards?

Is there any way to go from .qual and .csfasta to .fastq? I want to use my SOLiD data in NGS-Cell. .csfasta to .fasta is acceptable as well.

**nilshomer** · 09-11-2009, 05:35 PM

BFAST, BWA, and MAQ all have solid2fastq scripts/programs.

Nils
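To give an idea of what the reverse conversion involves, here is a minimal sketch for a single record. It is not taken from BFAST, BWA, or MAQ; `to_fastq_record` is a hypothetical helper, and it assumes every quality value is between 0 and 93 so that it maps to a printable Sanger character.

```perl
use strict;
use warnings;

# Hypothetical helper: build one FASTQ record from a csfasta entry
# (name line and color-space read) and its matching .qual values.
sub to_fastq_record {
    my ($name, $colors, $quals) = @_;
    (my $id = $name) =~ s/^>/\@/;
    # Sanger encoding: one character per value, chr(Q + 33).
    # Assumption: all values are in 0..93 (no -1 for missing calls).
    my $qstr = join "", map { chr($_ + 33) } split ' ', $quals;
    return "$id\n$colors\n+\n$qstr\n";
}

print to_fastq_record(">read1_F3", "T0123", "20 16 11 9");
```

The existing solid2fastq programs are the safer choice for real data, since they also deal with read naming and paired files.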

**ACTGangster** · 09-12-2009, 04:05 PM

phew

I thought I was going to have to write one myself.

Thanks,
Austin.

**juan** · 10-28-2009, 09:00 AM

A question about the BFAST solid2fastq script:
The SOLiD reads I have use "." instead of "4" for "N" basecalls. These bases have a qual score of -1.
After running the script on my reads, all the "." remain unchanged instead of becoming "4", and the "-1" values were converted to " (ASCII 34). Should I manually convert the "." in the sequences to 4, and convert the " qualities to ! (ASCII 33, quality 0)?

Example:
@226_3_65
T...11..3.2..1020.2.13.0.1....0...332..322.1233..1 3
+
"""51"","*""4405",")'"2")"""")"""'5$""0),"2(*5 ""%+

came from
>226_3_65_F3
T...11..3.2..1020.2.13.0.1....0...332..322.1233..1 3
and
>226_3_65_F3
-1 -1 -1 20 16 -1 -1 11 -1 9 -1 -1 19 19 15 20 -1 11 -1 8 6 -1 17 -1 8 -1 -1 -1 -1 8 -1 -1 -1 6 20 3 -1 -1 15 8 11 -1 17 7 9 20 -1 -1 4 10

**juan** · 10-28-2009, 09:02 AM

should be?
@226_3_65
T44411443424410204241340414444044433244322412334413
+
!!!51!!,!*!!4405!,!)'!2!)!!!!)!!!'5$!!0),!2(*5!!%+
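The cleanup described above could be sketched as follows, starting from the original csfasta and qual lines. Both helper names are hypothetical, and whether a given aligner actually wants "4" and quality 0 for missing calls is an assumption to check against that tool.

```perl
use strict;
use warnings;

# Hypothetical helper: replace "." color calls with "4".
sub fix_colors {
    my ($seq) = @_;
    $seq =~ tr/./4/;
    return $seq;
}

# Hypothetical helper: Sanger-encode .qual values, clamping negative
# values (missing calls) to quality 0, which encodes as "!".
sub fix_quals {
    my ($quals) = @_;
    return join "", map { chr(($_ < 0 ? 0 : $_) + 33) } split ' ', $quals;
}

print fix_colors("T...11..3"), "\n";       # prints "T44411443"
print fix_quals("-1 -1 -1 20 16"), "\n";   # prints "!!!51"
```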

**nilshomer** · 10-28-2009, 09:59 AM

Originally posted by juan:

T4441144342441020424134041444404443324432241233441 3

Note the space between the last and second last color (a no-no).
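A quick way to catch stray characters like that before running an aligner is a pattern check on each read. This is only a sketch: `is_valid_colorspace` is a hypothetical helper, and the accepted alphabet (one adaptor base, then calls 0 to 3 plus "4" or "." for a missing call) is an assumption.

```perl
use strict;
use warnings;

# Hypothetical check: one adaptor base followed only by color calls.
# Assumption: valid calls are 0-3, plus "4" or "." for a missing call.
sub is_valid_colorspace {
    my ($seq) = @_;
    return $seq =~ /^[ACGT][0-4.]+\z/ ? 1 : 0;
}

for my $seq ("T0123", "T01 23") {
    print "'$seq': ", (is_valid_colorspace($seq) ? "ok" : "invalid"), "\n";
}
```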

**juan** · 10-29-2009, 08:13 AM

That's strange, the space between the 1 and the 3 at the end of the line is a bug in the FORUM code! When I tried to remove the space by clicking "edit", the space does not appear. It pops up during the posting. Look below for example:

4441144342441020424134041444404443324432241233441222222


fastq to csfasta and .qual
