SEQanswers

Old 04-13-2011, 10:17 AM   #1
lfaino
Junior Member
 
Location: ITA

Join Date: Mar 2011
Posts: 9
split a fastq file

Dear All,
I need to split a fastq file into two files, with one read of each pair in one file and its mate in the other. It looks like CLC needs two files to select paired-end reads. The file currently looks like this:

>ILLUMINA-52179E_0050_FC70G0HAAXX:6:1:2997:934#GCCAAT/1
TACCACCCAGGCCCCGTCTATCTATATCATCACTCGATTTATTATCCTCTAGTAATCCTCCCGAAATCCCTGAA
>ILLUMINA-52179E_0050_FC70G0HAAXX:6:1:2997:934#GCCAAT/1
quality line
>ILLUMINA-52179E_0050_FC70G0HAAXX:6:1:2997:934#GCCAAT/2
TCCTGAGTCAATTGCAGAGCAGTTTCATTTCTATGAGCATGATTCTTCGGCATAAAAGTCGAGCATGAACTATGT
>ILLUMINA-52179E_0050_FC70G0HAAXX:6:1:2997:934#GCCAAT/2
quality line

thanks in advance
Luigi
Old 04-13-2011, 05:22 PM   #2
Kennels
Senior Member
 
Location: Sydney

Join Date: Feb 2011
Posts: 149

Galaxy has a tool which does this.
Visit: http://main.g2.bx.psu.edu/

In the left-hand pane, go to:
NGS: QC and manipulation
Fastq splitter

and follow instructions.
Old 04-14-2011, 07:46 AM   #3
lfaino
Junior Member
 
Location: ITA

Join Date: Mar 2011
Posts: 9

Hi Kennels,
That is not what I want. I already have the sequences divided into forward and reverse. I need to divide them into one file containing the /1 reads and another containing the /2 reads. That's it.

Luigi
Old 04-14-2011, 09:42 AM   #4
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

What's wrong with grep? Or some scripting language?

While you are at it, use sed to get rid of the text following the +, since it's just pointlessly making your file larger.
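
For anyone who prefers a script to grep and sed, here is a minimal Python sketch of that idea. It assumes the file holds strict 4-line records with no blank lines, and that every header line ends in /1 or /2. The function names `read_records` and `split_by_mate` are made up for illustration.

```python
from itertools import islice


def read_records(lines):
    """Yield (header, sequence, separator, quality) from 4-line records.

    Assumption: no blank lines and no wrapped sequences in the input.
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\r\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated record: expected 4 lines")
        yield tuple(record)


def split_by_mate(lines, out1, out2):
    """Write /1 records to out1 and /2 records to out2; return the two counts."""
    n1 = n2 = 0
    for header, seq, sep, qual in read_records(lines):
        # Keep only the first character of the third line, which drops the
        # repeated read name that just makes the file larger.
        text = "\n".join((header, seq, sep[:1], qual)) + "\n"
        if header.endswith("/1"):
            out1.write(text)
            n1 += 1
        elif header.endswith("/2"):
            out2.write(text)
            n2 += 1
        else:
            raise ValueError("header has no /1 or /2 suffix: " + header)
    return n1, n2
```

It could be called with three open file objects, for example an input opened for reading and two outputs opened with mode "w". Records are written as they are read, so the whole file never has to fit in memory.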
Old 04-14-2011, 04:28 PM   #5
Kennels
Senior Member
 
Location: Sydney

Join Date: Feb 2011
Posts: 149

Hi Luigi,

Unless I'm misunderstanding you, you want to split a paired-end fastq file into two files, one containing read 1 and the other read 2, correct?

Well, that's the tool I described in Galaxy:

From the page:
*****************************
What it does

Splits a single fastq dataset representing paired-end run into two datasets (one for each end). This tool works only for datasets where both ends have the same length.

Sequence identifiers will have /1 or /2 appended for the split left-hand and right-hand reads, respectively.
****************************************
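
One reading of that description is that each input record holds both ends joined into a single sequence, which is then cut in half. The Python sketch below illustrates that reading only; it is not Galaxy's code, and the function name `split_joined_record` is hypothetical.

```python
def split_joined_record(header, seq, qual):
    """Cut one joined record into two equal halves labelled /1 and /2.

    Assumption: both ends have the same length, so the joined sequence
    has an even length and the midpoint is the boundary between ends.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if len(seq) % 2 != 0:
        raise ValueError("odd length: ends cannot have the same length")
    half = len(seq) // 2
    left = (header + "/1", seq[:half], "+", qual[:half])
    right = (header + "/2", seq[half:], "+", qual[half:])
    return left, right
```

Under this reading the tool expects one record per pair, which is a different input layout from a file where the /1 and /2 reads are already separate records.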