SEQanswers

04-14-2013, 10:39 PM   #1
JahnDavik
Junior Member
 
Location: Norway

Join Date: Aug 2012
Posts: 8
splitting big paired fastq files

Hi there,
I do my 'bioinformatic' work in CLC. I now have many (30) large files of paired-end reads (~10 GB per direction), and my computer stalls if I try to use them all in a de novo assembly. Hence, I am looking for a tool to split the files into, say, 4 parts.
I am afraid I am not familiar with the Linux world, so I am looking for scripts (preferably R, or Perl) that would solve this.

Thank you.
jd
04-14-2013, 10:59 PM   #2
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

If you split your fastq, you aren't going to get a good assembly. You really want a computer with more memory, so it can handle the whole fastq.

If you really need to split it, use unix built-in programs.

Code:
split -l 40000000 myfastq.fq
This should split it into separate files, each with 40,000,000 lines (a fastq read takes 4 lines, so 10 million reads per file).
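
Since the reads are paired, both files need to be cut at the same line count, otherwise the mates end up in different chunks. A minimal sketch of that, assuming each record takes exactly four lines and both files list the mates in the same order; `split_pair` and the file names in the example are hypothetical:

```shell
#!/bin/sh
# split_pair R1 R2 LINES
# Splits both mate files with the same line count, so that chunk "aa" of R1
# holds the mates of chunk "aa" of R2, chunk "ab" pairs with "ab", and so on.
split_pair() {
    r1=$1
    r2=$2
    lines=$3

    # A fastq record is 4 lines; any other chunk size would cut a read apart.
    if [ $((lines % 4)) -ne 0 ]; then
        echo "line count must be a multiple of 4" >&2
        return 1
    fi

    # The last argument to split is the output prefix; split appends aa, ab, ...
    split -l "$lines" "$r1" "${r1}.part_" &&
    split -l "$lines" "$r2" "${r2}.part_"
}

# Example with hypothetical file names (40,000,000 lines = 10 million reads):
# split_pair reads_R1.fq reads_R2.fq 40000000
```

The chunks can then be imported pairwise, e.g. `reads_R1.fq.part_aa` together with `reads_R2.fq.part_aa`.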
04-14-2013, 11:16 PM   #3
JahnDavik
Junior Member
 
Location: Norway

Join Date: Aug 2012
Posts: 8

Thank you for your prompt reply!
There are 150-200 million reads in each of the paired fastq files, and I just expected that to be quite redundant.
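
For reference, the line count to give `split` follows from these numbers: at four lines per read, 150-200 million reads is 600-800 million lines per file, so four parts come to roughly 150-200 million lines each. A small sketch of that arithmetic; `lines_per_chunk` is a hypothetical helper, and the total line count of a real file can be obtained with `wc -l`:

```shell
# lines_per_chunk TOTAL_LINES N_CHUNKS
# Prints a chunk size in lines that is a multiple of 4 (whole reads only)
# and that yields at most N_CHUNKS output files.
lines_per_chunk() {
    total_lines=$1
    n_chunks=$2
    reads=$((total_lines / 4))
    # Round up so the last chunk absorbs the remainder instead of adding a file.
    reads_per_chunk=$(( (reads + n_chunks - 1) / n_chunks ))
    echo $((reads_per_chunk * 4))
}

# 200 million reads = 800 million lines, split 4 ways
lines_per_chunk 800000000 4    # prints 200000000
```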

Tags
big file, clc bio, fastq, splitting

All times are GMT -8.