SEQanswers


cement_head 08-07-2017 06:26 AM

How to "chop" a large FASTQ PE file in half?
 
Hello,

I hesitate to use the following terms: "split", "partition", "trim" - because they all have special connotations.

What I'd like to do is take a large FASTQ file of PE reads and cut it into two approximately equal halves. However, I want to do it in such a way that neither output file ends up with only one mate of a PE read pair. In other words, I want to ensure that when the file is halved, no PE read pairs are separated.

For example, if I have a file with 11 PE reads, I do NOT want 5 reads in one file and 6 reads in another.

Will FASTQ Splitter work for what I want?

- Andor

GenoMax 08-07-2017 06:49 AM

Quote:

Originally Posted by cement_head (Post 209937)
Hello,

For example, if I have a file with 11 PE reads, I do NOT want 5 reads in one file and 6 reads in another.

- Andor

How would you split a PE dataset otherwise? I assume your PE reads are in two separate files (R1/R2) and the split would have to split both files?

cement_head 08-07-2017 08:29 AM

Quote:

Originally Posted by GenoMax (Post 209938)
How would you split a PE dataset otherwise? I assume your PE reads are in two separate files (R1/R2) and the split would have to split both files?

No, they are in ONE file, PE reads, RAW FASTQ.

GenoMax 08-07-2017 09:11 AM

So your paired-end reads are interleaved? Why not use "split -n 2" to divide the original into two parts? If you have an odd number of fastq records, then using an explicit "split -l ((n+1)/2*4)" may be better (n = number of fastq records).
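
A note on the suggestion above: GNU "split -n 2" divides the input by byte count, so the cut can land in the middle of a record. Splitting by line count avoids that, but for an interleaved file the count for the first chunk has to be a multiple of 8 (two 4-line records per pair) or the last pair in that chunk is separated. For the 11-pair example in this thread, that means 48 lines (6 pairs) in one file and 40 lines (5 pairs) in the other. Below is a minimal Python sketch of that idea. It assumes an uncompressed, strictly interleaved file with unwrapped 4-line records; the function name and file names are hypothetical and not from any existing tool.

```python
import itertools

# Assumption: each record is exactly 4 lines and mates are adjacent (R1 then R2).
LINES_PER_PAIR = 8


def split_interleaved_fastq(in_path, out_path_a, out_path_b):
    """Split an interleaved paired-end FASTQ into two files, keeping mates together.

    Returns (pairs_in_first_file, pairs_in_second_file).
    """
    # First pass: count lines to find out how many pairs the file holds.
    with open(in_path) as handle:
        total_lines = sum(1 for _ in handle)

    if total_lines % LINES_PER_PAIR != 0:
        raise ValueError(
            f"{total_lines} lines is not a multiple of {LINES_PER_PAIR}; "
            "the file may not be interleaved 4-line FASTQ"
        )

    total_pairs = total_lines // LINES_PER_PAIR
    # The first file receives the extra pair when the number of pairs is odd.
    pairs_a = (total_pairs + 1) // 2

    # Second pass: stream the first pairs_a pairs to file A, the rest to file B.
    with open(in_path) as src, \
            open(out_path_a, "w") as out_a, \
            open(out_path_b, "w") as out_b:
        out_a.writelines(itertools.islice(src, pairs_a * LINES_PER_PAIR))
        out_b.writelines(src)

    return pairs_a, total_pairs - pairs_a
```

Usage would look like split_interleaved_fastq("reads.fastq", "reads.part1.fastq", "reads.part2.fastq"). The file is read twice so that nothing has to be held in memory, which matters for large inputs.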


All times are GMT -8.