**Richard Finney** · 08-15-2013, 08:01 AM

Googling/DuckDucking might have turned up the answer you are looking for.

Regardless, check this thread: http://seqanswers.com/forums/showthread.php?t=16505

Are your reads paired?

**dpryan** · 08-15-2013, 08:09 AM

There are a few ways, some mentioned on this site and some over on biostars. One of those ought to work for you.

**angerusso** · 08-15-2013, 08:14 AM

Yes, my data is paired-end. Another complication is that the two files of the pair (R1 and R2) are of unequal size.

du -s command gives:
*_R1* = 64850642
*_R2* = 48640554

**dpryan** · 08-15-2013, 08:17 AM

You're going to want to resync them before you do anything else. Google "paired-end fastq sync" for a plethora of solutions.
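
For readers who want to see what "resyncing" amounts to, here is a minimal sketch rather than any particular published tool. It assumes well-formed four-line FASTQ records and read IDs that are identical between mates once the Casava 1.8 comment (the part after the first space) or an older `/1` or `/2` suffix is removed. The function names `read_fastq`, `read_key`, and `sync_pairs` are made up for this illustration.

```python
import io

def read_fastq(handle):
    """Yield (header, seq, plus, qual) tuples from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline()
        if not header.strip():
            return  # end of file (or a trailing blank line)
        seq = handle.readline()
        plus = handle.readline()
        qual = handle.readline()
        yield tuple(s.rstrip("\n") for s in (header, seq, plus, qual))

def read_key(header):
    """ID shared by both mates: drop the comment and any /1 or /2 suffix."""
    key = header[1:].split()[0]
    if key.endswith(("/1", "/2")):
        key = key[:-2]
    return key

def sync_pairs(r1_handle, r2_handle, out1, out2):
    """Write only reads whose mate exists, in R1 order. Returns pairs kept."""
    # Assumption: R2 fits in memory. Fine for a sketch, not for huge files.
    r2_records = {read_key(rec[0]): rec for rec in read_fastq(r2_handle)}
    kept = 0
    for rec in read_fastq(r1_handle):
        mate = r2_records.get(read_key(rec[0]))
        if mate is None:
            continue  # orphan read in R1, skip it
        out1.write("\n".join(rec) + "\n")
        out2.write("\n".join(mate) + "\n")
        kept += 1
    return kept

# Tiny in-memory example: read "a" has lost its mate, read "b" has not.
r1 = io.StringIO("@a 1:N:0:1\nACGT\n+\nIIII\n@b 1:N:0:1\nGGCC\n+\nIIII\n")
r2 = io.StringIO("@b 2:N:0:1\nTTAA\n+\nIIII\n")
out1, out2 = io.StringIO(), io.StringIO()
kept = sync_pairs(r1, r2, out1, out2)
print(kept)
print(out1.getvalue(), end="")
```

With real data you would pass open file handles instead of `io.StringIO` objects. For files with tens of millions of reads, a streaming tool that does not hold one file in memory is the more practical choice.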

**angerusso** · 08-15-2013, 11:02 AM

So I ran the following perl script:

https://bitbucket.org/jesseerdmann/fairview-galaxy-dist/raw/8e6e7a346284c7c261c951998b0138e82f540f43/tools/msi/pe-sync-2-files.pl

and it says "PASSED full check" with the quick check enabled, which means the two files are in sync.

"QUICK CHECK enabled
Casava 1.8 read id style
PASSED full check"

But before I randomly select reads from the two files, following your Google links, shouldn't I make them equal in size? As R1 is bigger than R2, even though they are in sync, I assume they are in sync only for the reads common to both files. Am I right?

**dpryan** · 08-15-2013, 11:10 AM

I've never seen that perl script, so I can't say that it works correctly. If you follow the instructions from this thread on biostars (Pierre's comment first, followed by Steffi's), you'll get two synchronized files of the same size.
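
Once the files are in sync, drawing the same random reads from both comes down to choosing record positions rather than reads. The following is not the biostars recipe referred to above, only a minimal sketch of that idea. It assumes four-line records, files already in the same order, and a known pair count (for example the `wc -l` result divided by 4). The names `records` and `subsample_pairs` are hypothetical.

```python
import io
import random
from itertools import islice

def records(handle):
    """Yield FASTQ records as lists of 4 raw lines."""
    while True:
        rec = list(islice(handle, 4))
        if not rec:
            return
        yield rec

def subsample_pairs(r1, r2, total_pairs, k, out1, out2, seed=0):
    """Write the same k randomly chosen record positions from both files."""
    rng = random.Random(seed)
    # Holds k integers in memory, not k reads.
    chosen = set(rng.sample(range(total_pairs), k))
    for i, (rec1, rec2) in enumerate(zip(records(r1), records(r2))):
        if i in chosen:
            out1.writelines(rec1)
            out2.writelines(rec2)

# Tiny in-memory example: pick 3 pairs out of 10.
pairs = 10
r1 = io.StringIO("".join(f"@read{i} 1:N:0:1\nACGT\n+\nIIII\n" for i in range(pairs)))
r2 = io.StringIO("".join(f"@read{i} 2:N:0:1\nTGCA\n+\nIIII\n" for i in range(pairs)))
out1, out2 = io.StringIO(), io.StringIO()
subsample_pairs(r1, r2, pairs, 3, out1, out2, seed=42)

ids1 = [h.split()[0] for h in out1.getvalue().splitlines()[0::4]]
ids2 = [h.split()[0] for h in out2.getvalue().splitlines()[0::4]]
print(ids1 == ids2, len(ids1))
```

Because both files are read in lockstep and filtered by the same set of positions, the output files stay in sync and have the same number of reads.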

**dpryan** · 08-15-2013, 11:13 AM

I'll add that this sort of issue, a different number of reads in the two paired-end files, usually only crops up when the mates of a pair are trimmed separately. If that's the case here and you're the one who did the trimming, your life will be easier if you use a different trimmer next time (trimmomatic and trim_galore are common choices).

**angerusso** · 08-15-2013, 11:24 AM

Ignore my previous message. My files are the same size when I use the "du -b" command.

**dpryan** · 08-15-2013, 11:26 AM

As long as they also give the same line count with "wc -l", things are OK.
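
A slightly stricter check than comparing line counts is to walk both files in lockstep and compare read IDs, which also catches the case raised earlier where one file simply has extra reads at the end. This is a minimal sketch, assuming four-line records and Casava 1.8 style IDs (mates share the text before the first space). The names `headers` and `first_mismatch` are made up for this example.

```python
import io
from itertools import zip_longest

def headers(handle):
    """Yield the read ID from every 4th line (the @header) of a FASTQ stream."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield line.split()[0]  # ID without the Casava 1.8 comment

def first_mismatch(r1, r2):
    """Index of the first pair whose IDs differ or whose mate is missing, else None."""
    # zip_longest pads the shorter file with None, so extra reads are reported.
    for i, (h1, h2) in enumerate(zip_longest(headers(r1), headers(r2))):
        if h1 != h2:
            return i
    return None

r1_text = "@a 1:N:0:1\nAC\n+\nII\n@b 1:N:0:1\nGT\n+\nII\n"
r2_text = "@a 2:N:0:1\nTT\n+\nII\n@b 2:N:0:1\nCC\n+\nII\n"
r2_short = "@a 2:N:0:1\nTT\n+\nII\n"

result_ok = first_mismatch(io.StringIO(r1_text), io.StringIO(r2_text))
result_short = first_mismatch(io.StringIO(r1_text), io.StringIO(r2_short))
print(result_ok, result_short)
```

A result of `None` means the two files have the same number of records with matching IDs in the same order.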

how to randomly select 20m reads out of a FASTQ file
