12-16-2009, 06:15 AM   #3
Peter (Biopython etc)
Location: Dundee, Scotland, UK

Join Date: Jul 2009
Posts: 1,543

If you like Python, you could try something like this (untested):

Code:
#This Python script requires Biopython 1.51 or later
from Bio import SeqIO
import itertools

#Setup variables (could parse command line args instead)
file_f = "s_1_1_sequence.txt"
file_r = "s_1_2_sequence.txt"
file_out = "interleaved.fastq"

def interleave(iter1, iter2) :
    for (forward, reverse) in itertools.izip(iter1,iter2):
        #Remove the /1 and /2 from the identifiers, then check the pair matches
        forward.id = forward.id[:-2]
        reverse.id = reverse.id[:-2]
        assert forward.id == reverse.id
        yield forward
        yield reverse

records_f = SeqIO.parse(open(file_f,"rU"), "fastq-illumina")
records_r = SeqIO.parse(open(file_r,"rU"), "fastq-illumina")

handle = open(file_out, "w")
count = SeqIO.write(interleave(records_f, records_r), handle, "fastq-sanger")
#Close the handle so all the output is flushed to disk
handle.close()
print "%i records written to %s" % (count, file_out)
Based on the Biopython example here:
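
Separately from that example, here is a rough Python 3 sketch of just the pairing logic, for anyone without `itertools.izip` or the `print` statement. It is not the Biopython code: the records are plain `(identifier, sequence)` tuples I made up for illustration, and `strip_pair_suffix` is a hypothetical helper. Unlike the script above it leaves the identifiers untouched and only compares them, and it raises an error if one input runs out before the other.

```python
from itertools import zip_longest

def strip_pair_suffix(read_id):
    """Drop a trailing /1 or /2 from a read identifier, if present."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id

def interleave(iter1, iter2):
    """Yield records alternately from two iterators of (id, sequence) tuples."""
    missing = object()  # sentinel marking an exhausted input
    for forward, reverse in zip_longest(iter1, iter2, fillvalue=missing):
        if forward is missing or reverse is missing:
            raise ValueError("inputs have different numbers of records")
        if strip_pair_suffix(forward[0]) != strip_pair_suffix(reverse[0]):
            raise ValueError("mismatched pair: %s vs %s" % (forward[0], reverse[0]))
        yield forward
        yield reverse

# Toy records standing in for parsed FASTQ entries (assumed data)
fwd = [("read1/1", "ACGT"), ("read2/1", "GGCC")]
rev = [("read1/2", "TTAA"), ("read2/2", "CCGG")]
result = list(interleave(fwd, rev))
```

Using `zip_longest` with a sentinel rather than plain `zip` means a truncated file is reported instead of silently dropping the unpaired reads.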

Note - I'm assuming you have Illumina 1.3+ FASTQ files, not Solexa style FASTQ files. Search the forum for details on the difference.

Last edited by maubp; 12-16-2009 at 06:18 AM. Reason: Adding link