I am currently writing a program in Python (v2.7.6) and am using HTSeq to read in FASTQ files and then run my code on them. However, my issue is that my code runs only on the first read from each input file (paired-end data, so two input files) and then stops.
Does anyone know how to run the code on read 1, then move on to read 2, and so on?
The code is:
# Import functions
import HTSeq
import itertools
import numpy
from matplotlib import pyplot
from itertools import groupby
import operator

#Open files
output_file= open('output.fq', "w")
Unmatched1= open ('Unmatched1.fq', "w")
Unmatched2= open ('Unmatched2.fq', "w")

# Counts lines in input file as 'count' for future use, if needed
print "%d lines in your choosen file" % len(open("Real_test_1").readlines())
count = '%d' % len(open("Real_test_1").readlines())
print "Reads below"
print count

# Reads in files for use by HTSeq
fastq_file1 = HTSeq.FastqReader( "Real_test_1", "phred")
fastq_file2 = HTSeq.FastqReader( "Real_test_2", "phred")

#Get rev_comp and rename seq1
for read in fastq_file2:
    rc2o = read
for read in fastq_file2:
    rc2=read.get_reverse_complement()
    print rc2
for read in fastq_file1:
    rc1o = read
for read in fastq_file1:
    rc1 = read
    print rc1

#Reverse seq2 again so can be matched
rc2w = rc2[::-1]
rc1u = rc1
while len(rc1u) > 20:
    slide_merge(rc1u, rc2w)
    rc1u = rc1u[1:]
merging
max(merging.iteritems(), key=operator.itemgetter(1))[0]
highest = max(merging.iteritems(), key=operator.itemgetter(1))[0]
highest
len(highest)
remove = len(highest)
if remove > 8:
    rc1r = rc1[:-remove]
    rc3 = rc1r+rc2w
    rc3.write_to_fastq_file(output_file)
else:
    rc1o.write_to_fastq_file(Unmatched1)
    rc2o.write_to_fastq_file(Unmatched2)
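One possible reading of the listing above, assuming each `for` loop is a separate loop that only assigns a variable: every loop runs to completion before the merging code starts, so the merging code sees just a single read from each file. The sketch below illustrates that general Python behaviour with made-up strings standing in for reads; the list `reads` and the use of `upper()` as the "processing" step are purely illustrative and not part of HTSeq.

```python
# Made-up stand-ins for reads; not real FASTQ data.
reads = ["read1", "read2", "read3"]

# Pattern 1: the loop only assigns, and the processing happens once afterwards.
for read in reads:
    current = read          # overwritten on every pass
processed_once = [current.upper()]   # runs a single time, on one read only

# Pattern 2: the processing sits inside the loop body, so it runs per read.
processed_all = []
for read in reads:
    processed_all.append(read.upper())
```

With the first pattern only one item is ever processed, however many items the loop visited; with the second, every item is processed in turn.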
I tried putting a load of the code in a for loop; however, all that happened was that it ran 1000 times (the number of lines in each input file), but it ran on the first line each time and did not iterate through the reads.
Code:
for read in fastq_file2:
    rc2o = read
for read in fastq_file2:
    rc2=read.get_reverse_complement()
    print rc2
for read in fastq_file1:
    rc1o = read
for read in fastq_file1:
    rc1 = read
    print rc1

#Reverse seq2 again so can be matched
rc2w = rc2[::-1]
rc1u = rc1
while len(rc1u) > 20:
    slide_merge(rc1u, rc2w)
    rc1u = rc1u[1:]
merging
max(merging.iteritems(), key=operator.itemgetter(1))[0]
highest = max(merging.iteritems(), key=operator.itemgetter(1))[0]
highest
len(highest)
remove = len(highest)
if remove > 8:
    rc1r = rc1[:-remove]
    rc3 = rc1r+rc2w
    rc3.write_to_fastq_file(output_file)
else:
    rc1o.write_to_fastq_file(Unmatched1)
    rc2o.write_to_fastq_file(Unmatched2)
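For paired-end data, the behaviour being asked for (read 1 of file 1 with read 1 of file 2, then read 2 with read 2, and so on) is usually written as a single loop over both files in lockstep. The following is a minimal sketch of that pattern only, not a fix of the code above. To stay self-contained it uses a small hypothetical parser, `read_fastq`, and hypothetical helpers `reverse_complement` and `process_pairs` instead of HTSeq, and the in-memory example data is made up. With HTSeq, the two `FastqReader` objects are themselves iterable, so they could presumably be passed to `zip` (or `itertools.izip` on Python 2, which avoids building the whole list in memory) in the same way.

```python
import io

def read_fastq(handle):
    """Yield (name, sequence, quality) for each 4-line FASTQ record (hypothetical helper)."""
    while True:
        header = handle.readline().rstrip()
        if not header:          # empty string means end of file
            break
        seq = handle.readline().rstrip()
        handle.readline()       # skip the '+' separator line
        qual = handle.readline().rstrip()
        yield header[1:], seq, qual

def reverse_complement(seq):
    """Reverse-complement a DNA string (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))

def process_pairs(handle1, handle2):
    """Walk both files in lockstep and do the per-pair work inside the loop."""
    results = []
    for (name1, seq1, _), (name2, seq2, _) in zip(read_fastq(handle1),
                                                  read_fastq(handle2)):
        # Everything that should happen once per read pair goes here.
        results.append((name1, seq1, reverse_complement(seq2)))
    return results

# Tiny made-up example: two records per file.
file1 = io.StringIO(u"@r1/1\nACGT\n+\nIIII\n@r2/1\nGGCC\n+\nIIII\n")
file2 = io.StringIO(u"@r1/2\nAACC\n+\nIIII\n@r2/2\nTTTA\n+\nIIII\n")
pairs = process_pairs(file1, file2)
```

The key design point is that the per-pair work lives inside the one loop body, so it is repeated for every pair rather than once after the loops have finished.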
Thanks,
Tom