Hi guys,
Thanks for this script! I have used it successfully on both standard FASTQ and CASAVA-format files (with the appropriate changes to the script).
But now I have hit a wall again:
I ran an error-correction step on my reads, which added extra information to the header lines, and the script no longer works on the corrected files.
My data now looks like this:
@PCUS-319-EAS487_0006_FC:7:1:2093:983#0/1 0 0 0 0 0 f: b:
NACAGAACTCATTTGGCAGGCAAAACCCTGAGACAGATTCTGACAGGAAGTGGATACCTGATGTGTTGTATTACCT
+PCUS-319-EAS487_0006_FC:7:1:2093:983#0/1
BGIFHIMLMM_______W_Y_____BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
@PCUS-319-EAS487_0006_FC:7:1:2415:988#0/1 0 2 0 0 0 f: b:39G/1T
TATAGACCCTTTTATCTATCATTTCTTCACAAGCTTAGGACAGAAGACTCTTATTGTGCATGATAAAGTAAAAGTC
+PCUS-319-EAS487_0006_FC:7:1:2415:988#0/1
BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
I tried changing the script in several ways, but I always end up with essentially the same error message:
Traceback (most recent call last):
  File "/usit/titan/u1/chrishah/python/pair_up_reads.py", line 49, in <module>
    for title, seq, qual in FastqGeneralIterator(open(input_reverse_filename)):
  File "/site/VERSIONS/python-2.6.2/lib/python2.6/site-packages/Bio/SeqIO/QualityIO.py", line 907, in FastqGeneralIterator
    raise ValueError("Sequence and quality captions differ.")
ValueError: Sequence and quality captions differ.
I could of course just strip the parts that the error-correction tool added to the header and then use the normal script...
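To make clear what I mean by that workaround, here is a rough sketch in plain Python. It is only an illustration under a few assumptions: the function name strip_header_annotations is made up, the records are strictly four lines each (no wrapped sequence or quality lines), and the read ID is everything before the first whitespace on the "@" line. Since the error complains that the two captions differ, the sketch also writes a bare "+" line instead of repeating the ID.

```python
def strip_header_annotations(in_handle, out_handle):
    """Copy four-line FASTQ records, keeping only the read ID on the '@' line.

    Hypothetical helper; assumes strict four-line records.
    Returns the number of records written.
    """
    count = 0
    while True:
        title = in_handle.readline()
        if not title:
            break  # end of file
        seq = in_handle.readline()
        plus = in_handle.readline()
        qual = in_handle.readline()
        if not title.startswith("@") or not plus.startswith("+"):
            raise ValueError("Not a four-line FASTQ record: %r" % title)
        # Keep everything up to the first whitespace, e.g. '@...:983#0/1'
        read_id = title.split(None, 1)[0]
        out_handle.write(read_id + "\n")
        out_handle.write(seq)
        # A bare '+' line is valid FASTQ, so the captions cannot disagree
        out_handle.write("+\n")
        out_handle.write(qual)
        count += 1
    return count
```

It would be called with two open file handles, one for the corrected reads and one for a new output file, and the cleaned output would then go through the normal pairing script. The extra fields from the error correction are lost this way, which is why I would prefer a change to the script itself.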
But I really think it should be easy to adjust the script so that it works with my data as it is - I just don't know enough about Python...
Can anyone give me a hint on how to change the script? Any tips are highly appreciated!
Much obliged!
Christoph