Hello all. For a few of my samples, hisat2 fails with this error, which says the number of reads does not match between my two mate files:
For reference, here is the command I used:
Note that these samples were subsampled with this script, which uses HTSeq to subsample mate pairs:
The script ran without any visible problems, but it seems to have produced bad output for a couple of files. When I re-run the subsampling script on the samples that trigger the above error in hisat2, they then work fine. Any ideas on what could cause this?
Code:
Error, fewer reads in file specified with -1 than in file specified with -2
libc++abi.dylib: terminating with uncaught exception of type int
(ERR): hisat2-align died with signal 6 (ABRT)
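One way to narrow this down is to count the records in each mate file and see which pair is out of sync. The following is only a minimal sketch: it assumes uncompressed FASTQ with exactly four lines per record (no wrapped sequence lines), and the helper names `count_fastq_records` and `find_mismatched_pairs` are made up for illustration, not part of hisat2 or HTSeq.

```python
def count_fastq_records(path):
    """Return (complete_records, leftover_lines) for an uncompressed FASTQ file.

    Assumes exactly four lines per record; a nonzero leftover suggests
    a truncated or partially written file.
    """
    with open(path) as handle:
        n_lines = sum(1 for _ in handle)
    return divmod(n_lines, 4)


def find_mismatched_pairs(pairs):
    """Return (mate1, mate2, counts1, counts2) for every suspicious pair.

    A pair is reported when the two files differ in record count or when
    either file has leftover lines that do not form a full record.
    """
    mismatched = []
    for mate1, mate2 in pairs:
        counts1 = count_fastq_records(mate1)
        counts2 = count_fastq_records(mate2)
        if counts1 != counts2 or counts1[1] or counts2[1]:
            mismatched.append((mate1, mate2, counts1, counts2))
    return mismatched
```

Passing the list of (R1, R2) subsampled file pairs to `find_mismatched_pairs` should return only the pairs worth looking at; an empty list would suggest the problem lies elsewhere, for example in the order the files are listed after -1 and -2.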
Code:
hisat2 -q -p 8 \
  -x /Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/HISAT2_INDEXES/Xenopus_Laevis \
  -1 /Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L005_R1_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L006_R1_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L007_R1_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L008_R1_001.fastq_sub_sample_0.25 \
  -2 /Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L005_R2_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L006_R2_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L007_R2_001.fastq_sub_sample_0.25,/Volumes/cachannel/RNA_SEQ/Notch_RNASeq/in_silico_test/25\%/Sample_10/10_TAGCTT_L008_R2_001.fastq_sub_sample_0.25 \
  -S Sample_10_hisat_results.sam
Code:
import itertools
import os
import random

import HTSeq

tophat_folders = [os.path.join(root, name)
                  for root, dirs, files in os.walk(os.getcwd())  # does for the current directory in the shell!!!!
                  for name in files
                  if name.endswith(".fastq")]  # within the sample file, need to go into results folder to get the .fastq file
for files in tophat_folders:
    print(files)
fraction = float(.1)
print("fastq files being ran through")
for v, w in zip(tophat_folders[::2], tophat_folders[1::2]):
    in1 = v
    in2 = w
    print("Mate 1 is", v)
    print("Mate 2 is", w)
    iter1 = iter( HTSeq.FastqReader( in1 ) )
    iter2 = iter( HTSeq.FastqReader( in2 ) )
    output1 = in1 + "_sub_sample_" + str(fraction)
    output2 = in2 + "_sub_sample_" + str(fraction)
    out1 = open( output1, "w" )
    out2 = open( output2, "w" )
    for read1, read2 in itertools.izip( iter1, iter2 ):
        if random.random() < fraction:
            read1.write_to_fastq_file( out1 )
            read2.write_to_fastq_file( out2 )
    out1.close()
    out2.close()
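One thing that may be worth ruling out: the script pairs files by their position in the list (`[::2]` and `[1::2]`), and `os.walk` returns file names in whatever order the directory listing gives, which is not guaranteed to be sorted. Whether that is the cause here is not confirmed. The sketch below pairs mates by file name instead; the helper `pair_mates` and the `_R1_` / `_R2_` tags are assumptions based on the file names in the command above.

```python
import os


def pair_mates(paths, r1_tag="_R1_", r2_tag="_R2_"):
    """Pair each R1 file with the R2 file whose name differs only in the tag.

    Returns (pairs, unpaired), where unpaired lists R1 files that have
    no matching R2 file in `paths`.
    """
    available = set(paths)
    pairs = []
    unpaired = []
    for path in sorted(paths):
        folder, name = os.path.split(path)
        if r1_tag not in name:
            continue  # only R1 files start a pair
        mate = os.path.join(folder, name.replace(r1_tag, r2_tag, 1))
        if mate in available:
            pairs.append((path, mate))
        else:
            unpaired.append(path)
    return pairs, unpaired
```

With this, the subsampling loop could iterate over `pairs` rather than over `zip(tophat_folders[::2], tophat_folders[1::2])`, and any entry in `unpaired` would point at a sample folder with a missing mate file.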