Old 03-25-2016, 07:37 PM   #7
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Originally Posted by spark:
One thing I did note is that demuxbyname processes continued running after all of the reads had been demultiplexed, so I had to interrupt my pipeline and kill the processes manually before I could proceed to genotype calling.
That's interesting. Thanks for sharing! I'm not sure I'll be able to fix it, but I will look into it... were all of the output files complete and formatted correctly? It sounds to me like the program was waiting for some output stream to finish, which never happened, so I suggest verifying that the number of reads across all of the output files actually adds up to the number reported. Since your output was gzipped, any file that didn't finish writing would be invalid. So, try this:

cat L1.*.fq.gz | reformat.sh in=stdin.fq.gz pigz=f gzip=f unpigz=f gunzip=f

That will read the files, count the reads, and crash with some kind of error message if any of the gzipped files are corrupt (the pigz/gzip flags tell it not to spawn additional subprocesses for compression or decompression). I've never used demuxbyname with more than ~100 output files. Is pigz installed? You can check by running the command "pigz"; if it is not installed, you'll get some variety of "command not found" message. For some reason I set demuxbyname to use pigz by default, which was a bad idea: in cases like this it will try to spawn 2,112 pigz processes (if pigz is installed), which could cause problems.
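If you'd like a cross-check that doesn't depend on BBTools, a rough sketch along these lines would test each output file on its own (so a corrupt one is named rather than just crashing the whole stream) and total the read counts. It assumes standard 4-line FASTQ records and a stock gzip; the function names count_reads and check_fastq_gz are made up for illustration and are not part of any tool.

```shell
#!/usr/bin/env bash
# Sketch only: per-file gzip integrity test plus a total read count.
# Assumes 4-line FASTQ records; function names here are hypothetical.

# Print the number of reads in one gzipped FASTQ file (line count / 4).
count_reads() {
    local lines
    lines=$(gzip -dc -- "$1" | wc -l)
    echo $(( lines / 4 ))
}

# Test every file passed as an argument. Corrupt files are reported on
# stderr; a one-line summary goes to stdout. Returns non-zero if any
# file failed the integrity test.
check_fastq_gz() {
    local total=0 bad=0 f n
    for f in "$@"; do
        if gzip -t -- "$f" 2>/dev/null; then
            n=$(count_reads "$f")
            total=$(( total + n ))
        else
            echo "CORRUPT: $f" >&2
            bad=$(( bad + 1 ))
        fi
    done
    echo "total_reads=$total corrupt_files=$bad"
    [ "$bad" -eq 0 ]
}

# Report whether pigz is on the PATH, without actually running it.
if command -v pigz >/dev/null 2>&1; then
    echo "pigz found at $(command -v pigz)"
else
    echo "pigz not installed"
fi
```

After sourcing the script, something like `check_fastq_gz L1.*.fq.gz` would print the total, which can then be compared against the number of reads demuxbyname reported.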