How to set the --numBamFiles and --numSamples parameters for the fourth step in a dindel analysis of pooled samples.
I have 27 individuals sequenced. Data for each individual is contained in one BAM file. I have successfully completed the first three stages for the 27 BAM files. I now have 807 glf files for each of the 27 individuals.
However, how should I call the mergeOutputPooled.py script? Specifically, what are the appropriate values of --numSamples and --numBamFiles? If I set the latter to the number of BAM files actually analyzed (27), the program quickly crashes. Running the program with --numSamples 27 and --numBamFiles 1 seems to work, but is this the right thing to do? The parameters appear to be largely undocumented. In the end, the run with --numBamFiles 1 produced a VCF file that contained only a header and no results.
The actual error message was:
Reading /.../JERCANM000000.../dindel/stage3.100.glf.txt
An error occurred!
Traceback (most recent call last):
File "/opt/.../mergeOutputPooled.py", line 620, in <module>
main(sys.argv[1:])
File "/opt/.../mergeOutputPooled.py", line 613, in main
processPooledGLFFiles(glfFilesFile = options.inputFiles, maxHPLen = options.maxHPLen, refFile = options.refFile, outputVCFFile = options.outputFile, doNotFilterOnFR = (not options.filterFR), filterQual = int(options.filterQual), numSamples = int(options.numSamples), numBamFiles = int(options.numBAMFiles))
File "/opt/.../mergeOutputPooled.py", line 336, in processPooledGLFFiles
raise NameError('Inconsistent glf files! Is the number of BAM files correctly specified?')
NameError: Inconsistent glf files! Is the number of BAM files correctly specified?
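Since the error complains about inconsistent glf files, one thing that can be ruled out independently of the two parameters is a mismatch in the number of stage-3 glf files per individual. Below is a minimal sketch of such a check in Python. It assumes one directory per individual containing files named like stage3.100.glf.txt, as in the path above; the function names and the glob pattern are hypothetical and not part of dindel.

```python
from collections import Counter
from pathlib import Path


def count_glf_files(sample_dirs, pattern="stage3.*.glf.txt"):
    """Map each sample directory to the number of files matching pattern.

    The pattern is an assumption based on the file name in the traceback.
    """
    return {str(Path(d)): len(list(Path(d).glob(pattern))) for d in sample_dirs}


def inconsistent_samples(counts):
    """Return the samples whose file count differs from the most common count."""
    if not counts:
        return {}
    expected, _ = Counter(counts.values()).most_common(1)[0]
    return {name: n for name, n in counts.items() if n != expected}
```

If inconsistent_samples returns an empty dict for the 27 directories, every individual has the same number of glf files (807 here), and the problem more likely lies in how the files are listed or in the parameter values themselves.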