  • Lien
    Member
    • Dec 2009
    • 47

    Dindel stage4

    Hi all,

    I'm trying to detect indels from Exome capture paired-end reads from Illumina.
    I aligned my data with BWA and successfully completed the first 3 stages of the Dindel program (version 1.01). However, I'm somewhat confused about stage 4: do you first have to merge all the files generated in stage 3 into one single file? And is this done simply by concatenating them?

    When I tried without concatenating, I got several errors:
    ./mergeOutputDiploid.py --inputFiles sample.dindel_stage2_output_windows.txt --outputFile variantCalls.VCF --ref hg19.fa
    An error occurred!
    Traceback (most recent call last):
    File "./mergeOutputDiploid.py", line 351, in <module>
    main(sys.argv[1:])
    File "./mergeOutputDiploid.py", line 346, in main
    mergeOutput(glfFilesFile = options.inputFiles, sampleID = options.sampleID, maxHPLen = options.maxHPLen, refFile = options.refFile, vcfFile = options.outputFile, filterQual = int(options.filterQual))
    File "./mergeOutputDiploid.py", line 254, in mergeOutput
    fg = open(glfFilesFile,'r')
    IOError: [Errno 2] No such file or directory: 'LNCaP_gelukt_BWA.dindel_stage2_output_windows.txt'

    When I concatenated a few files into one single file, I still received an error:
    ./mergeOutputDiploid.py --inputFiles sample.dindel_stage2_outputfiles2.txt --outputFile variantCalls.VCF --ref hg19.fa
    WARNING: additional columns in line 1 of file sample.dindel_stage2_outputfiles2.txt were ignored
    File msg does not exist
    . Aborting.
    An error occurred!


    Does anyone know what I'm doing wrong?
    Thanks a lot!
    Lien
  • skblazer
    Member
    • Feb 2009
    • 51

    #2
    I wrote the absolute paths of all the window files from the last stage into a text file, one per line, and passed that text file to --inputFiles:

    /your/path/xxxxwindow1.txt
    /your/path/xxxxwindow2.txt
    /your/path/xxxxwindow3.txt
    ....

    That will work.
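
A minimal Python sketch of how such a list file could be generated rather than typed by hand. The function name `write_file_list` and the glob pattern in the usage note are hypothetical, not part of Dindel; the only assumption taken from the thread is that --inputFiles expects a text file with one output file path per line.

```python
import glob
import os


def write_file_list(pattern, list_path):
    """Write the absolute path of every file matching `pattern` to
    `list_path`, one path per line, and return the paths written."""
    # Sort so the list is reproducible from run to run.
    paths = sorted(os.path.abspath(p) for p in glob.glob(pattern))
    with open(list_path, "w") as handle:
        for path in paths:
            handle.write(path + "\n")
    return paths
```

For example, `write_file_list("/your/path/*window*.txt", "files.txt")` would collect the window files, after which `files.txt` is what gets passed to --inputFiles. Adjust the pattern to whatever your own stage 3 output files are called.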


    • Lien
      Member
      • Dec 2009
      • 47

      #3
      Oh, I misinterpreted it: I thought they meant the content of those files.

      It works fine now.
      Thanks!


      • fitzgeraldlm
        Junior Member
        • Jan 2011
        • 4

        #4
        Dindel Stage 4 problems

        Hi,

        I'm having similar problems to Lien; however, mine are unfortunately not solved yet. I'm not sure I understand your solution, skblazer, or the original instructions (I'm very new to Unix/Linux).

        Currently I am typing in all of the x.dindel_stage2_output files after --inputFiles. So for example if I have three output files I would type

        python mergeOutputDiploid.py --inputFiles x.dindel_stage2_output.1.glf.txt x.dindel_stage2_output.2.glf.txt x.dindel_stage2_output.3.glf.txt --outputFile indel.VCF --ref hg18.fa

        I get the same message as Lien did:
        WARNING: additional columns in line 1 of file x.dindel_stage2_output.1.glf.txt were ignored
        File msg does not exist
        . Aborting.
        An error occurred!

        I tried putting the full path before each sample file, thinking that was what you were suggesting, skblazer, but it came back with the same message. Have I missed something? Am I supposed to combine all 3 files beforehand? Is that what you are saying, skblazer?

        thanks!


        • skblazer
          Member
          • Feb 2009
          • 51

          #5
          You need to create a file, for example "files.txt".
          In this file, you should write the following lines:
          /your/path/x.dindel_stage2_output.1.glf.txt
          /your/path/x.dindel_stage2_output.2.glf.txt
          /your/path/x.dindel_stage2_output.3.glf.txt

          Then you type the command:
          python mergeOutputDiploid.py --inputFiles files.txt --outputFile indel.VCF --ref hg18.fa

          That'll work.


          • gaffa
            Member
            • Oct 2010
            • 82

            #6
            fitzgeraldlm,

            The argument to --inputFiles should be the name of a single text file, which in turn contains the names of all the output files. So in your example, you would create a new file with the following content:

            Code:
            x.dindel_stage2_output.1.glf.txt
            x.dindel_stage2_output.2.glf.txt
            x.dindel_stage2_output.3.glf.txt
            That is, the literal names of the output files (you don't need absolute paths). Presumably the rationale is that some runs generate a very large number of output files, which would be difficult to specify on the command line. So instead you write all the file names to a text file, and the Dindel script reads that text file. You can create the text file manually if you have a small number of output files, or with a command like ls | grep ".glf.txt" > list_of_output_files.txt or similar (and then you'd specify --inputFiles list_of_output_files.txt).

            EDIT: beaten by skblazer ;]
            Last edited by gaffa; 01-07-2011, 05:36 PM.
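
The "File msg does not exist" error earlier in the thread looks consistent with the script reading the first word of a GLF data file as if it were a file name; that reading is inferred from the error messages quoted here, not confirmed from the Dindel source. Under that assumption, a small pre-flight check of the list file might look like the sketch below. The function name `check_file_list` is hypothetical, and relative names are resolved against the directory the check is run from.

```python
import os


def check_file_list(list_path):
    """Return (line_number, problem) tuples for a file that is supposed
    to contain one file name per line."""
    problems = []
    with open(list_path) as handle:
        for number, line in enumerate(handle, start=1):
            entry = line.strip()
            if not entry:
                continue  # ignore blank lines
            if len(entry.split()) > 1:
                # Several whitespace-separated columns suggest a data file
                # was passed instead of a list of file names.
                # (Assumption: paths themselves contain no spaces.)
                problems.append((number, "more than one column: " + entry))
            elif not os.path.isfile(entry):
                problems.append((number, "no such file: " + entry))
    return problems
```

An empty result from `check_file_list("list_of_output_files.txt")` means every line names an existing file. A "more than one column" entry on line 1 would suggest that an output file itself, rather than a list of output files, was given to --inputFiles.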


            • fitzgeraldlm
              Junior Member
              • Jan 2011
              • 4

              #7
              Solved

              A big thank you to gaffa and skblazer. I have now got Stage 4 running and have a VCF file! Thanks for the tip on how to create the txt file gaffa. I did have to add the whole path name, like you suggested skblazer. This may have something to do with the way I installed (or didn't correctly install) Dindel.

              Thanks!


              • ndiaye
                Junior Member
                • May 2011
                • 3

                #8
                Hi, I am still facing a similar problem:
                I've successfully gone through the first three stages of Dindel variant calling, but I get the following error message when using mergeOutputPooled.py to generate the final VCF file. Note that cases_A.gene.ABCA1.glf.txt contains the names of my 10 glf.txt files.
                Thank you for helping me fix this.

                [ndiayea@topaz] /shares/data/illumina_datastore/MI_20100215/analyses/Indels_calling/BAM_files/vcf_cases $ python /shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py --inputFiles ABCA1_cases_outputfiles.txt --outputFile ABCA1_cases_variantCalls.VCF --ref /shares/data/genome_datastore/homo_sapiens/Homo_sapiens_assembly18.fasta --numSamples 500 --numBamFiles 10
                Reading cases_A.gene.ABCA1.glf.txt
                An error occurred!
                Traceback (most recent call last):
                File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 620, in ?
                main(sys.argv[1:])
                File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 613, in main
                processPooledGLFFiles(glfFilesFile = options.inputFiles, maxHPLen = options.maxHPLen, refFile = options.refFile, outputVCFFile = options.outputFile, doNotFilterOnFR = (not options.filterFR), filterQual = int(options.filterQual), numSamples = int(options.numSamples), numBamFiles = int(options.numBAMFiles))
                File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 336, in processPooledGLFFiles
                raise NameError('Inconsistent glf files! Is the number of BAM files correctly specified?')
                NameError: Inconsistent glf files! Is the number of BAM files correctly specified?


                • clarissaboschi
                  Member
                  • Apr 2010
                  • 63

                  #9
                  I am having the same problem, but I am using the option for pooled samples. I am not sure what to put in numSamples: is it the number of samples/individuals in one of my BAM files?
                  My outputFiles.txt contains the list of my 78 files.

                  my command line is
                  ./dindel-1.01-linux-64bit mergeOutputPooled.py --inputFiles outputFiles.txt --outputFile variantCalls_hy7.VCF --ref chick.fa --numSamples 50 --numBamFiles 1

                  The error message was only
                  Error parsing input options

                  But I think my options are correct, what could be the problem?

                  Thanks


                  • clarissaboschi
                    Member
                    • Apr 2010
                    • 63

                    #10
                    I solved my problem with running stage 4: the input file (the single text file) should have the same name as the other files.

                    I am using a numSamples of 10, which is the number of individuals in my pool, but there is no explanation of this in the manual.

                    But now I am getting an empty VCF file, and of course I don't know what the problem is. Anyway, the Dindel software is so difficult to run; there are so many steps!
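
When a VCF file looks empty, it can help to tell a zero-byte file apart from one that has header lines but no variant records. A minimal sketch, relying only on the VCF convention that header lines start with "#"; the function name `count_vcf_records` is hypothetical, and the check says nothing about why no calls were written.

```python
def count_vcf_records(vcf_path):
    """Count data lines in a VCF file, skipping '#' header lines and blanks."""
    records = 0
    with open(vcf_path) as handle:
        for line in handle:
            if line.strip() and not line.startswith("#"):
                records += 1
    return records
```

A result of 0 for a file that still contains header lines would mean the merge step ran but wrote no records.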
