SEQanswers

Old 12-09-2010, 01:28 AM   #1
Lien
Member
 
Location: Leuven

Join Date: Dec 2009
Posts: 47
Default Dindel stage4

Hi all,

I'm trying to detect indels from Illumina exome-capture paired-end reads.
I aligned my data with BWA and successfully ran the first three stages of Dindel (version 1.01). However, stage 4 confuses me: do you first have to merge all the files generated in stage 3 into one single file? And is this simply done by concatenating them?

When I tried without concatenating, I got an error:
./mergeOutputDiploid.py --inputFiles sample.dindel_stage2_output_windows.txt --outputFile variantCalls.VCF --ref hg19.fa
An error occurred!
Traceback (most recent call last):
File "./mergeOutputDiploid.py", line 351, in <module>
main(sys.argv[1:])
File "./mergeOutputDiploid.py", line 346, in main
mergeOutput(glfFilesFile = options.inputFiles, sampleID = options.sampleID, maxHPLen = options.maxHPLen, refFile = options.refFile, vcfFile = options.outputFile, filterQual = int(options.filterQual))
File "./mergeOutputDiploid.py", line 254, in mergeOutput
fg = open(glfFilesFile,'r')
IOError: [Errno 2] No such file or directory: 'LNCaP_gelukt_BWA.dindel_stage2_output_windows.txt'

When I concatenated a few files into one single file, I still received an error:
./mergeOutputDiploid.py --inputFiles sample.dindel_stage2_outputfiles2.txt --outputFile variantCalls.VCF --ref hg19.fa
WARNING: additional columns in line 1 of file sample.dindel_stage2_outputfiles2.txt were ignored
File msg does not exist
. Aborting.
An error occurred!


Does anyone know what I'm doing wrong?
Thanks a lot!
Lien
Old 12-09-2010, 07:24 AM   #2
skblazer
Member
 
Location: Massachusetts

Join Date: Feb 2009
Posts: 50
Default

I wrote the absolute paths of all the window files from the last stage into a text file, one per line, and passed that text file as --inputFiles:

/your/path/xxxxwindow1.txt
/your/path/xxxxwindow2.txt
/your/path/xxxxwindow3.txt
....

That will work.
Old 12-12-2010, 10:18 PM   #3
Lien
Member
 
Location: Leuven

Join Date: Dec 2009
Posts: 47
Default

Oh, I misinterpreted it: I thought they meant the content of those files.

It works fine now.
Thanks!
Old 01-07-2011, 03:45 PM   #4
fitzgeraldlm
Junior Member
 
Location: Seattle

Join Date: Jan 2011
Posts: 4
Unhappy Dindel Stage 4 problems

Hi,

I'm having similar problems to Lien, but mine are unfortunately not solved yet. I'm not sure I understand your solution, skblazer, or the original instructions (I'm very new to unix/linux).

Currently I am typing in all of the x.dindel_stage2_output files after --inputFiles. So for example if I have three output files I would type

python mergeOutputDiploid.py --inputFiles x.dindel_stage2_output.1.glf.txt x.dindel_stage2_output.2.glf.txt x.dindel_stage2_output.3.glf.txt --outputFile indel.VCF --ref hg18.fa

I get the same message as Lien did:
WARNING: additional columns in line 1 of file x.dindel_stage2_output.1.glf.txt were ignored
File msg does not exist
. Aborting.
An error occurred!

I tried putting in the whole path name before each sample file thinking that was what you were suggesting skblazer, but it came back with the same message. Have I missed something? Am I supposed to combine all 3 files together beforehand? Is this what you are saying skblazer?

thanks!
Old 01-07-2011, 04:30 PM   #5
skblazer
Member
 
Location: Massachusetts

Join Date: Feb 2009
Posts: 50
Default

You need to create a file, for example "files.txt".
In this file, you should write the following lines:
/your/path/x.dindel_stage2_output.1.glf.txt
/your/path/x.dindel_stage2_output.2.glf.txt
/your/path/x.dindel_stage2_output.3.glf.txt

Then you type the command:
python mergeOutputDiploid.py --inputFiles files.txt --outputFile indel.VCF --ref hg18.fa

That'll work.
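
Before running stage 4, it can help to confirm that the list file really contains file paths and not GLF data. The following is a minimal sketch, not part of Dindel: the helper name check_list_file is hypothetical, and it assumes the list file holds one path per line. Any line that is not a path to an existing file is reported, which appears to be the situation behind the "File msg does not exist" message earlier in this thread.

```python
import os


def check_list_file(list_path):
    """Split the entries of a list file into existing and missing paths.

    Hypothetical helper, not part of Dindel. Assumes one path per line.
    """
    existing, missing = [], []
    with open(list_path) as handle:
        for line in handle:
            entry = line.strip()
            if not entry:
                continue  # ignore blank lines
            if os.path.isfile(entry):
                existing.append(entry)
            else:
                missing.append(entry)
    return existing, missing
```

Calling check_list_file("files.txt") returns two lists. If the second one is not empty, those lines do not point to files that can be found from the current working directory, so fix them (for example by using full paths) before passing files.txt to --inputFiles.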

Old 01-07-2011, 04:34 PM   #6
gaffa
Member
 
Location: Gothenburg/Uppsala, Sweden

Join Date: Oct 2010
Posts: 82
Default

fitzgeraldlm,

The argument to --inputFiles should be the name of a single text file, and that text file in turn contains the names of all the output files. So in your example, you would create a new file with the following content:

Code:
x.dindel_stage2_output.1.glf.txt
x.dindel_stage2_output.2.glf.txt
x.dindel_stage2_output.3.glf.txt
That is, the literal names of the output files (you don't need absolute paths). Presumably the rationale behind this is that some runs can generate a very large number of output files, and so it would get difficult to specify them all on the command line. So instead you write all the file names to a text-file, and then the Dindel script looks into this text file. You can generate the text file either manually if you have a small number of output files or by a command like ls | grep ".glf.txt" > list_of_output_files.txt or similar (and then you'd specify --inputFiles list_of_output_files.txt)
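
For anyone who prefers Python to the ls | grep one-liner, here is a rough equivalent. It is only a sketch: the function name write_glf_list and its parameters are made up for illustration and are not part of Dindel. It assumes the stage 3 outputs sit in one directory and end in ".glf.txt", and it writes absolute paths by default because full paths were needed in at least one setup in this thread.

```python
import glob
import os


def write_glf_list(directory, list_path, suffix=".glf.txt", absolute=True):
    """Write one matching file name per line to list_path.

    Hypothetical helper, not part of Dindel. Returns the names written.
    """
    # Escape the directory and suffix so only the "*" acts as a wildcard.
    pattern = os.path.join(glob.escape(directory), "*" + glob.escape(suffix))
    names = sorted(glob.glob(pattern))
    if absolute:
        names = [os.path.abspath(name) for name in names]
    else:
        # Bare file names, as in the example content above.
        names = [os.path.basename(name) for name in names]
    with open(list_path, "w") as handle:
        for name in names:
            handle.write(name + "\n")
    return names
```

For example, write_glf_list(".", "list_of_output_files.txt") writes the list, which you then pass as --inputFiles list_of_output_files.txt. With absolute=False only the bare file names are written, so the merge script would have to be run from the directory that holds the output files. Give the list file a name that does not end in the same suffix, otherwise a second run would list it as well.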

EDIT: beaten by skblazer ;]

Last edited by gaffa; 01-07-2011 at 04:36 PM.
Old 01-08-2011, 02:01 PM   #7
fitzgeraldlm
Junior Member
 
Location: Seattle

Join Date: Jan 2011
Posts: 4
Default Solved

A big thank you to gaffa and skblazer. I have now got Stage 4 running and have a VCF file! Thanks for the tip on how to create the txt file gaffa. I did have to add the whole path name, like you suggested skblazer. This may have something to do with the way I installed (or didn't correctly install) Dindel.

Thanks!
Old 05-02-2011, 10:49 AM   #8
ndiaye
Junior Member
 
Location: Montreal

Join Date: May 2011
Posts: 3
Default

Hi, I am still facing a similar problem.
I've successfully gone through the first three stages of Dindel variant calling, but I get the following error message when using mergeOutputPooled.py to generate the final VCF file. Note that cases_A.gene.ABCA1.glf.txt contains the names of my 10 glf.txt files.
Thank you for helping me fix this.

[ndiayea@topaz] /shares/data/illumina_datastore/MI_20100215/analyses/Indels_calling/BAM_files/vcf_cases $ python /shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py --inputFiles ABCA1_cases_outputfiles.txt --outputFile ABCA1_cases_variantCalls.VCF --ref /shares/data/genome_datastore/homo_sapiens/Homo_sapiens_assembly18.fasta --numSamples 500 --numBamFiles 10
Reading cases_A.gene.ABCA1.glf.txt
An error occurred!
Traceback (most recent call last):
File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 620, in ?
main(sys.argv[1:])
File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 613, in main
processPooledGLFFiles(glfFilesFile = options.inputFiles, maxHPLen = options.maxHPLen, refFile = options.refFile, outputVCFFile = options.outputFile, doNotFilterOnFR = (not options.filterFR), filterQual = int(options.filterQual), numSamples = int(options.numSamples), numBamFiles = int(options.numBAMFiles))
File "/shares/home/ndiayea/programs/dindel-1.01-python/mergeOutputPooled.py", line 336, in processPooledGLFFiles
raise NameError('Inconsistent glf files! Is the number of BAM files correctly specified?')
NameError: Inconsistent glf files! Is the number of BAM files correctly specified?
Old 08-03-2012, 01:53 AM   #9
clarissaboschi
Member
 
Location: US

Join Date: Apr 2010
Posts: 63
Default

I am having the same problem, but I am using the option for pooled samples. I am not sure what to put in numSamples: is it the number of samples/individuals in one of my BAM files?
In the outputFiles.txt I have the list of my 78 files.

my command line is
./dindel-1.01-linux-64bit mergeOutputPooled.py --inputFiles outputFiles.txt --outputFile variantCalls_hy7.VCF --ref chick.fa --numSamples 50 --numBamFiles 1

The error message was only
Error parsing input options

But I think my options are correct. What could be the problem?

Thanks
Old 08-03-2012, 05:39 AM   #10
clarissaboschi
Member
 
Location: US

Join Date: Apr 2010
Posts: 63
Default

I solved my problem with running stage 4: the input file (the single text file) should have the same name as the other files.

I am using a numSamples of 10, which is the number of individuals in my pool, but the manual gives no explanation of this option.

But now I am getting an empty VCF file, and of course I don't know what the problem is. In any case, Dindel is difficult to run; there are so many steps!