  • Hi,
    I have been using the latest version of Pindel (0.2.5a2, September 17 2013), and every run ends in a segmentation fault, even with a window size as small as 1. However, I can run the same input files with version 0.2.4t (August 13 2012).
    Has anyone else run into the same thing?

    Comment


    • Originally posted by mlendale
      Hi,
      I have been using the latest version of Pindel (0.2.5a2, September 17 2013), and every run ends in a segmentation fault, even with a window size as small as 1. However, I can run the same input files with version 0.2.4t (August 13 2012).
      Has anyone else run into the same thing?
      What is your complete command line?

      Comment


      • Hi,
        My command is
        pindel -f ~/Indexes/Homo_sapiens/WholeGenomeFasta/ucsc.hg19.fasta -T 8 -i Pindel_config_16102013.txt -o Pindel_out -w 1

        Comment


        • I have been having issues with the pindel2vcf companion script. Specifically, I keep running into two errors.

          I always use -G for GATK compatibility. Technically it is GATK that reports the errors, but they appear to be problems with the generated VCF.

          The first appears if I try to split a multi-sample VCF into per-sample files:
          Badly formed genome loc: Parameters to GenomeLocParser are incorrect:The stop position 48367498 is less than start 48367499 in contig 11

          The second appears if I try to combine VCF files (D.vcf, TD.vcf, etc.):
          GenomeLoc 1:16890671-16890672 has a size == 2 but the variation reference allele has length 3 this = [VC variant2 @ 1:16890671-16890672 Q. of type=MNP (etc...)

          I probably cannot get support for this from the Broad, since GATK did not make these files. I should note that these errors all seem to happen in the D file. The other types have, so far, split into per-sample VCFs without problems.

          Anyone have any suggestions?
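Editor's sketch: both GATK messages point at records whose start/stop span looks inconsistent, so one way to narrow the problem down might be to scan the D.vcf for such records before handing it to GATK. The Python below is a minimal sketch, not part of Pindel or GATK. It assumes whitespace-separated VCF columns, that the stop position comes from the INFO `END` key, and that GATK expects `END - POS + 1` to equal the length of REF (which is what the second message suggests). The function names are made up for illustration.

```python
def parse_info(info_field):
    """Turn 'END=5;SVTYPE=DEL' into {'END': '5', 'SVTYPE': 'DEL'}."""
    info = {}
    for item in info_field.split(";"):
        key, _, value = item.partition("=")
        info[key] = value
    return info


def find_span_problems(vcf_lines):
    """Yield (chrom, pos, reason) for records whose END disagrees with POS or REF."""
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue  # not a full VCF record
        chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        info = parse_info(fields[7])
        if "END" not in info:
            continue
        end = int(info["END"])
        if end < pos:
            # Mirrors "stop position ... is less than start".
            yield chrom, pos, "END is before POS"
        elif end - pos + 1 != len(ref):
            # Mirrors "has a size == 2 but the reference allele has length 3".
            yield chrom, pos, "span length differs from REF length"
```

To try it on a file, something like `with open("D.vcf") as fh: print(list(find_span_problems(fh)))` would list the suspicious records. Whether removing or repairing them is appropriate depends on the data.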

          Comment


          • Hmm. The release notes for pindel 0.2.4w say it fixed the GATK compatibility issue. I know I am running 0.2.4t. Does anyone know how to get version w? The Ubuntu instructions only seem to work with version t, and I can't figure out whether GitHub is updated or not.

            Comment


            • mapping options for correctly generating BAM

              Hello Kai,

              I would like to try out Pindel but am a little uncertain about how to set my options for correctly generating the BAM alignment file. Could you provide a little detail on this? The paper says to filter for alignments where one pair can map and the other can't, but I am not sure how to do this since I thought both alignments would be suppressed if one of the reads can't find a match. Please let me know how you set up your read mapping options. Thanks!

              Michael

              Comment


              • Hi Kai,

                Can Pindel take the new alignments that it creates and output them into a BAM? Thanks!

                Alison

                Comment


                • What does the genotype ("0/0", "0/1" or "1/1") in the *.vcf file represent?

                  Dear Kai,

                  Recently I have been analyzing some NGS data for genome polymorphism. Through Pindel, I got the insertions and deletions of the NGS reads against the reference genome, and pindel2vcf then produced the VCF files. Here is my question: what does each genotype ("0/0", "0/1" or "1/1") in the VCF file represent?

                  For example,

                  chr10_irgsp5 2279161 . CA C . PASS END=2279162;HOMLEN=9;HOMSEQ=AAAAAAAAA;SVLEN=-1;SVTYPE=DEL GT:AD 0/0:3,3 0/0:0,0

                  chr10_irgsp5 2313030 . CA C . PASS END=2313031;HOMLEN=10;HOMSEQ=AAAAAAAAAA;SVLEN=-1;SVTYPE=DEL GT:AD 0/0:1,2 0/0:2,2

                  chr10_irgsp5 2588340 . GTA G . PASS END=2588342;HOMLEN=3;HOMSEQ=TAT;SVLEN=-2;SVTYPE=DEL GT:AD 0/0:0,1 0/1:8,8

                  Thanks very much!

                  Yang

                  Comment


                  • Originally posted by lv06025158
                    Dear Kai,

                    Recently I have been analyzing some NGS data for genome polymorphism. Through Pindel, I got the insertions and deletions of the NGS reads against the reference genome, and pindel2vcf then produced the VCF files. Here is my question: what does each genotype ("0/0", "0/1" or "1/1") in the VCF file represent?

                    For example,

                    chr10_irgsp5 2279161 . CA C . PASS END=2279162;HOMLEN=9;HOMSEQ=AAAAAAAAA;SVLEN=-1;SVTYPE=DEL GT:AD 0/0:3,3 0/0:0,0

                    chr10_irgsp5 2313030 . CA C . PASS END=2313031;HOMLEN=10;HOMSEQ=AAAAAAAAAA;SVLEN=-1;SVTYPE=DEL GT:AD 0/0:1,2 0/0:2,2

                    chr10_irgsp5 2588340 . GTA G . PASS END=2588342;HOMLEN=3;HOMSEQ=TAT;SVLEN=-2;SVTYPE=DEL GT:AD 0/0:0,1 0/1:8,8

                    Thanks very much!

                    Yang
                    Dear Yang,
                    The answer is in a previous post from Kai in the same thread.
                    If ref+alt<10, then give 0/0
                    else if (vaf between 0.2 and 0.8), 0/1
                    else if (vaf > 0.8) 1/1

                    ML

                    Comment


                    • Originally posted by mlendale
                      Dear Yang,
                      The answer is in a previous post from Kai in the same thread.
                      If ref+alt<10, then give 0/0
                      else if (vaf between 0.2 and 0.8), 0/1
                      else if (vaf > 0.8) 1/1

                      ML
                      Thanks ML!
                      I got it. What you mentioned must be
                      "-mc/--min_coverage The minimum number of reads to provide a genotype (default 10)
                      -he/--het_cutoff The propertion of reads to call het (default 0.2)
                      -ho/--hom_cutoff The propertion of reads to call het (default 0.8)" from the help document of pindel2vcf.
                      Thanks again!
                      Yang
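Editor's sketch: the rule ML describes, combined with the three pindel2vcf defaults Yang quotes, can be written out as a few lines of Python. This is only an illustration of the rule as stated in this thread, not pindel2vcf's actual source. It assumes AD is listed as ref,alt, that the boundaries at 0.2 and 0.8 are handled as shown (the thread does not say whether they are inclusive), and that anything below the het cutoff is reported as 0/0. The function name is made up.

```python
def call_genotype(ref_reads, alt_reads, min_coverage=10, het_cutoff=0.2, hom_cutoff=0.8):
    """Return '0/0', '0/1' or '1/1' from read counts, following the rule in this thread."""
    depth = ref_reads + alt_reads
    # Too few reads to call anything: report 0/0 (also avoids dividing by zero).
    if depth < min_coverage or depth == 0:
        return "0/0"
    vaf = alt_reads / depth  # variant allele fraction
    if vaf > hom_cutoff:
        return "1/1"
    if vaf >= het_cutoff:
        return "0/1"
    return "0/0"


# The sample columns from Yang's example records (GT:AD):
print(call_genotype(3, 3))  # AD 3,3: only 6 reads, below min_coverage
print(call_genotype(8, 8))  # AD 8,8: 16 reads, half of them supporting the variant
```

Under these assumptions the two calls reproduce the 0/0:3,3 and 0/1:8,8 entries in the records above, which is why a site with equal ref and alt support can still be reported as 0/0 when coverage is low.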

                      Comment


                      • Hello, I am a complete nextGen noob, and I am trying to do some work on my potato project, and I've gotten to this point, but now I am trying to run pindel using the improved workflow on the website. I finally got all the pathing *correct* and the command went through. The output files were created, but they are empty, and it only took about 1 minute to run the whole thing... That can't be right. I ran this:

                        pindel dennishalterman$ ./pindel -f /Users/dennishalterman/Desktop/NEXTGEN/PGSC_DM_v3_2.1.11_pseudomolecules.fasta -p bam_names.txt -c ch09 -o output/ref

                        and this was what returned:

                        Initializing parameters...
                        Pindel version 0.2.5a3, Oct 24 2013.
                        Loading reference genome ...
                        Loading reference genome done.
                        Initializing parameters done.
                        SearchRegion::SearchRegion
                        Segmentation fault: 11


                        Because there is no error, I am puzzled as to where to turn next... Any ideas?
                        Thanks so much, this website has been my saving grace the last few weeks.
                        A

                        Comment


                        • I have a usage question.

                          Is it better to make a config file for each sample and then run pindel on each one individually? Or should I just make one all-inclusive config and run pindel on all samples at once?

                          I have been having issues with GATK compatibility of my resulting VCFs after using the latter method. Perhaps my issues will be solved by upgrading to 0.2.5, but I was wondering if there is a preferred method of operation. Does pindel benefit from a multisample config in any way?

                          Thanks.

                          -BWubb

                          Comment


                          • Originally posted by bwubb
                            I have a usage question.

                            Is it better to make a config file for each sample and then run pindel on each one individually? Or should I just make one all-inclusive config and run pindel on all samples at once?

                            I have been having issues with GATK compatibility of my resulting VCFs after using the latter method. Perhaps my issues will be solved by upgrading to 0.2.5, but I was wondering if there is a preferred method of operation. Does pindel benefit from a multisample config in any way?

                            Thanks.

                            -BWubb
                            It is better to combine the samples and divide the calculation by genomic regions. This is especially true if your coverage is low.

                            Comment


                            • Originally posted by KaiYe
                              It is better to combine the samples and divide the calculation by genomic regions. This is especially true if your coverage is low.
                              Ah, thanks for the reply. Could you clarify "divide the calculation by genomic regions"? Do you mean doing one chromosome at a time? Or something else? Thanks.

                              Comment


                              • Originally posted by bwubb
                                Ah, thanks for the reply. Could you clarify "divide the calculation by genomic regions"? Do you mean doing one chromosome at a time? Or something else? Thanks.
                                You can use various strategies to divide the calculation over nodes:


                                1. -c chr:start-end, or just chr for an entire chromosome
                                2. -j bed (chr start end): a list of regions you are interested in.
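Editor's sketch: for the first strategy, the region strings can be generated rather than typed by hand. The Python below is a minimal sketch that only builds `chr:start-end` strings in the format given above; it does not run pindel. It assumes 1-based, inclusive coordinates, and the chromosome length in the example is a made-up number, not the real length of ch09.

```python
def split_regions(chrom_lengths, window):
    """Cut each chromosome into consecutive windows, returned as 'chr:start-end' strings."""
    regions = []
    for chrom, length in chrom_lengths.items():
        for start in range(1, length + 1, window):
            end = min(start + window - 1, length)  # last window may be shorter
            regions.append(f"{chrom}:{start}-{end}")
    return regions


# Hypothetical length, for illustration only.
for region in split_regions({"ch09": 2500}, window=1000):
    print("-c", region)
```

Each printed line could then be added to an otherwise identical pindel command on a separate node, with a different -o prefix per region so the outputs do not overwrite each other.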

                                Comment
