  • Interpreting Pindel output

    Hello,

    I'm using Pindel to detect SVs. For the most part I understand the output I am given, but there are a few record types I do not. First, I'm curious why I'm getting some deletions in my inversions file, but that is no big deal.

    This is a called inversion:
    Code:
    ####################################################################################################
    3	INV 218	NT 0:0 "":""	ChrID 5	BP 131938052	131938271	BP_range 131938052	131938271	Supports 5	3	+ 1	1	- 4	2	S1 10	SUM_MS 145	1	NumSupSamples 1	1	3498_l5 1 1 4 2
    TAAGTTTATAATAAGACTCCTATTAGAGACCAGTTTAATTTATTCTACTGCTTTGTCATACTAATTCAATATAATTTTAAATAAGAATTTGGAATATTTCaaaataaaaattttttaaattacaggaaaaaaaggaaggaagccagccactaagtgaaatgctacatgggtttaaggtacaaaatgtcaacccattttac
                                                                                       AGAATTTGGAATATTTCAAAATAAAAATTTTTTAAATTACAGGAAAAAAAGGAAGGAAGCCAGCCACTAAGTGAAATGCTACATGGGTTTAAGGTACAAA	+	131938035	29	3498_l5	@HWI-ST628:225:C02ATACXX:5:1203:12456:188553/2
    ----------------------------------------------------------------------------------------------------
    ctaatgaattaccacctccatggcaggtactgacaactatttttgctgatgcctctgaaacaataatatgtatttaatcttttaaaaaaaatttacttcaGAAATAATGTTAGGATTACAGAAAAATTATAAAAATAATACAAATTATTCATATATATCCCTCATCCAGCTCCTCCTGATGTTAACAATTTATGTACTCT
                                   GACAACTATTTTTGCTGATGCCTCTGAAACAATAATATGTATTTAATCTTTTAAAAAAAATTTACTTCAGAAATAATGTTAGGATTACAGAAAAATTATA	-	131938301	29	3498_l5	@HWI-ST628:225:C02ATACXX:5:2301:14130:106182/2
                         GGCAGGTACTGACAACTATTTTTGCTGATGCCTCTGAAACAATAATATGTATTTAATCTTTTAAAAAAAATTTACTTCAGAAATAATGTTAGGATTACAG	-	131938291	29	3498_l5	@HWI-ST628:225:C02ATACXX:5:2305:4222:116460/2
                         GGCAGGTACTGACAACTATTTTTGCTGATGCCTCTGAAACAATAATATGTATTTAATCTTTTAAAAAAAATTTACTTCAGAAATAATGTTAGGATTACAG	-	131938291	29	3498_l5	@HWI-ST628:225:C02ATACXX:5:2104:12836:15953/2
                         GGCAGGTACTGACAACTATTTTTGCTGATGCCTCTGAAACAATAATATGTATTTAATCTTTTAAAAAAAATTTACTTCAGAAATAATGTTAGGATTACAG	-	131938291	29	3498_l5	@HWI-ST628:225:C02ATACXX:5:1206:4455:120825/2
    Is this trying to say that the 218 nucleotides between Chr5:131938052-131938271 are inverted? From the way the read mapping is displayed, I don't see it; I don't get a sense of orientation. Assuming this INV is real and the inverted sequence is represented by the lowercase letters, I should, in theory, be able to look up the reference sequence at that position and see the lowercase sequence displayed in reverse order?


    The other confusing output is the Large insertion:
    Code:
    ########################################################
    12	LI	ChrID 8	90973932	4	90973925	3
    TTCAAGGTGAGGAAGTGTGGGAACTATAAAAAATATGGCACACATATTCTGTAGAGAAACTATGTAAAAAAGGCGAGGTCGGGAGGAGGAAGGCTGCAGCtaaactactgaggactcagaagtctagacaaagggcttggagatttattctgtggataataagagctagttaaaatttttgagaaaagagaaggtaattg
                                                                                                GCTGCAGCCTTCCTCCTCCCGACCTCGCCTTTTTTACATAGTTTTTCTACAGAATATGTGTGCCATATTTTTTATAGTTCCCACACTTCCTCACCTTGAA
                                                                                                GCTGCAGCCTTCCTCCTCCCGACCTCGCCTTTTTTACATAGTTTCTCTACAGAATATGTGTGCCTTATTTTTTATAGTTCCCACACTTCCTCACCTTGAA
                                                                                                GCTGCAGCCTTCCTCCTCCCGACCTCGCCTTTTTTACATAGTTTCTCTACAGAATATGTGTGCCCTATTTTTTATAGTTCCCACACTTCCTCACCTTGAA
                               GAACTATAAAAACTATGGCACACATATTCTGTAGAGAAACTATGTAAAAAAGACGAGGTCTGGAGGAGGAAGGGTGCAGCTGAACTACTGAGGTCTCATA
    --------------------------------------------------------
    acatgatgttcaaggtgaggaagtgtgggaactataaaaaatatggcacacatattctgtagagaaactatgtaaaaaaggcgaggtcgggaggaggaagGCTGCAGCTAAACTACTGAGGACTCAGAAGTCTAGACAAAGGGCTTGGAGATTTATTCTGTGGATAATAAGAGCTAGTTAAAATTTTTGAGAAAAGAGAA
            TTGTCTTTTATCAAAATCTTTAACTAGCTTTTATTATCCACAGAATAAATCTCAAAGCCCTTTGTCTAGTCTTCTGAGTCCTCAGTAGTTTAGCTGCAGC
            TTCTCTTTTCTCAAAAATTTTAACTAGCTCTTATTGTCCACAGAATAAATTTCCAAGCCCTTTGTCTAGACTTCTGAGTCCTCAGTAGTTTAGCTGCAGC
            TTCTCTTTTCTCAAAAATTTTGACTAGCTCTTATTATCCACAGAATAAATTTCCAAGCCCTTTGTCTAGACTTCTGAGTCCTCAGTAGTTTAGCTGCAGC

    Based on the BP locations given, this insertion is only 7 nt? That does not seem large. There does not seem to be any other indication of the insert size or sequence. The alignments displayed just look like poor mapping. Maybe I should just ignore this file altogether.

    Does anyone know what to make of these? Thanks.

  • #2
    Thanks for the questions. We need to update the Pindel wiki to explain our output format. Here are the answers to your questions:

    1. Deletions in inversion output
    Pindel can find deletions with non-template insertions. In some cases the inserted sequence has the same length as the deleted one and is its reverse complement, and we then put the event in the inversion output. However, we forgot to change the type label from D to INV.

    2. For an inversion, the lowercase sequence is the inverted sequence of the reference, and the reads are displayed as they appear in the FASTQ file. As one inversion has two breakpoints, we display how the reads align to each breakpoint in the altered reference.

    3. For LI (long insertion), Pindel can find the breakpoints but cannot report the complete inserted sequence. The coordinate of the left breakpoint may be smaller than that of the right one due to target site duplication.

    Kai
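
The reverse-complement relationship described in points 1 and 2 can be checked by hand. Below is a minimal Python sketch, assuming that "inverted" means reverse-complemented (as point 1 suggests) and that sequences are plain A/C/G/T/N strings. The helper names `reverse_complement` and `looks_like_inversion` are hypothetical and are not part of Pindel.

```python
# Complement table for DNA. Both cases are handled so that the lowercase
# (inverted) segment from a Pindel record can be passed in unchanged.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string, preserving case."""
    return seq.translate(_COMPLEMENT)[::-1]


def looks_like_inversion(deleted: str, inserted: str) -> bool:
    """True when the inserted bases have the same length as the deleted
    bases and are their reverse complement (case is ignored)."""
    return (
        len(deleted) == len(inserted)
        and reverse_complement(deleted).upper() == inserted.upper()
    )


# Start of the lowercase segment of the INV record quoted in this thread.
lowercase_prefix = "aaaataaaaattttttaaattacagg"

# Under the assumption above, this string is what one would search for on
# the forward reference, near the far end of the reported interval.
print(reverse_complement(lowercase_prefix).upper())
```

To verify a call, one could extract the reference bases between the two reported breakpoints with any FASTA tool and compare them against the reverse complement of the lowercase segment.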

    • #3
      Thank you for the reply. It helped me to better annotate my SV information.

      I noticed the LI output does not have any information about which sample the read belongs to, such as the BP file.

      Is that intentional or is there a way to include that info?

      Thanks.

      • #4
        The LI and BP modules were added most recently and are still being modified. Because of the properties of these variants, it is hard to keep their summary line consistent with that of the other variant types. But we can certainly report the sample information.

        Kai
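
To illustrate how the LI summary line differs from the D/INV one, here is a small Python sketch that parses the LI header shown in the first post. The field meanings are an assumption: each coordinate is taken to be followed by the number of reads supporting that breakpoint, which agrees with the 4 and 3 reads displayed in that record but is not confirmed in this thread. `LIRecord`, `parse_li_header` and `breakpoint_offset` are hypothetical names.

```python
from typing import NamedTuple


class LIRecord(NamedTuple):
    index: int
    chrom: str
    first_bp: int       # first coordinate on the header line
    first_reads: int    # assumed: reads supporting the first breakpoint
    second_bp: int      # second coordinate on the header line
    second_reads: int   # assumed: reads supporting the second breakpoint


def parse_li_header(line: str) -> LIRecord:
    """Parse a header such as '12 LI ChrID 8 90973932 4 90973925 3'."""
    tokens = line.split()
    if len(tokens) != 8 or tokens[1] != "LI" or tokens[2] != "ChrID":
        raise ValueError(f"not an LI header in the assumed layout: {line!r}")
    return LIRecord(
        index=int(tokens[0]),
        chrom=tokens[3],
        first_bp=int(tokens[4]),
        first_reads=int(tokens[5]),
        second_bp=int(tokens[6]),
        second_reads=int(tokens[7]),
    )


def breakpoint_offset(record: LIRecord) -> int:
    """Distance between the two reported coordinates. This is not the
    length of the inserted sequence, which the LI record does not give."""
    return abs(record.first_bp - record.second_bp)


record = parse_li_header("12\tLI\tChrID 8\t90973932\t4\t90973925\t3")
print(record.chrom, breakpoint_offset(record))
```

The 7 nt seen in the first post is therefore the gap between the two breakpoints, which per the earlier reply may reflect target site duplication rather than the insert size.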

        • #5
          Are there any general criteria or cutoffs for assessing the results? I figure that, as a rule, the more reads mapped to a particular SV the better, but I'm unclear whether there is a recommended minimum number of unique reads. Maybe a SUM_MS cutoff as well?

          Thank you.

          • #6
            Indeed, the more supporting reads, the more confident the call. SUM_MS is also a good score, and the number of samples is an indication of frequency. You may wish to check the average coverage of your data and adjust the cutoff accordingly.
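
As a rough illustration of such filtering, the Python sketch below pulls the support counts and SUM_MS out of a D/INV-style summary line by looking up the labels visible in the records posted in this thread. It assumes the two numbers after `Supports` are total and unique supporting reads, and the cutoffs passed to `passes_cutoffs` are placeholders to be tuned to your coverage, not recommended values. `Call`, `parse_summary` and `passes_cutoffs` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Call:
    sv_type: str
    size: int
    chrom: str
    start: int
    end: int
    supports: int          # assumed: total supporting reads
    unique_supports: int   # assumed: unique supporting reads
    sum_ms: int
    num_samples: int


def parse_summary(line: str) -> Call:
    """Parse a D/INV-style summary line by label lookup, so the optional
    '+'/'-' strand markers do not shift the fields. LI lines use a
    different layout and raise ValueError here."""
    tokens = line.split()

    def after(label: str, offset: int = 1) -> str:
        return tokens[tokens.index(label) + offset]

    return Call(
        sv_type=tokens[1],
        size=int(tokens[2]),
        chrom=after("ChrID"),
        start=int(after("BP")),
        end=int(after("BP", 2)),
        supports=int(after("Supports")),
        unique_supports=int(after("Supports", 2)),
        sum_ms=int(after("SUM_MS")),
        num_samples=int(after("NumSupSamples")),
    )


def passes_cutoffs(call: Call, min_unique: int, min_sum_ms: int) -> bool:
    """Keep calls with enough unique reads and a high enough SUM_MS."""
    return call.unique_supports >= min_unique and call.sum_ms >= min_sum_ms


INV_HEADER = (
    '3 INV 218 NT 0:0 "":"" ChrID 5 BP 131938052 131938271 '
    "BP_range 131938052 131938271 Supports 5 3 + 1 1 - 4 2 "
    "S1 10 SUM_MS 145 1 NumSupSamples 1 1 3498_l5 1 1 4 2"
)
call = parse_summary(INV_HEADER)
print(call.sv_type, call.supports, call.unique_supports, call.sum_ms)
```

Looking fields up by label rather than by fixed position keeps the parser working for records where the strand markers are absent from the line.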

            • #7
              Hi, KaiYe:

              "3. For LI, long insertion, Pindel can find the breakpoints but cannot report the complete inserted sequence. "

              Is there any way to find the inserted sequence? Thanks.

              Jack

              • #8
                You may try the new assembly module in Pindel (the -z option). It can provide inserted sequences longer than the read length, but we haven't pushed it beyond 2 x read length.

                • #9
                  interpreting Pindel output correctly

                  I used Pindel to detect CNVs and got the following files:
                  pindelresult_BP
                  pindelresult_D
                  pindelresult_INT_final
                  pindelresult_LI
                  pindelresult_SI
                  pindelresult_CloseEndMapped
                  pindelresult_INT
                  pindelresult_INV
                  pindelresult_RP
                  pindelresult_TD
                  I would like to know why the files pindelresult_BP, pindelresult_CloseEndMapped, and pindelresult_LI are empty.
                  And how should I interpret the files pindelresult_INT_final, pindelresult_INT, pindelresult_INV, pindelresult_RP, and pindelresult_TD?
                  In pindelresult_TD and pindelresult_INV, is there any difference between upper case and lower case?
                  Thanks!

                  • #10
                    The switch between upper and lower case marks the junction.
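
One way to locate those junctions programmatically is to scan the reference line of a record for the positions where the letter case flips. The Python sketch below does only that; `case_junctions` is a hypothetical helper, and skipping non-letter characters (such as the run of spaces in the D records posted in this thread) is an assumption about how the line should be read.

```python
from typing import List, Optional


def case_junctions(ref_line: str) -> List[int]:
    """Return the 0-based offsets at which the letter case changes.

    Non-letter characters are skipped and do not count as a change.
    """
    junctions: List[int] = []
    previous: Optional[bool] = None
    for offset, base in enumerate(ref_line):
        if not base.isalpha():
            continue
        current = base.islower()
        if previous is not None and current != previous:
            junctions.append(offset)
        previous = current
    return junctions


# Toy reference line: uppercase flank, lowercase segment, uppercase flank.
print(case_junctions("ACGTacgtAC"))
```

Applied to the first reference line of the INV record in the first post, the single offset returned would be the column where the uppercase flank ends and the lowercase segment begins.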

                    • #11
                      Hello,
                      I have trouble understanding the Pindel output for deletions when the deletion is not 'pure'. For example:

                      Code:
                      279     D 24    NT 16 "GAAGAGAAGAGACAAG"        ChrID chr5      BP 145645795    145645820       BP_range 145645795      145645820       Supports 14     10      13    9       1     1       S1 28   SUM_MS 406      1       NumSupSamples 1 1       C0443 13 9 1 1
                      GAGCTTTGGGCCCAGGAATTCCCTGTTTCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACGGAAACACTTGCATCCACACACACACACA                CATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTTGCCCATTTCCCAGAGAGCTTTGTGAATAGTGAATTTGCATGTTAGCCAATTGCTGCT
                                                                             ATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTG           
                      -       145646019       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1315:2169:66627/1
                                                                                           AACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCACCCATTTC              
                      +       145645563       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2302:16529:90827/1
                                                                                          AAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCACCCATTT               
                      +       145645566       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2209:4054:76740/1
                                                                                    TAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCAC             
                      +       145645567       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2110:2653:6324/1
                                                                                    TAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCAC             
                      +       145645567       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1303:4839:52707/1
                                                                              TATCTATAAGGGTAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCT           
                      +       145645566       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2207:9082:15442/1
                                                                        GTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGATACAAATCAACAACTGGG         
                      +       145645546       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2216:14774:37713/1
                                                                     AGGGTTCCATATCTATAAGGGAAACAGAAACGCTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAA         
                      +       145645523       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1212:10023:72896/1
                                                               TGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCGAAAACTGAAACAAATCA          
                      +       145645565       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2216:5700:53182/1
                                                               TGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCA          
                      +       145645563       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1106:18885:45479/2
                                                    TAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACT             
                      +       145645502       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2311:2742:60334/1
                                                 TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                      +       145645537       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2214:20403:98401/1
                                                 TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                      +       145645537       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2207:18755:26282/1
                                                 TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                      +       145645568       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1110:5277:17743/1
                      ####################################################################################################
                      280     D 4     NT 3 "GAA"      ChrID chr5      BP 145806145    145806150       BP_range 145806145      145806150       Supports 3      2       1     1       2     1       S1 6    SUM_MS 180      1       NumSupSamples 1 1       C0443 1 1 2 1
                      GTGACATCAGTAAACAACAGTGCCATGTGAGTAAGGCCAAAGGATCTTGGTTTCTATCATAAATTCAAGCAAATTCAACAATATGAAACACCCCCTCACCA   TGGCTTGATTTAAAAATACACTCAGACAGTAGAAGCAGGAGCCTCAGAAATTCAAAGACAAAATTCAAAACTATATGAAATGTTTTAGACCTGCCTGAGAT
                                                                                     TTCAAGCAAATTCAACAATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACA            
                      -       145806372       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:2215:6540:70269/2
                                                                                     TTCAAGCAAATTCAACAATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACA            
                      -       145806372       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:1301:12004:59656/2
                                                                                                     AATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACAAAATTCAAAACCATAT            
                      +       145805780       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:2106:3364:91429/2
                      ####################################################################################################
                      281     D 26    NT 29 "GTAAGGGAAAGTAGAAAAGAACTTTGAAG"   ChrID chr5      BP 146261205    146261232       BP_range 146261205      146261232       Supports 11     9       8     7       3     2       S1 36   SUM_MS 319      1       NumSupSamples 1 1       C0443 8 7 3 2
                      GCTCTTCCTGGAGTCGGATTGCTTGGGAATGCAGCCCAAAGCGGGTGGTAAACTCCATCTAAGGCTAAATACCGGCACGAGACCGATAGTCAACAAGTACC                             AGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACTGGTGGGGTCCGCGCAGTCCGCCCGGAGGATTCAACCCGGCGGCGCGCGTCCGCCATGCCGG
                                                                                            ACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTAGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGG             
                      -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2303:4348:99716/2
                                                                                            ACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGAAAACGG             
                      -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1208:4133:85533/2
                                                                                              CGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGG            
                      -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2211:10099:97288/2
                                                                                                              GTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCC           
                      +       146260901       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1312:9595:100309/2
                                                                                                              GTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCC           
                      +       146260901       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1316:4495:14108/2
                                                                                                                  ACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCCGCCC               
                      +       146260904       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1115:10666:13720/2
                                                                                                       ACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCC            
                      +       146260891       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2104:11397:19144/2
                                                                                                               TCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCCG          
                      +       146261012       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1313:15194:20080/2
                                                                                                  ACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAAAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGG               
                      +       146260996       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2103:21213:55739/2
                                                                                                  ACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGA         
                      +       146260856       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1307:6044:75864/2
                                                                                      CTAAATACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGT           
                      +       146260998       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2112:20449:23425/
                      I understand which bases are inserted, but I can't see the deleted bases. Could somebody explain this for the three examples, please?

                      Best
                      Robby

                      • #12
                        Originally posted by Robby View Post
                        Hello,
                        I have a problem to understand the Pindel output for deletions, if the deletion is not 'pure'. For example:

                        PHP Code:
                        279     D 24    NT 16 "GAAGAGAAGAGACAAG"        ChrID chr5      BP 145645795    145645820       BP_range 145645795      145645820       Supports 14     10      13    9       1     1       S1 28   SUM_MS 406      1       NumSupSamples 1 1       C0443 13 9 1 1
                        GAGCTTTGGGCCCAGGAATTCCCTGTTTCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACGGAAACACTTGCATCCACACACACACACA                CATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTTGCCCATTTCCCAGAGAGCTTTGTGAATAGTGAATTTGCATGTTAGCCAATTGCTGCT
                                                                               ATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTG           
                        -       145646019       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1315:2169:66627/1
                                                                                             AACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCACCCATTTC              
                        +       145645563       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2302:16529:90827/1
                                                                                            AAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCACCCATTT               
                        +       145645566       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2209:4054:76740/1
                                                                                      TAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCAC             
                        +       145645567       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2110:2653:6324/1
                                                                                      TAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCTCCTCAC             
                        +       145645567       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1303:4839:52707/1
                                                                                TATCTATAAGGGTAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAACAACTGGGCTCCCT           
                        +       145645566       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2207:9082:15442/1
                                                                          GTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGATACAAATCAACAACTGGG         
                        +       145645546       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2216:14774:37713/1
                                                                       AGGGTTCCATATCTATAAGGGAAACAGAAACGCTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCAA         
                        +       145645523       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1212:10023:72896/1
                                                                 TGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCGAAAACTGAAACAAATCA          
                        +       145645565       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2216:5700:53182/1
                                                                 TGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACTGAAACAAATCA          
                        +       145645563       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1106:18885:45479/2
                                                      TAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAAACT             
                        +       145645502       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2311:2742:60334/1
                                                   TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                        +       145645537       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2214:20403:98401/1
                                                   TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                        +       145645537       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2207:18755:26282/1
                                                   TCTTAAAAAGTCCTTGTGTAAGGGTTCCATATCTATAAGGGAAACAGAAACACTTGCATCCACACACACACACAGAAGAGAAGAGACAAGCATCATCAAAA                
                        +       145645568       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1110:5277:17743/1
                        ####################################################################################################
                        280     D 4     NT 3 "GAA"      ChrID chr5      BP 145806145    145806150       BP_range 145806145      145806150       Supports 3      2       1     1       2     1       S1 6    SUM_MS 180      1       NumSupSamples 1 1       C0443 1 1 2 1
                        GTGACATCAGTAAACAACAGTGCCATGTGAGTAAGGCCAAAGGATCTTGGTTTCTATCATAAATTCAAGCAAATTCAACAATATGAAACACCCCCTCACCA   TGGCTTGATTTAAAAATACACTCAGACAGTAGAAGCAGGAGCCTCAGAAATTCAAAGACAAAATTCAAAACTATATGAAATGTTTTAGACCTGCCTGAGAT
                                                                                       TTCAAGCAAATTCAACAATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACA            
                        -       145806372       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:2215:6540:70269/2
                                                                                       TTCAAGCAAATTCAACAATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACA            
                        -       145806372       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:1301:12004:59656/2
                                                                                                       AATATGAAACACCCCCTCACCAGAATGGCTTGATTTAAAAATACACTCAGACAGTAGAAACAGGAGTCTCAGAAATTCAAAGACAAAATTCAAAACCATAT            
                        +       145805780       60      C0443   @HWI-ST778:145:C1RF5ACXX:7:2106:3364:91429/2
                        ####################################################################################################
                        281     D 26    NT 29 "GTAAGGGAAAGTAGAAAAGAACTTTGAAG"   ChrID chr5      BP 146261205    146261232       BP_range 146261205      146261232       Supports 11     9       8     7       3     2       S1 36   SUM_MS 319      1       NumSupSamples 1 1       C0443 8 7 3 2
                        GCTCTTCCTGGAGTCGGATTGCTTGGGAATGCAGCCCAAAGCGGGTGGTAAACTCCATCTAAGGCTAAATACCGGCACGAGACCGATAGTCAACAAGTACC                             AGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACTGGTGGGGTCCGCGCAGTCCGCCCGGAGGATTCAACCCGGCGGCGCGCGTCCGCCATGCCGG
                                                                                              ACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTAGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGG             
                        -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2303:4348:99716/2
                                                                                              ACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGAAAACGG             
                        -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1208:4133:85533/2
                                                                                                CGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGG            
                        -       146261560       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2211:10099:97288/2
                                                                                                                GTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCC           
                        +       146260901       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1312:9595:100309/2
                                                                                                                GTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCC           
                        +       146260901       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1316:4495:14108/2
                                                                                                                    ACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCCGCCC               
                        +       146260904       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1115:10666:13720/2
                                                                                                         ACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCC            
                        +       146260891       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2104:11397:19144/2
                                                                                                                 TCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGGTCCGCGCAGTCCG          
                        +       146261012       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1313:15194:20080/2
                                                                                                    ACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAAAGTTCAAGAGGGCGTGAAACCGTTAAGAGGTAAACGGGTGGGG               
                        +       146260996       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2103:21213:55739/2
                                                                                                    ACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGA         
                        +       146260856       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:1307:6044:75864/2
                                                                                        CTAAATACCGGCACGAGACCGATAGTCAACAAGTACCGTAAGGGAAAGTTGAAAAGAACTTTGAAGAGAGAGTTCAAGAGGGCGTGAAACCGTTAAGAGGT           
                        +       146260998       29      C0443   @HWI-ST778:145:C1RF5ACXX:7:2112:20449:23425/
                        I understand which bases are inserted, but I can't see the deleted bases. Could somebody explain that for the three examples, please?

                        Best
                        Robby
                        If you convert the output file to VCF, the ref and alt alleles will be reported. Indeed, the deleted sequence is not displayed in the raw output, but that is how I designed it.
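
For anyone who wants to see the deleted bases without going through the VCF conversion, here is a minimal Python sketch that reads a deletion header line from the raw output and works out which reference positions are deleted. It assumes the two `BP` values are the bases flanking the event, which is consistent with the records above (145806150 - 145806145 - 1 = 4 and 146261232 - 146261205 - 1 = 26). The function names and the idea of passing the chromosome as a plain string are my own assumptions for illustration, not part of Pindel. The regular expression only handles lines with a single `NT` field, so it will not match the `INV` records. The REF/ALT pair follows the usual VCF convention of prefixing the base before the event and may not match the converter's output exactly.

```python
import re

# Matches the start of a Pindel deletion/insertion header line, e.g.
# 280  D 4  NT 3 "GAA"  ChrID chr5  BP 145806145  145806150 ...
HEADER_RE = re.compile(
    r'^\s*(?P<index>\d+)\s+(?P<sv_type>[A-Z]+)\s+(?P<sv_len>\d+)\s+'
    r'NT\s+(?P<nt_len>\d+)\s+"(?P<nt_seq>[ACGTNacgtn]*)"\s+'
    r'ChrID\s+(?P<chrom>\S+)\s+'
    r'BP\s+(?P<bp_start>\d+)\s+(?P<bp_end>\d+)'
)


def parse_pindel_header(line):
    """Return the leading fields of a header line as a dict."""
    match = HEADER_RE.match(line)
    if match is None:
        raise ValueError("not a single-NT Pindel header line: %r" % line)
    rec = match.groupdict()
    for key in ("index", "sv_len", "nt_len", "bp_start", "bp_end"):
        rec[key] = int(rec[key])
    return rec


def deleted_interval(rec):
    """1-based, inclusive reference interval removed by the event.

    Assumes bp_start and bp_end are the bases flanking the deletion.
    """
    return rec["chrom"], rec["bp_start"] + 1, rec["bp_end"] - 1


def deleted_bases(rec, chrom_seq):
    """Slice the deleted bases out of chrom_seq (the whole chromosome as a str)."""
    _, first, last = deleted_interval(rec)
    return chrom_seq[first - 1:last]  # convert 1-based inclusive to a Python slice


def vcf_style_alleles(rec, chrom_seq):
    """REF/ALT in the usual VCF style: the base before the event is prefixed."""
    anchor = chrom_seq[rec["bp_start"] - 1]
    return anchor + deleted_bases(rec, chrom_seq), anchor + rec["nt_seq"]


# Header of record 280 from the output above (trailing fields omitted).
rec_280 = parse_pindel_header(
    '280     D 4     NT 3 "GAA"      ChrID chr5      BP 145806145    145806150'
)
print(deleted_interval(rec_280))

# Toy reference, so the slicing can be checked without a genome file.
toy_ref = "AAAACGTTTTT"
toy_rec = parse_pindel_header('1 D 3 NT 0 "" ChrID toy BP 4 8')
print(deleted_bases(toy_rec, toy_ref), vcf_style_alleles(toy_rec, toy_ref))
```

For record 280 this gives positions 145806146 to 145806149 on chr5 as the deleted bases, with "GAA" inserted in their place. To get the actual letters you would load the chr5 sequence from your reference FASTA and pass it as `chrom_seq`.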
