SEQanswers


frymor 01-21-2013 06:49 AM

understanding pindel output
 
Hi everybody,

I have been trying to understand the different files I get when running Pindel with the following command:
Code:

pindel -f Mus_musculus.NCBIM37.66.dna.fa -i bwa_trimmedData.txt -c ALL -o rimmedData_default
I get seven output files: BP, INV, D, LI, SI, TD and CloseEndMapped. Some of them are empty, some are not.
To understand the results, I compared my output with the example on the Pindel web site, but I could not match the two.

Here is an example from the LI file:
Code:

########################################################
0        LI        ChrID MT1        790        + 4        791        - 4        A_bwa_trimmed75_default + 3 - 3        G_default + 1 - 1
GTATTAAAGTAAGCAAAAGAATCAAACATAAAAACGTTAGGTCAAGGTGTAGCCAATGAAATGGGAAGAAATGGGCTACAttttcttataaaagaacattactataccctttatgaaactaaaggactaaggaggatttagtagtaaattaagaatagag
                                                                      ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATACCTTTTTAGGGGTTGCTGAAGATG        +        22        60        A_default        @HWI-ST863:138:D0WT7ACXX:5:1210:19323:84582/2
                                                                      ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATACCTTTTTAGGGGTTGCTGAAGATG        +        106        60        A_default        @HWI-ST863:138:D0WT7ACXX:5:1308:13343:59960/2
                                                                      ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATATCTTTTTAGGGTTTGCTGAAGATG        +        145        60        A_default        @HWI-ST863:138:D0WT7ACXX:5:2108:5597:62957/1
                                                                      ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTNTAATACCTTTTTAGGGTTTGCTGAAGATG        +        69        60        G_default        @HWI-ST863:138:D0WT7ACXX:5:2209:8563:77307/2
--------------------------------------------------------
gtattaaagtaagcaaaagaatcaaacataaaaacgttaggtcaaggtgtagccaatgaaatgggaagaaatgggctacaTTTTCTTATAAAAGAACATTACTATACCCTTTATGAAACTAAAGGACTAAGGAGGATTTAGTAGTAAATTAAGAATAGAG
          CTAGATGGATATAAAGTACCGCCAAGTCCTTTGAGTTTTAAGCTATGGCTAGTAGTTCTCTGGCAAATAGTTTTGTTATA        -        1219        60        A_default        @HWI-ST863:138:D0WT7ACXX:5:2316:6189:1992/2
                                          CAAGGGGGAGCCAATGAAAGGAGAAGGATTATGCTAGATTTTCTTATAAAAGGACATTACTATACCATTTATGAAACTAA        -        1573        29        A_bwa_trimmed75_default        @HWI-ST863:138:D0WT7ACXX:5:2316:16450:36161/2
        TATATTGTTTATTACCATGTATATCTTTTCTTTTTTTTGTTATAATCTAATCTTTTTTTTTTTTTTTTTTTTTTTTTTAT        -        2175        29        A_default        @HWI-ST863:138:D0WT7ACXX:5:1206:12241:67098/2
          TTTTTTTTTTTTTTGTTTTTATTTCTAAAAAATAATTTTTCATATAAATTTTGTTTTTTATTTTTTTTTTTTTTTTTATA        -        1171        29        G_default        @HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569/2
########################################################

I would like to know how to understand this list.
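
As a starting point for reading the header line of the LI record above, here is a minimal Python sketch that splits it into labelled fields. The field names (`left_pos`, `plus_reads`, and so on) and their meanings are assumptions inferred from this one example, not taken from the Pindel documentation. The function `parse_li_header` is hypothetical and not part of Pindel.

```python
# Sketch only: field meanings are inferred from the example in this thread,
# not from an official Pindel format description.

HEADER = ("0 LI ChrID MT1 790 + 4 791 - 4 "
          "A_bwa_trimmed75_default + 3 - 3 G_default + 1 - 1")


def parse_li_header(line):
    """Split an LI header line into a dict of labelled fields."""
    tok = line.split()
    record = {
        "index": int(tok[0]),
        "sv_type": tok[1],
        "chrom": tok[3],             # tok[2] is the literal "ChrID"
        "left_pos": int(tok[4]),
        "plus_reads": int(tok[6]),   # tok[5] is the literal "+"
        "right_pos": int(tok[7]),
        "minus_reads": int(tok[9]),  # tok[8] is the literal "-"
        "samples": {},
    }
    # The remaining tokens appear to come in groups of five:
    # sample name, "+", count, "-", count.
    rest = tok[10:]
    for i in range(0, len(rest), 5):
        chunk = rest[i:i + 5]
        if len(chunk) < 5:
            break
        record["samples"][chunk[0]] = {
            "plus": int(chunk[2]),
            "minus": int(chunk[4]),
        }
    return record


rec = parse_li_header(HEADER)
for key, value in rec.items():
    print(key, value)
```

In this example the per-sample counts add up to the totals in the header (3 + 1 = 4 on each strand), and each of the two blocks below the header lists four reads, which is consistent with reading those numbers as supporting-read counts.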
Somehow the results from Pindel don't match the BAM file.
Here is the record from the BAM file for the last read listed above:
Code:

samtools view G.bam | grep "HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569"
HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569      89      MT      582    50      90M    *      0      0      CTCAAAGGACTTGGCGGTACTTTATATCCATCTAGAGGAGCCTGTTCTATAATCGATAAACCCCGCTCTACCTCACCATCTCTTGCTAAT      BCDCCDDDCBBB>B@DDCBDEDEDDDCDDDDEDEEEDFFFFHGHHHJJJIJJIJJJIHA?JJJJJJJJHGJJIHFJIHFHHHFFFFFCCB      AS:i:0      XN:i:0  XM:i:0  XO:i:0  XG:i:0  NM:i:0  MD:Z:90 YT:Z:UU NH:i:1

The mapping qualities differ between the two outputs, and the positions are not the same. I find it quite difficult to turn the results into an understandable list of reads.
Another thing I don't understand is the difference in the sequence. What do the upper-case letters stand for, and how do they differ from the lower-case ones?
Where can I see the beginning of my insertion?
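
One detail that may account for part of the mismatch is the FLAG field (89) of the BAM record above. The sketch below decodes it using the bit meanings from the SAM specification; `decode_flag` is a hypothetical helper written for this illustration.

```python
# Bit meanings of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the descriptions of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


for name in decode_flag(89):
    print(name)
```

Flag 89 is 64 + 16 + 8 + 1, so the BAM record is the first read of the pair, mapped to the reverse strand, with an unmapped mate. The Pindel line for the same read name ends in /2, so it may be describing the other mate rather than this alignment. That is only an interpretation of the two outputs shown here, not something confirmed by the Pindel documentation.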

Is there a better manual for this software?
Is there a way to visualize the results, so that I can see the reads as an alignment?

Thanks
Assa

KaiYe 01-22-2013 02:44 PM

Indeed, the Pindel documentation needs more work. Sorry for the inconvenience. Jirapong helped me with the text, and I still need to update the wiki site. I will update it this week and let you know when I finish.

Kai

