frymor |
01-21-2013 06:49 AM |
understanding pindel output
Hi everybody,
I really tried to understand the different files I get when running pindel with the following command:
Code:
pindel -f Mus_musculus.NCBIM37.66.dna.fa -i bwa_trimmedData.txt -c ALL -o rimmedData_default
I get seven different output files: BP, INV, D, LI, SI, TD and CloseEndMapped. Some of them are empty, some are not.
When I tried to understand the results, I compared my output to the example on the pindel web site. Unfortunately, that did not help me make sense of it.
Here is an example from the LI file:
Code:
########################################################
0 LI ChrID MT1 790 + 4 791 - 4 A_bwa_trimmed75_default + 3 - 3 G_default + 1 - 1
GTATTAAAGTAAGCAAAAGAATCAAACATAAAAACGTTAGGTCAAGGTGTAGCCAATGAAATGGGAAGAAATGGGCTACAttttcttataaaagaacattactataccctttatgaaactaaaggactaaggaggatttagtagtaaattaagaatagag
ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATACCTTTTTAGGGGTTGCTGAAGATG + 22 60 A_default @HWI-ST863:138:D0WT7ACXX:5:1210:19323:84582/2
ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATACCTTTTTAGGGGTTGCTGAAGATG + 106 60 A_default @HWI-ST863:138:D0WT7ACXX:5:1308:13343:59960/2
ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTTTAATATCTTTTTAGGGTTTGCTGAAGATG + 145 60 A_default @HWI-ST863:138:D0WT7ACXX:5:2108:5597:62957/1
ATTGGCTACACCTTGACCTAACGTTTTTATGTTTGATTCTTTTGCTTACTNTAATACCTTTTTAGGGTTTGCTGAAGATG + 69 60 G_default @HWI-ST863:138:D0WT7ACXX:5:2209:8563:77307/2
--------------------------------------------------------
gtattaaagtaagcaaaagaatcaaacataaaaacgttaggtcaaggtgtagccaatgaaatgggaagaaatgggctacaTTTTCTTATAAAAGAACATTACTATACCCTTTATGAAACTAAAGGACTAAGGAGGATTTAGTAGTAAATTAAGAATAGAG
CTAGATGGATATAAAGTACCGCCAAGTCCTTTGAGTTTTAAGCTATGGCTAGTAGTTCTCTGGCAAATAGTTTTGTTATA - 1219 60 A_default @HWI-ST863:138:D0WT7ACXX:5:2316:6189:1992/2
CAAGGGGGAGCCAATGAAAGGAGAAGGATTATGCTAGATTTTCTTATAAAAGGACATTACTATACCATTTATGAAACTAA - 1573 29 A_bwa_trimmed75_default @HWI-ST863:138:D0WT7ACXX:5:2316:16450:36161/2
TATATTGTTTATTACCATGTATATCTTTTCTTTTTTTTGTTATAATCTAATCTTTTTTTTTTTTTTTTTTTTTTTTTTAT - 2175 29 A_default @HWI-ST863:138:D0WT7ACXX:5:1206:12241:67098/2
TTTTTTTTTTTTTTGTTTTTATTTCTAAAAAATAATTTTTCATATAAATTTTGTTTTTTATTTTTTTTTTTTTTTTTATA - 1171 29 G_default @HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569/2
########################################################
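To make the header line of the record above easier to read, here is a minimal Python sketch that splits it into labeled fields. The field names (`left_pos`, `left_support` and so on) are my own guesses and are not taken from the pindel documentation. The only support for them is that the counts after `+` and `-` match the number of read lines listed in each block of this record.

```python
def parse_li_header(line):
    """Split a pindel LI header line into labeled fields.

    The field meanings are assumptions inferred from the example record.
    """
    tok = line.split()
    record = {
        "index": int(tok[0]),
        "type": tok[1],
        "chrom": tok[3],              # token after the literal "ChrID", kept verbatim
        "left_pos": int(tok[4]),      # assumed: position followed by '+'
        "left_support": int(tok[6]),  # assumed: number of '+' reads
        "right_pos": int(tok[7]),     # assumed: position followed by '-'
        "right_support": int(tok[9]), # assumed: number of '-' reads
        "samples": {},
    }
    # Remaining tokens come in groups of five: name, '+', count, '-', count
    rest = tok[10:]
    for i in range(0, len(rest) - 4, 5):
        name, _plus, n_plus, _minus, n_minus = rest[i:i + 5]
        record["samples"][name] = {"+": int(n_plus), "-": int(n_minus)}
    return record


header = ("0 LI ChrID MT1 790 + 4 791 - 4 "
          "A_bwa_trimmed75_default + 3 - 3 G_default + 1 - 1")
rec = parse_li_header(header)
for key, value in rec.items():
    print(key, value)
```

If this reading is right, the record describes one event with four supporting reads on each side, split 3/1 between the two samples named in the header.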
I would like to know how to read this list.
Somehow the results from pindel don't match the BAM file.
Here is the BAM entry for the last read in the list above:
Code:
samtools view G.bam | grep "HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569"
HWI-ST863:138:D0WT7ACXX:5:2304:2584:12569 89 MT 582 50 90M * 0 0 CTCAAAGGACTTGGCGGTACTTTATATCCATCTAGAGGAGCCTGTTCTATAATCGATAAACCCCGCTCTACCTCACCATCTCTTGCTAAT BCDCCDDDCBBB>B@DDCBDEDEDDDCDDDDEDEEEDFFFFHGHHHJJJIJJIJJJIHA?JJJJJJJJHGJJIHFJIHFHHHFFFFFCCB AS:i:0 XN:i:0 XM:i:0 XO:i:0 XG:i:0 NM:i:0 MD:Z:90 YT:Z:UU NH:i:1
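One thing worth checking in the BAM entry above is the FLAG field (the second column, 89 here). The sketch below decodes it using the bit definitions from the SAM specification. The interpretation that follows is only a possibility, not something confirmed by the pindel manual.

```python
# Bit definitions from the SAM format specification
SAM_FLAGS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def decode_flag(flag):
    """Return the names of all SAM flag bits set in `flag`."""
    return [name for bit, name in SAM_FLAGS.items() if flag & bit]


print(decode_flag(89))
# ['read paired', 'mate unmapped', 'read on reverse strand', 'first in pair']
```

So the BAM entry is the first read of the pair, and its mate is flagged as unmapped, while the read name in the pindel output ends in /2. If pindel is listing the second mate, that might explain why the sequence and position differ from the BAM entry, but this is a guess.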
The mapping qualities differ between the two results, and the positions are not the same. I find it quite difficult to translate the pindel output into an understandable list of reads.
Another thing I don't understand is the difference in letter case within the sequences. What do the upper-case letters stand for, and how do they differ from the lower-case ones?
Where can I see the beginning of my insertion?
Is there a better manual for this software?
Is there a way to visualize the results, so that I can see the reads as an alignment?
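As a starting point for the upper/lower case question, here is a small Python sketch that reports where the letter case switches in a sequence line. It makes no assumption about what the case means in pindel. It only finds the switch so that it can be compared with the positions in the header line. The input string is a shortened excerpt of the first sequence line of the LI record.

```python
def case_boundaries(seq):
    """Return the 0-based indices at which the letter case switches."""
    return [i for i in range(1, len(seq))
            if seq[i].islower() != seq[i - 1].islower()]


ref_excerpt = "GGGCTACAttttctta"  # shortened excerpt of the first sequence line
print(case_boundaries(ref_excerpt))  # [8]
```

Running it on the full line instead of the excerpt gives the offset of the switch within the whole sequence.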
Thanks
Assa