Hi, I am running Picard's MarkDuplicates.jar to remove PCR duplicates
from a BAM file produced by TopHat (with fusion-search on), and I encountered:
Exception in thread "main" net.sf.picard.PicardException: Value was put into PairInfoMap more than once.
I extracted the problematic alignments from the BAM file and got:
HISEQ700708:110:C0TKRACXX:7:1301:12779:136004 99 chr1 24288534 50 71M chr10 112724798 0 TAATGATTGTACTTGATATTAAGTGTTCTCAACGGAGTAACTTTTAAGTGGAAACCAAGTTTAGATTTGGG *++++++++'+)+++++++++++++)+++))+++)++**()++++++++++++*++++*****(((((&&& AS:i:-3 XM:i:1 XO:i:0 XG:i:0 MD:Z:37A62 NM:i:1 XF:Z:1 chr1-chr1 24288604 71m118320545F29M CCCAAATCTAAACTTGGTTTCCACTTAAAAGTTACTCCGTTGAGAACACTTAATATCAAGTACAATCATTAAAGTTGCACTGATACTACCATTATATCAC &&&(((((*****++++*++++++++++++)(**++)+++))+++)+++++++++++++)+'++++++++*+)))&')**(***(((((('&&'''('&& XP:Z:chr10 112724798 22M35353N78M NH:i:1
HISEQ700708:110:C0TKRACXX:7:1301:12779:136004 99 chr1 118320545 50 29M chr10 112724798 0 AAGTTGCACTGATACTACCATTATATCAC +)))&')**(***(((((('&&'''('&& AS:i:-3 XM:i:1 XO:i:0 XG:i:0 MD:Z:37A62 NM:i:1 XF:Z:2 chr1-chr1 24288604 71m118320545F29M CCCAAATCTAAACTTGGTTTCCACTTAAAAGTTACTCCGTTGAGAACACTTAATATCAAGTACAATCATTAAAGTTGCACTGATACTACCATTATATCAC &&&(((((*****++++*++++++++++++)(**++)+++))+++)+++++++++++++)+'++++++++*+)))&')**(***(((((('&&'''('&& XP:Z:chr10 112724798 22M35353N78M NH:i:1
HISEQ700708:110:C0TKRACXX:7:1301:12779:136004 147 chr10 112724798 50 22M35353N78M chr1 24288604 0 TGACAACTACCTGCTGAAATTGGAAACCTGTCCAGTTTAAGTCGTCTTGGTCTGAGATATAACAGACTGTCAGCAATACCCAGATCATTAGCAAAATGCA &&&&"!$#!&'''''''%'('&&****(++*)')++')++(+*'+)++))*'&'+)+++**)*')*+++++*+++)*&++++*++++**))*(((((&&& XA:i:2 MD:Z:0A0A98 NM:i:2 XS:A:+ XP:Z:chr1-chr1 24288604 71m118320544F29M NH:i:1
Yes, one end is reported twice (both records have flag 99) because of the split alignment. How should I deal with this, or is it a bug?
Thanks in advance for any discussion.
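In case it helps anyone check their own files for the same pattern, here is a minimal sketch that lists read names where the same end shows up in more than one record. It assumes the alignments have been dumped to SAM text (for example with `samtools view`), and it ignores records flagged as secondary or supplementary, which is my own assumption about what should count. The function name `find_repeated_ends` is made up for illustration; it is not part of Picard or TopHat.

```python
from collections import Counter


def find_repeated_ends(sam_lines):
    """Return sorted read names whose first or second mate occurs in >1 record.

    Hypothetical helper: works on SAM text lines, not on BAM directly.
    """
    counts = Counter()
    for line in sam_lines:
        # Skip blank lines and SAM header lines.
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.split()
        name, flag = fields[0], int(fields[1])
        # Only paired reads can collide as "the same end of a pair".
        if not flag & 0x1:
            continue
        # Assumption: records marked secondary (0x100) or supplementary
        # (0x800) are expected to repeat, so they are not counted.
        if flag & 0x900:
            continue
        # 0x40 = first in pair, 0x80 = second in pair.
        end = 1 if flag & 0x40 else 2 if flag & 0x80 else 0
        counts[(name, end)] += 1
    return sorted({name for (name, _end), n in counts.items() if n > 1})
```

Passing it the three records above should return the single read name HISEQ700708:110:C0TKRACXX:7:1301:12779:136004, since two of them carry flag 99 (first in pair) and neither is marked secondary.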