Similar Threads
Thread | Thread Starter | Forum | Replies | Last Post
How to use Picard's MarkDuplicates | cliff | Bioinformatics | 12 | 01-26-2015 11:56 PM
Picard MarkDuplicates error for RNA-Seq | RockChalkJayhawk | Bioinformatics | 6 | 07-11-2012 03:07 PM
Error "RG ID on SAMRecord not found in header" from Picard's MarkDuplicates.jar | cliff | Bioinformatics | 4 | 11-10-2011 04:27 AM
MarkDuplicates in picard | bair | Bioinformatics | 3 | 12-23-2010 12:00 PM
Picard MarkDuplicates | wangzkai | Bioinformatics | 2 | 05-18-2010 10:14 PM
#1
Member
Location: long island Join Date: May 2011
Posts: 22
Dear All
I am still on the learning curve with the GATK tools, and I ran into an error at the duplicate-marking step with Picard. My procedure was the following: I generated a BAM file for each sample with TopHat 1.33, reordered each BAM file (one per sample) against the hg19 reference genome with Picard ReorderSam.jar, and then added read-group information with Picard AddOrReplaceReadGroups.jar. Finally, I tried to remove duplicate pairs with Picard MarkDuplicates.jar. However, that step failed and the duplicate-removed files were never produced. The errors I received are like the following: Quote:
I read the log carefully but cannot figure out the source of the error. What does "Value was put into PairInfoMap more than once" mean here? Can you help me resolve this problem? Thanks a lot.
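For reference, here is a minimal sketch of the command sequence described above, using the old Picard 1.x .jar-style syntax; every file name, path and read-group value below is a placeholder, not taken from the original post.
Code:
# reorder the TopHat BAM to match the hg19 reference dictionary
java -jar ReorderSam.jar INPUT=sample.bam OUTPUT=sample.reordered.bam REFERENCE=hg19.fa

# add read-group information
java -jar AddOrReplaceReadGroups.jar INPUT=sample.reordered.bam OUTPUT=sample.rg.bam \
    RGID=sample1 RGLB=lib1 RGPL=illumina RGPU=unit1 RGSM=sample1 SORT_ORDER=coordinate

# mark (and remove) duplicate read pairs
java -jar MarkDuplicates.jar INPUT=sample.rg.bam OUTPUT=sample.dedup.bam \
    METRICS_FILE=sample.dedup.metrics REMOVE_DUPLICATES=true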
#2
Junior Member
Location: Copenhagen Join Date: Oct 2010
Posts: 4
Hello,
I had the same issue. Does anyone have any clues about this? Thanks,
#3
Junior Member
Location: Copenhagen Join Date: Oct 2010
Posts: 4
I am answering myself: it was due to fake reads mapped with BWA, such as:
Code:
(null)  73  chr21  48313514  25  0M  =      48313514  0  *  *  XT:A:U  NM:i:0  SM:i:25  AM:i:0   X0:i:1  X1:i:0  XM:i:1  XO:i:0  XG:i:0  MD:Z:0
(null)  73  chr21  48313514  25  0M  =      48313514  0  *  *  XT:A:U  NM:i:0  SM:i:25  AM:i:0   X0:i:1  X1:i:0  XM:i:1  XO:i:0  XG:i:0  MD:Z:0
(null)  65  chr21  48313514  25  0M  chr18  18626503  0  *  *  XT:A:U  NM:i:0  SM:i:25  AM:i:25  X0:i:1  X1:i:0  XM:i:1  XO:i:0  XG:i:0  MD:Z:0
The read name (null) was found more than twice, so MarkDuplicates complained. By raising the mapping-quality threshold to 26 we can get rid of them, or by using samtools view -f 0x2, since they are not properly paired.
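Concretely, two hedged ways to apply those filters before MarkDuplicates (input and output file names are placeholders):
Code:
# keep only alignments with mapping quality >= 26
samtools view -b -q 26 input.bam > filtered.bam

# or keep only properly paired reads (FLAG bit 0x2 set)
samtools view -b -f 0x2 input.bam > properly_paired.bam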
#4
Member
Location: Pittsburgh Join Date: Aug 2011
Posts: 72
I am running into the same error with Picard MarkDuplicates. My alignment was done with Bowtie 2. I have run this script before on different data sets and didn't see this error. Since you figured out what was wrong with your data, I was hoping you could let me know how you did that. Here's the error I get:
Code:
Exception in thread "main" net.sf.picard.PicardException: Value was put into PairInfoMap more than once. 1: L3:MWR-PRG-0014:74:C0E94ACXX:3:1206:11809:158670
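One hedged way to see which read names occur more than twice in the input (a diagnostic sketch, not from the thread; the file name is a placeholder) is to count primary alignments per read name:
Code:
# list the most frequent read names, excluding secondary (0x100) and supplementary (0x800) records
samtools view -F 0x900 input.bam | cut -f1 | sort | uniq -c | sort -rn | head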
#5
Senior Member
Location: USA Join Date: Apr 2010
Posts: 102
Hi ginolhac,
I encountered the same problem when I tried to use the MarkDuplicates command, and when I looked at the problematic reads I found that their mapping quality was above 25. How do we remove those reads in that case? Thanks in advance for your help.
#6
Junior Member
Location: Copenhagen Join Date: Oct 2010
Posts: 4
Hey,
actually the issue came from FASTQ files that were not in sync: some reads were missing at the end of one of the files. That explains the reads with a (null) name. To remove them, I used:
Code:
# stream the BAM with its header, drop every line containing "null", and rewrite as BAM
samtools view -h file.bam | grep -v null | samtools view -bS - > file_clean.bam
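A slightly more targeted variant (my own assumption, not part of the original reply) drops only records whose read name is literally (null), so header lines and reads that merely contain the string "null" elsewhere are left untouched:
Code:
# keep header lines (starting with @) and any record whose QNAME is not "(null)"
samtools view -h file.bam | awk '$1 ~ /^@/ || $1 != "(null)"' | samtools view -bS - > file_clean.bam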
#7
Junior Member
Location: Perth, Western Australia Join Date: Mar 2013
Posts: 6
I encountered the same error using Picard MarkDuplicates, and it was related to the alignment I had done with BWA (BWA-MEM).
I had failed to use the -M option when running the alignment, which enables compatibility with Picard's MarkDuplicates. I went back, re-ran the alignment with that option, and it fixed the error. From the BWA manual:
-M    Mark shorter split hits as secondary (for Picard compatibility).
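As a minimal sketch of that re-run, assuming paired-end FASTQ input (the reference, read files and read-group string below are placeholders):
Code:
# align with -M so shorter split hits are flagged as secondary instead of supplementary
bwa mem -M -R '@RG\tID:sample1\tSM:sample1\tPL:illumina' hg19.fa reads_1.fastq reads_2.fastq > aln.sam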
#8
Member
Location: Philadelphia Join Date: Jan 2012
Posts: 58
I have been struggling with this issue. I have sample data merged from multiple Illumina paired-end runs. While looking for other information/solutions, it was suggested to modify the read-group ID to include lane or run identification and then re-merge.
I have done that, but I still receive this error. Has anyone been able to resolve it? I could try to remove the offending read, but I'm concerned there will be many more after it.
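For what it's worth, a hedged sketch of the per-lane read-group approach before merging (every ID and file name below is a made-up placeholder):
Code:
# give each lane's BAM a distinct read-group ID before merging
java -jar AddOrReplaceReadGroups.jar INPUT=sample_lane4.bam OUTPUT=sample_lane4.rg.bam \
    RGID=sample1.FLOWCELL1.4 RGLB=lib1 RGPL=illumina RGPU=FLOWCELL1.4 RGSM=sample1

java -jar MergeSamFiles.jar INPUT=sample_lane4.rg.bam INPUT=sample_lane5.rg.bam OUTPUT=sample.merged.bam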
#9
Senior Member
Location: St. Louis Join Date: Dec 2010
Posts: 535
Quote:
#10
Member
Location: Philadelphia Join Date: Jan 2012
Posts: 58
Ah I am having issues with:
Code:
Exception in thread "main" net.sf.picard.PicardException: Value was put into PairInfoMap more than once. 1: E0005-FGC0298:HWI-ST970:298:C0MUAACXX:4:1201:13786:41745
        at net.sf.picard.sam.CoordinateSortedPairInfoMap.ensureSequenceLoaded(CoordinateSortedPairInfoMap.java:124)
        at net.sf.picard.sam.CoordinateSortedPairInfoMap.remove(CoordinateSortedPairInfoMap.java:78)
        at net.sf.picard.sam.DiskReadEndsMap.remove(DiskReadEndsMap.java:61)
        at net.sf.picard.sam.MarkDuplicates.buildSortedReadEndLists(MarkDuplicates.java:418)
        at net.sf.picard.sam.MarkDuplicates.doWork(MarkDuplicates.java:161)
        at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:177)
        at net.sf.picard.sam.MarkDuplicates.main(MarkDuplicates.java:145)

EDIT: There must be something greater at work here. I cannot even run ValidateSamFile without running into this error...

Last edited by bwubb; 07-10-2013 at 10:02 AM.
#11
Member
Location: Barcelona Join Date: Feb 2012
Posts: 49
Hi All,
I have the same problem with BWA-MEM. I used the -M option but I still get:
Code:
.PicardException: Value was put into PairInfoMap more than once. 1: null:M00840:39:000000000-A5TE9:1:2103:11538:25521
I tried the cleanup suggested earlier in the thread:
Code:
samtools view -h before.bam | grep -v null | samtools view -bS - > cleaned.bam
With BWA aln everything is OK, but it's not recommended for my data since the reads are ~251 bases long. Did anyone solve this problem?
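Given the out-of-sync FASTQ explanation earlier in the thread, one hedged sanity check (my suggestion, not from this post; file names are placeholders) is simply to compare the read counts of the two mate files:
Code:
# both numbers should match; each FASTQ record is 4 lines
echo $(( $(zcat reads_1.fastq.gz | wc -l) / 4 )) $(( $(zcat reads_2.fastq.gz | wc -l) / 4 ))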
#12
Junior Member
Location: Groningen Join Date: Sep 2014
Posts: 2
Quote:
#13
Senior Member
Location: Ottawa Join Date: Apr 2011
Posts: 130
#14
Junior Member
Location: Beijing Join Date: Nov 2011
Posts: 4
JezSupreme and AdrianP are right!
The BWA-MEM algorithm performs local alignment, so it may produce multiple primary alignments for different parts of a query sequence. This is a crucial feature for long sequences. However, some tools, such as Picard's MarkDuplicates, do not work with split alignments. One may consider using the -M option to flag the shorter split hits as secondary.
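As a rough check (my own addition, not from the post), a BAM produced by BWA-MEM without -M marks the extra split hits with the supplementary flag (0x800), so counting them shows whether this applies to your file (the file name is a placeholder):
Code:
# count supplementary alignments; a non-zero count means split hits are present
samtools view -c -f 0x800 aln.bam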
Tags
error, gatk, markduplicates, picard |