#1
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191
I am trying to remove PCR duplicates using Picard.
My header looks like this:
Code:
@HD VN:1.0 SO:sorted
@PG ID:TopHat VN:1.0.14 CL:/home/guo/bin/tophat -G ../../hg19.GFF3 -g 1 -o Dex -r 160 --solexa1.3-quals -p 4 ../../../../bowtie-0.12.3/indexes/hg19 ../../FASTQ/KUMC_PE_RNASEQ_sample6_1_sequence.txt ../../FASTQ/KUMC_PE_RNASEQ_sample6_2_sequence.txt
@SQ SN:chr1 LN:249250621
@SQ SN:chr2 LN:243199373
@SQ SN:chr3 LN:198022430
@SQ SN:chr4 LN:191154276
@SQ SN:chr5 LN:180915260
@SQ SN:chr6 LN:171115067
@SQ SN:chr7 LN:159138663
@SQ SN:chr8 LN:146364022
@SQ SN:chr9 LN:141213431
@SQ SN:chr10 LN:135534747
@SQ SN:chr11 LN:135006516
@SQ SN:chr12 LN:133851895
@SQ SN:chr13 LN:115169878
@SQ SN:chr14 LN:107349540
@SQ SN:chr15 LN:102531392
@SQ SN:chr16 LN:90354753
@SQ SN:chr17 LN:81195210
@SQ SN:chr18 LN:78077248
@SQ SN:chr19 LN:59128983
@SQ SN:chr20 LN:63025520
@SQ SN:chr21 LN:48129895
@SQ SN:chr22 LN:51304566
@SQ SN:chrX LN:155270560
@SQ SN:chrY LN:59373566
@SQ SN:chrM LN:16571

Code:
net.sf.picard.sam.MarkDuplicates INPUT=Dex.bam OUTPUT=Dex.NoDup.bam METRICS_FILE=Dex.Metrics REMOVE_DUPLICATES=true ASSUME_SORTED=false MAX_SEQUENCES_FOR_DISK_READ_ENDS_MAP=50000 READ_NAME_REGEX=[a-zA-Z0-9]+:[0-9]:([0-9]+):([0-9]+):([0-9]+).* OPTICAL_DUPLICATE_PIXEL_DISTANCE=100 TMP_DIR=/tmp/shart3 VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false
INFO 2010-08-19 12:25:28 MarkDuplicates Start of doWork freeMemory: 31023560; totalMemory: 31588352; maxMemory: 620756992
INFO 2010-08-19 12:25:28 MarkDuplicates Reading input file and constructing read end information.
INFO 2010-08-19 12:25:28 MarkDuplicates Will retain up to 2463321 data points before spilling to disk.
[Thu Aug 19 12:25:28 CDT 2010] net.sf.picard.sam.MarkDuplicates done.
Runtime.totalMemory()=51314688
Exception in thread "main" java.lang.IllegalArgumentException: No enum const class net.sf.samtools.SAMFileHeader$SortOrder.sorted

Code:
46021417 in total
0 QC failure
0 duplicates
46021417 mapped (100.00%)
46021417 paired in sequencing
23145972 read1
22875445 read2
33385558 properly paired (72.54%)
39196142 with itself and mate mapped
6825275 singletons (14.83%)
0 with mate mapped to a different chr
0 with mate mapped to a different chr (mapQ>=5)
#2
Nils Homer
Location: Boston, MA, USA Join Date: Nov 2008
Posts: 1,285
The value of the sort order tag "SO" is "sorted", whereas the SAM specification lists "unsorted", "queryname" or "coordinate" as allowable values. Picard validates SAM/BAM files, while samtools does not (as much).
Looks like a bug in TopHat.
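One possible workaround, sketched here rather than taken from the thread, is to rewrite the `@HD` line so that `SO` carries a value the SAM specification accepts. The helper name `fix_sort_order` is hypothetical, and the default replacement `coordinate` is an assumption that only holds if the alignments really are coordinate-sorted; otherwise sort the file first. The allowed set below is the three values quoted above plus `unknown`, which the SAM specification also permits.

```python
# Sort-order values accepted by the SAM specification.
VALID_SORT_ORDERS = {"unknown", "unsorted", "queryname", "coordinate"}


def fix_sort_order(header_text, replacement="coordinate"):
    """Return the SAM header with any invalid SO value on the @HD line replaced.

    Assumption: `replacement` truthfully describes how the reads are ordered.
    Header fields are expected to be tab-separated, as in a real SAM file.
    """
    if replacement not in VALID_SORT_ORDERS:
        raise ValueError(f"not a valid SAM sort order: {replacement!r}")

    fixed_lines = []
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            fields = line.split("\t")
            fields = [
                f"SO:{replacement}"
                if field.startswith("SO:") and field[3:] not in VALID_SORT_ORDERS
                else field
                for field in fields
            ]
            line = "\t".join(fields)
        fixed_lines.append(line)
    return "".join(line + "\n" for line in fixed_lines)


if __name__ == "__main__":
    broken = "@HD\tVN:1.0\tSO:sorted\n@SQ\tSN:chr1\tLN:249250621\n"
    print(fix_sort_order(broken), end="")
```

To apply it, the header could be dumped with `samtools view -H Dex.bam > header.sam`, passed through the function, and written back with `samtools reheader fixed_header.sam Dex.bam > Dex.fixed.bam` (the output file names here are placeholders).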
#3
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191
Interestingly, I also ran SOAPals, which did the same thing when I converted its output to SAM. Is there any RNA-Seq aligner that outputs these data in valid SAM?
#4
Member
Location: USA Join Date: Nov 2010
Posts: 56
I am having the same problem with RNA-seq data. I have even tried SpliceMap, and the SAM files it produces are just unusable downstream. Did you find a way around this? If so, kindly let me know.
#5
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191
We've switched to the updated TopHat2, which seems to work well.
#6
Member
Location: USA Join Date: Nov 2010
Posts: 56
Which version, if you can share? I used 2.0.0.4 and my SAM/BAM files were not compatible with GATK. The same data aligned using SHRiMP works like a charm, no problem.
#7
Senior Member
Location: Rochester, MN Join Date: Mar 2009
Posts: 191