09-03-2013, 09:03 PM   #1
Mate-pair information damaged on file merge

Hi all,

I'm trying to merge WGS data for one sample that was run across six lanes.
The six individual files all validate correctly with Picard ValidateSamFile; the problem comes when I try to merge them.

I originally used Picard MergeSamFiles to merge them, but this broke the mate-pair information. I then found that most tools in the GATK pipeline will take multiple input files and merge them into a single output, so I tried this at the GATK IndelRealigner step, which I understand corrects mate-pair errors.

Both Picard MergeSamFiles and GATK IndelRealigner have messed up the mate-pair information during the merge, so I wondered whether anyone knows of a way to merge the files while preserving it.

I should point out that I've tried correcting this four times with Picard FixMateInformation, and each time it has run into various problems after about 18 hours.
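
To narrow down which records are actually affected, one option is to compare each read's mate fields (RNEXT and PNEXT, columns 7 and 8 of a SAM record) against the position of its mate (RNAME and POS, columns 3 and 4). The following is only a minimal sketch of that check: `check_mates` is a hypothetical helper name, and it assumes the BAM has been converted to SAM text by some other tool (samtools is used in the usage note below as an assumption; it is not part of the commands in this post).

```shell
# check_mates: hypothetical helper, not part of Picard or GATK.
# Reads SAM text on stdin and prints the name of every read pair whose
# mate fields (RNEXT/PNEXT) do not match the position of the other mate.
check_mates() {
  awk -F'\t' '
    /^@/ { next }                               # skip header lines
    {
      flag = $2
      if (flag % 2 == 0) next                   # 0x1 not set: read is not paired
      if (int(flag / 256) % 2 == 1) next        # 0x100 set: secondary alignment
      if (int(flag / 2048) % 2 == 1) next       # 0x800 set: supplementary alignment
      rnext = ($7 == "=") ? $3 : $7             # "=" means same reference as RNAME
      if (!($1 in ref)) {                       # first mate seen: remember it
        ref[$1] = $3; pos[$1] = $4; mref[$1] = rnext; mpos[$1] = $8
        next
      }
      # second mate seen: each record must point at the other one
      if (mref[$1] != $3 || mpos[$1] != $4 || rnext != ref[$1] || $8 != pos[$1])
        print $1
      delete ref[$1]; delete pos[$1]; delete mref[$1]; delete mpos[$1]
    }
  '
}
```

Assuming samtools is installed and the BAM is indexed, it could be run on a single chromosome to keep memory use down, for example `samtools view realign17.bam chr21 | check_mates | head`. Reads whose mate falls outside the region are never paired up by the sketch, so they are simply not checked. Running the same check on one of the per-lane files and on the merged file would show whether the mismatches are introduced by the merge or were already present.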

The commands I used were:

java -Xmx24g -Xms24g -jar ~/apps/GATK/GenomeAnalysisTK.jar \
-I nodup17_01.bam \
-I nodup17_02.bam \
-I nodup17_03.bam \
-I nodup17_04.bam \
-I nodup17_05.bam \
-I nodup17_06.bam \
-R ~/gatk/ucsc.hg19.fasta \
-T IndelRealigner \
-targetIntervals nodup17.intervals \
-o realign17.bam \
-nt 12 \
-known ~/gatk/Mills_and_1000G_gold_standard.indels.hg19.vcf \
-known ~/gatk/1000G_phase1.indels.hg19.vcf \
-known ~/gatk/dbsnp_135.hg19.vcf

java -Xmx24g -Xms24g -jar ~/apps/picard/MergeSamFiles.jar \
I=nodup17_01.bam \
I=nodup17_02.bam \
I=nodup17_03.bam \
I=nodup17_04.bam \
I=nodup17_05.bam \
I=nodup17_06.bam \
o=realign17.bam \
SO=coordinate \
TMP_DIR=~/local17/

java -Xmx24g -Xms24g -jar ~/apps/picard/FixMateInformation.jar \
INPUT=realign17.bam \
OUTPUT=realign17_FM.bam \
SO=coordinate \
TMP_DIR=local17/

biscuit13161