SEQanswers

09-03-2013, 08:03 PM   #1
biscuit13161
Junior Member
 
Location: Qatar

Join Date: Jul 2012
Posts: 3
Mate-pair information damaged on file merge

Hi all,

I'm trying to merge WGS data for one sample that was run across 6 lanes.
The six individual files all validate correctly with Picard ValidateSamFile; the problem comes when I try to merge them.

I originally used Picard MergeSamFiles to merge them, but this messed up the mate pairs. I then found that most of the GATK pipeline will take multiple input files and merge them into a single output, so I tried this at the GATK IndelRealigner step, which I understand corrects mate-pair errors.

Both Picard MergeSamFiles and GATK IndelRealigner messed up the mate-pair information during the merge. Does anyone know of a way to merge the files while preserving the mate-pair information?

I should point out that I've tried correcting this (4 times) with Picard FixMateInformation, and each time it has run into various problems after about 18 hours.

The commands I used were:

java -Xmx24g -Xms24g -jar ~/apps/GATK/GenomeAnalysisTK.jar \
-I nodup17_01.bam \
-I nodup17_02.bam \
-I nodup17_03.bam \
-I nodup17_04.bam \
-I nodup17_05.bam \
-I nodup17_06.bam \
-R ~/gatk/ucsc.hg19.fasta \
-T IndelRealigner \
-targetIntervals nodup17.intervals \
-o realign17.bam \
-nt 12 \
-known ~/gatk/Mills_and_1000G_gold_standard.indels.hg19.vcf \
-known ~/gatk/1000G_phase1.indels.hg19.vcf \
-known ~/gatk/dbsnp_135.hg19.vcf

java -Xmx24g -Xms24g -jar ~/apps/picard/MergeSamFiles.jar \
I=nodup17_01.bam \
I=nodup17_02.bam \
I=nodup17_03.bam \
I=nodup17_04.bam \
I=nodup17_05.bam \
I=nodup17_06.bam \
o=realign17.bam \
USE_THREADING=true \
SO=coordinate \
TMP_DIR=~/local17/ \
CREATE_INDEX=true \
CREATE_MD5_FILE=true

java -Xmx24g -Xms24g -jar ~/apps/picard/FixMateInformation.jar \
INPUT=realign17.bam \
OUTPUT=realign17_FM.bam \
SO=coordinate \
TMP_DIR=local17/ \
CREATE_INDEX=true \
CREATE_MD5_FILE=true
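
For anyone trying to reproduce this, one rough way to see which records are affected is to compare each read's mate fields against its mate's own record. The following is only a minimal sketch, not what Picard ValidateSamFile actually does: it assumes SAM text on standard input (for example from `samtools view`), holds every record in memory (so it only suits a small region or subsample), and the function names `parse_record`, `mate_problems` and `check_mates` are made up for illustration.

```python
import sys
from collections import defaultdict

# SAM FLAG bits used below (values from the SAM specification).
PAIRED, UNMAPPED, MATE_UNMAPPED = 0x1, 0x4, 0x8
REVERSE, MATE_REVERSE = 0x10, 0x20
SECONDARY, SUPPLEMENTARY = 0x100, 0x800


def parse_record(line):
    """Pull out the mate-related fields of one SAM alignment line."""
    f = line.rstrip("\n").split("\t")
    rnext = f[2] if f[6] == "=" else f[6]  # "=" means same reference as RNAME
    return {"qname": f[0], "flag": int(f[1]), "rname": f[2],
            "pos": int(f[3]), "rnext": rnext, "pnext": int(f[7])}


def mate_problems(read, mate):
    """List the mate fields of `read` that disagree with `mate` itself."""
    problems = []
    if read["rnext"] != mate["rname"]:
        problems.append("RNEXT")
    if read["pnext"] != mate["pos"]:
        problems.append("PNEXT")
    if bool(read["flag"] & MATE_UNMAPPED) != bool(mate["flag"] & UNMAPPED):
        problems.append("mate-unmapped flag")
    if bool(read["flag"] & MATE_REVERSE) != bool(mate["flag"] & REVERSE):
        problems.append("mate-strand flag")
    return problems


def check_mates(lines):
    """Return {read name: [problems]} for primary, paired records."""
    pairs = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        rec = parse_record(line)
        if not rec["flag"] & PAIRED:
            continue
        if rec["flag"] & (SECONDARY | SUPPLEMENTARY):
            continue
        pairs[rec["qname"]].append(rec)

    report = {}
    for qname, recs in pairs.items():
        if len(recs) != 2:
            # Missing mate, or the same read name seen more than twice.
            report[qname] = ["%d primary records" % len(recs)]
            continue
        a, b = recs
        problems = mate_problems(a, b) + mate_problems(b, a)
        if problems:
            report[qname] = problems
    return report


if __name__ == "__main__":
    for qname, problems in sorted(check_mates(sys.stdin).items()):
        print(qname, ", ".join(problems), sep="\t")
```

Saved under a hypothetical name such as `check_mates.py`, it could be run on one region of the merged file with `samtools view realign17.bam chr20:1-1000000 | python check_mates.py`. Pairs whose mate lies outside the chosen region will be reported as having a single primary record, so those entries are expected and not necessarily errors. A read name with more than two primary records would suggest that names collide between lanes.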

Thanks
Richard

Tags
gatk, mate-pair reads, picard
