Dear all,
I'm running BWA in a parallel environment.
I have tested two kinds of FASTQ input as the query: (a) a single multi-FASTQ file, and (b) the same file split into several smaller multi-FASTQ files.
(a) contains 5 million reads and (b) consists of 5 files of 1 million reads each; merging the (b) files gives exactly (a).
The problem is that some reads are mapped differently between (a) and (b) (see the example below).
I want to run indel/SNP calling with GATK on these BWA alignments.
Please tell me if there is a common or better way to handle reads that differ like this
(for example, filtering on tags in the SAM file before analysing SNPs).
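To list the reads that differ between the two runs, something like the following could be used. This is only a minimal sketch: the function names are hypothetical, and it assumes the (b) outputs have already been concatenated into one SAM file and that each read end appears once per file.

```python
def load_alignments(sam_lines):
    """Map (read name, mate number) -> (flag, reference, position, CIGAR)."""
    records = {}
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # skip malformed or truncated lines
            continue
        name, flag = fields[0], int(fields[1])
        mate = 2 if flag & 0x80 else 1    # 0x80 = second read in the pair
        records[(name, mate)] = (flag, fields[2], int(fields[3]), fields[5])
    return records


def differing_reads(sam_a, sam_b):
    """Return sorted (read name, mate) keys whose alignment differs."""
    a = load_alignments(sam_a)
    b = load_alignments(sam_b)
    return sorted(key for key in a.keys() & b.keys() if a[key] != b[key])


# Example use (file names are placeholders):
# with open("a.sam") as fa, open("b_merged.sam") as fb:
#     for name, mate in differing_reads(fa, fb):
#         print(name, mate)
```

Only the flag, reference name, position and CIGAR are compared here; mapping quality and optional tags are ignored on purpose and could be added to the tuple if needed.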
* Input file: SRR015926
* Command:
bwa aln -t 2 <in.db.fasta> SRR015926_1.fastq > 1.sai
bwa aln -t 2 <in.db.fasta> SRR015926_2.fastq > 2.sai
bwa sampe <in.db.fasta> 1.sai 2.sai ${read1} ${read2} > out.sam
* <in.db.fasta>: human genome (UCSC hg19)
(a) second read is unmapped
SRR015926.3556842 73 chr14 50441185 37 34M
= 50441185 0 AAAGGAAGATTCCTGAATCCTTCGTTTGAGCACC
FII+/6;I2II38I.+67.34*.7&41(&%'-&( XT:A:U NM:i:0 SM:i:37 AM:i:0
X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:34
SRR015926.3556842 133 chr14 50441185 0 *
= 50441185 0 AAAAAAGAAGAGAAGAAAGGAAAGACACCCAA
&%&#'##%$#$%*&$$$%$$#$'+,$&%#$+4
(b) second read is mapped
SRR015926.3556842 99 chr14 50441185 29 34M
= 50441295 133 AAAGGAAGATTCCTGAATCCTTCGTTTGAGCACC
FII+/6;I2II38I.+67.34*.7&41(&%'-&( XT:A:U NM:i:0 SM:i:29 AM:i:29
X0:i:1 X1:i:0 XM:i:0 XO:i:0 XG:i:0 MD:Z:34
SRR015926.3556842 147 chr14 50441295 29 9S23M
= 50441185 -133 TTGGGTGTCTTTCCTTTCTTCTCTTCTTTTTT
4+$#%&$,+'$#$$%$$$&*%$#$%##'#&%& XT:A:M NM:i:3 SM:i:29 AM:i:29
XM:i:3 XO:i:0 XG:i:0 MD:Z:4T6T4T6
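For reference, the FLAG values in the records above (73, 133, 99, 147) can be decoded with a short script. This is a sketch based on the bit definitions in the SAM specification; `decode_flag` is a hypothetical helper name, not part of BWA or samtools.

```python
# Bit meanings of the SAM FLAG field, as defined in the SAM specification.
SAM_FLAGS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse"),
    (0x20, "mate_reverse"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
]


def decode_flag(flag):
    """Return the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAGS if flag & bit]


for flag in (73, 133, 99, 147):
    print(flag, decode_flag(flag))
```

So in (a) the pair is reported as first read mapped with its mate unmapped (73 / 133), while in (b) both reads are mapped as a proper pair (99 / 147).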
Best regards,
koji