SEQanswers

06-09-2011, 11:31 AM   #1
bhootnaath
Junior Member
 
Location: NM

Join Date: Jul 2009
Posts: 5
BWA puzzle: Read fate is not deterministic?

The same read, call it Read_X, appears in two different FASTQ files, A.fq and B.fq.

Both fastq files were aligned with BWA against three different indexes as follows:

First against a "human repeats" index. Reads that did not map were then aligned to a "human rRNA" index, and the reads still unmapped after that were aligned to a "human genomic (hg19)" index.
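The hand-off between stages in a cascade like this usually depends on the SAM FLAG field: a read goes on to the next index when bit 0x4 (segment unmapped) is set. The thread does not show how the unmapped reads were extracted, so the following is only a minimal sketch under that assumption. The function names and the three example records are hypothetical and not taken from the post.

```python
def is_unmapped(sam_line: str) -> bool:
    """Return True if the record's FLAG has the 0x4 (segment unmapped) bit set."""
    flag = int(sam_line.split("\t")[1])
    return bool(flag & 0x4)


def unmapped_read_names(sam_lines):
    """Collect names of reads that would be passed on to the next index."""
    names = []
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        if is_unmapped(line):
            names.append(line.split("\t")[0])
    return names


# Hypothetical tab-separated records, for illustration only.
example = [
    "@SQ\tSN:Repeat\tLN:1000",
    "read_1\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "read_2\t16\tRepeat\t10\t37\t4M\t*\t0\t0\tACGT\tIIII",
]
print(unmapped_read_names(example))  # ['read_1']
```

A filter of this kind looks only at the flag, not at the reference name or position columns, so a record that reports a position but also carries the 0x4 bit is still treated as unmapped.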

The bwa commands for all runs were:
Code:
bwa aln -l 0 -t 24
bwa samse -n 1
Here is the puzzle: BWA treats Read_X differently in the two runs. For A.fq, the read passes through the first two indexes and is caught at (i.e., found to be aligned to) the third index. For B.fq, the read is caught at the first index itself.

What could explain this behavior? Shouldn't a given read meet the same fate when passed through the same indexes in the same order? Below are the entries from the different SAM files.

A.fq alignments:
Code:
1. "human repeats" SAM entry:
Read_X  20  Repeat  32  0  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:R  NM:i:3  X0:i:0  X1:i:0  XM:i:3  XO:i:0  XG:i:0  MD:Z:27C2G4G7  XA:Z:Repeat, -308,43M,3;

2. "human rRNA" SAM entry:
Read_X  4  *  32  0  43M  *  0  0  <seq snipped>  <qual snipped>

3. "human genomic (hg19)" SAM entry:
Read_X  16  chr13  93985787  16  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:U  NM:i:1  X0:i:1  X1:i:5  XM:i:1  XO:i:0  XG:i:0  MD:Z:39C3
B.fq alignment:
Code:
1. "human repeats" SAM entry:
Read_X  16  Repeat  308  0  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:R  NM:i:3  X0:i:0  X1:i:0  XM:i:3  XO:i:0  XG:i:0  MD:Z:27C1C9C3  XA:Z:Repeat, -32,43M,3;

Cheers,
BN
06-09-2011, 12:18 PM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Might need two more pieces to the puzzle:

1) Show the full 4 lines for two samples of the read in fastq format (check for Ns)

2) Show the full sam output, including the variable OPTional fields (field 12 and beyond).
Edit: okay, found the tags in your post.

Both reads did get caught in the repeats phase: Read A at Repeat:32, Read B at Repeat:308.
When a read has multiple equally good alignments, BWA picks one of them at random.
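This can be checked against the two "human repeats" records posted above. The sketch below uses only the FLAG, POS, and XA values copied from those records. The helper names are hypothetical, and the XA layout (chr, signed pos, CIGAR, NM) is assumed to follow BWA's usual convention.

```python
def flag_bits(flag: int) -> dict:
    """Decode the two SAM FLAG bits relevant here."""
    return {"unmapped": bool(flag & 0x4), "reverse": bool(flag & 0x10)}


def hit_positions(pos: int, xa: str) -> set:
    """Primary position plus every position listed in the XA tag (strand sign dropped)."""
    hits = {pos}
    for entry in xa.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        fields = [f.strip() for f in entry.split(",")]
        hits.add(abs(int(fields[1])))
    return hits


# Values copied from the posted "human repeats" SAM entries.
a = {"flag": 20, "pos": 32, "xa": "Repeat, -308,43M,3;"}
b = {"flag": 16, "pos": 308, "xa": "Repeat, -32,43M,3;"}

print(hit_positions(a["pos"], a["xa"]) == hit_positions(b["pos"], b["xa"]))  # True
print(flag_bits(a["flag"]))  # {'unmapped': True, 'reverse': True}
print(flag_bits(b["flag"]))  # {'unmapped': False, 'reverse': True}
```

Both records list the same two hits (32 and 308); only the one reported as primary differs. Flag 20 on the A.fq record also has the 0x4 bit set, so a filter keyed on that bit would pass the read along even though a position is reported. That would be consistent with Read_X reaching the hg19 index in the A.fq run, but it is an inference from the flags, not something the thread confirms.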

Last edited by Richard Finney; 06-09-2011 at 12:39 PM.
06-09-2011, 12:50 PM   #3
bhootnaath
Junior Member
 
Location: NM

Join Date: Jul 2009
Posts: 5

Hi Richard, I think that solves it! Thank you.