  • BWA puzzle: Read fate is not deterministic?

    A single read, call it Read_X, is present in two different FASTQ files, A.fq and B.fq.

    Both fastq files were aligned with BWA against three different indexes as follows:

    First against a "human repeats" index. The unmapped reads were then aligned to a "human rRNA" index. The reads still unmapped after that were aligned to a "human genomic (hg19)" index.
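The hand-off between stages can be sketched as below. This is only an illustration, assuming the unmapped reads are selected by the SAM unmapped bit (0x4) in the FLAG column; the post does not say how the filtering was actually done, and `unmapped_read_names` is a hypothetical helper. SEQ and QUAL are placeholders because the post snips them.

```python
UNMAPPED = 0x4  # SAM FLAG bit: segment unmapped

def unmapped_read_names(sam_lines):
    """Return the names of reads whose FLAG has the unmapped bit set."""
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & UNMAPPED:
            names.append(name)
    return names

# FLAG values copied from the "human repeats" SAM entries quoted in this post;
# SEQ and QUAL are placeholders (assumption: the real columns are tab-separated).
repeats_a = ["Read_X\t20\tRepeat\t32\t0\t43M\t*\t0\t0\tSEQ\tQUAL"]
repeats_b = ["Read_X\t16\tRepeat\t308\t0\t43M\t*\t0\t0\tSEQ\tQUAL"]

print(unmapped_read_names(repeats_a))  # ['Read_X']
print(unmapped_read_names(repeats_b))  # []
```

Under that assumption, a filter on the unmapped bit forwards Read_X to the next index for A.fq but not for B.fq.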

    The bwa commands for all runs were:
    Code:
    bwa aln -l 0 -t 24
    bwa samse -n 1
    Here is the puzzle: BWA treats Read_X differently in the two runs. For A.fq, the read passes through the first two indexes and is caught at (i.e., found to be aligned to) the third index. For B.fq, the read is caught at the first index itself.

    What could explain this behavior? I would expect a given read to have the same fate when it is passed through the same indexes in the same order. Is that expectation wrong? Below are the entries in the different SAM files.

    A.fq alignments:
    Code:
    1. "human repeats" SAM entry:
    Read_X  20  Repeat  32  0  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:R  NM:i:3  X0:i:0  X1:i:0  XM:i:3  XO:i:0  XG:i:0  MD:Z:27C2G4G7  XA:Z:Repeat,-308,43M,3;
    
    2. "human rRNA" SAM entry:
    Read_X  4  *  32  0  43M  *  0  0  <seq snipped>  <qual snipped>
    
    3. "human genomic (hg19)" SAM entry:
    Read_X  16  chr13  93985787  16  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:U  NM:i:1  X0:i:1  X1:i:5  XM:i:1  XO:i:0  XG:i:0  MD:Z:39C3
    B.fq alignment:
    Code:
    1. "human repeats" SAM entry:
    Read_X  16  Repeat  308  0  43M  *  0  0  <rev comp seq snipped>  <rev qual snipped>  XT:A:R  NM:i:3  X0:i:0  X1:i:0  XM:i:3  XO:i:0  XG:i:0  MD:Z:27C1C9C3  XA:Z:Repeat,-32,43M,3;

    Cheers,
    BN

  • #2
    Might need two more pieces to the puzzle:

    1) Show all 4 FASTQ lines for both copies of the read (check for Ns)

    2) Show the full SAM output, including the optional fields (field 12 and beyond).
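For reference, a SAM line has 11 mandatory columns followed by optional TAG:TYPE:VALUE fields. The sketch below splits one line into the two parts. `split_sam_line` is a hypothetical helper, the column names follow the SAM specification, and SEQ and QUAL are placeholders because the post snips them.

```python
MANDATORY = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
             "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]

def split_sam_line(line):
    """Split a tab-separated SAM line into mandatory columns and optional tags."""
    fields = line.rstrip("\n").split("\t")
    core = dict(zip(MANDATORY, fields[:11]))
    tags = {}
    for item in fields[11:]:  # field 12 and beyond
        tag, typ, value = item.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    return core, tags

# The hg19 entry from the first post, with SEQ and QUAL as placeholders.
hg19_line = ("Read_X\t16\tchr13\t93985787\t16\t43M\t*\t0\t0\tSEQ\tQUAL\t"
             "XT:A:U\tNM:i:1\tX0:i:1\tX1:i:5\tXM:i:1\tXO:i:0\tXG:i:0\tMD:Z:39C3")

core, tags = split_sam_line(hg19_line)
print(core["RNAME"], core["POS"])            # chr13 93985787
print(tags["XT"], tags["X0"], tags["X1"])    # U 1 5
```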
    ___
    edit
    *** okay *** found the tags in your post.

    Both copies of the read did get caught in the repeat phase. The copy from A.fq is caught at Repeat:32, the copy from B.fq at Repeat:308.
    BWA picks one alignment at random when a read has multiple alignments.
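A quick way to check this from the posted entries is to pool the primary hit with the alternates in the XA tag and compare the two runs. The sketch below assumes the BWA XA layout `chr,pos,CIGAR,NM;` with a signed position giving the strand; `reported_hits` is a hypothetical helper, and the values are copied from the two "human repeats" entries.

```python
REVERSE = 0x10   # SAM FLAG bit: read aligned to the reverse strand
UNMAPPED = 0x4   # SAM FLAG bit: segment unmapped

def reported_hits(rname, pos, flag, xa):
    """Collect (reference, strand, position) for the primary hit and XA alternates."""
    strand = "-" if flag & REVERSE else "+"
    hits = {(rname, strand, pos)}
    for entry in xa.split(";"):
        if not entry.strip():
            continue  # trailing ';' leaves an empty entry
        chrom, signed_pos, cigar, nm = entry.split(",")
        signed_pos = signed_pos.strip()  # tolerate stray whitespace
        alt_strand = "-" if signed_pos.startswith("-") else "+"
        hits.add((chrom.strip(), alt_strand, abs(int(signed_pos))))
    return hits

hits_a = reported_hits("Repeat", 32, 20, "Repeat,-308,43M,3;")
hits_b = reported_hits("Repeat", 308, 16, "Repeat,-32,43M,3;")

print(hits_a == hits_b)                             # True
print(bool(20 & UNMAPPED), bool(16 & UNMAPPED))     # True False
```

Both runs report the same two positions and only swap which one is primary. The flags differ as well: 20 = 16 + 4 has the unmapped bit set, while 16 does not, which would send the A.fq copy on to the next index if the hand-off filters on that bit. Why that bit is set for the hit at position 32 is not established in this thread.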
    Last edited by Richard Finney; 06-09-2011, 11:39 AM.



    • #3
      Hi Richard, I think that solves it! Thank you.
