
  • SA tag in SAM file for chimeric reads

    Hi,
    I am using BWA-MEM for split-read alignment of single-end Illumina genomic DNA-seq data. I know that BWA uses the SA tag to mark chimeric reads. When I manually BLAST individual reads carrying the SA tag, I can clearly verify that they are indeed chimeras. However, I could not find details about the SA tag itself. What information is encoded in the SA field? Below is an example of a chimeric read that maps to two separate genomic locations within the same contig (scf7180000067989):


    HWI-ST387:139:C03WJABXX:5:2108:15315:193815 16 scf7180000067989 85156 60 60M41S * 0 0 TTGAAGTCAAGAAAGTGGTAAAGAGAGATTAATAGGGGTATCTCAGCTACAACAAATATTATATTAAATTAAATGGTTAATCTTGCTTTGCTCACCATAAA * NM:i:2 MD:Z:31G1C26 AS:i:50 XS:i:0 SA:Z:scf7180000067989,85273,-,54S47M,60,1;

    HWI-ST387:139:C03WJABXX:5:2108:15315:193815 272 scf7180000067989 85273 60 54H47M * 0 0 AATATTATATTAAATTAAATGGTTAATCTTGCTTTGCTCACCATAAA * NM:i:1 MD:Z:11T35 AS:i:42 XS:i:22 SA:Z:scf7180000067989,85156,-,60M41S,60,2;

    I am expecting a lot of genome rearrangements in the sample, so ultimately I want to isolate these reads that map to variant locations and identify the regions of microhomology, which could help identify the breakpoint. I am new to Bioinformatics so any inputs would be great.

    Thanks in advance!

  • #2
    It is defined in the SAM tags specification document, which was split out from the main SAM specification at the end of last year.
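
For reference, the SAM tags specification describes SA as a semicolon-delimited list in which each entry has six comma-separated fields: rname, pos (1-based), strand (+ or -), CIGAR, mapQ and NM. A minimal Python sketch of reading that layout, applied to the SA field of the first record posted above (the names `SuppAlignment` and `parse_sa_tag` are made up for this illustration and are not part of any library):

```python
from typing import List, NamedTuple


class SuppAlignment(NamedTuple):
    """One entry of an SA tag (illustrative helper type, not a library class)."""
    rname: str   # reference/contig name
    pos: int     # 1-based leftmost mapping position
    strand: str  # '+' or '-'
    cigar: str   # CIGAR string of that alignment
    mapq: int    # mapping quality
    nm: int      # edit distance (NM)


def parse_sa_tag(field: str) -> List[SuppAlignment]:
    """Split an SA field ('SA:Z:...' or just the value) into its entries."""
    value = field[len("SA:Z:"):] if field.startswith("SA:Z:") else field
    records = []
    for entry in value.split(";"):
        if not entry:  # the list ends with ';', so the last piece is empty
            continue
        rname, pos, strand, cigar, mapq, nm = entry.split(",")
        records.append(
            SuppAlignment(rname, int(pos), strand, cigar, int(mapq), int(nm))
        )
    return records


sa = parse_sa_tag("SA:Z:scf7180000067989,85273,-,54S47M,60,1;")
print(sa[0])
```

So in the posted example the SA field of the first record says that the rest of the read aligns to scf7180000067989 at position 85273 on the reverse strand with CIGAR 54S47M, mapping quality 60 and one mismatch, which matches the second record.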

    Note that bwa can write split alignments in which the chimeric segments overlap (e.g. alignments of 60M40S and 50S50M for the same read).
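
A rough sketch of how that overlap can be measured from the CIGAR strings and strands alone, in Python. The function names are hypothetical, and it assumes clipping only appears at the ends of the CIGAR. Each alignment is converted to the interval of read bases it covers in the original (as-sequenced) read orientation, so a reverse-strand CIGAR is flipped before comparing:

```python
import re


def read_interval(cigar: str, strand: str):
    """Return the (start, end) 0-based half-open interval of read bases
    covered by an alignment, in original read orientation."""
    ops = [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]
    # Full read length: every op that consumes read bases, plus hard clips.
    read_len = sum(n for n, op in ops if op in "MIS=XH")
    lead = 0
    for n, op in ops:            # clipped bases at the start of the CIGAR
        if op not in "SH":
            break
        lead += n
    trail = 0
    for n, op in reversed(ops):  # clipped bases at the end of the CIGAR
        if op not in "SH":
            break
        trail += n
    if strand == "-":
        # CIGARs are written in reference orientation; flip for reverse strand.
        lead, trail = trail, lead
    return lead, read_len - trail


def overlap(a, b) -> int:
    """Number of read bases shared by two intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]))


# The example from this post: both segments on the forward strand.
print(overlap(read_interval("60M40S", "+"), read_interval("50S50M", "+")))  # 10
# The read from the original question: both segments on the reverse strand.
print(overlap(read_interval("60M41S", "-"), read_interval("54S47M", "-")))  # 6
```

Read bases claimed by both segments are only a candidate for microhomology at the breakpoint; the exact position still has to be checked against the reference sequence on both sides.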

    >I am expecting a lot of genome rearrangements in the sample, so ultimately I want to isolate these reads that map to variant locations and identify the regions of microhomology, which could help identify the breakpoint. I am new to Bioinformatics so any inputs would be great.

    I would recommend using one of the many (50+ at my last count) structural variant callers available that are designed to identify breakpoints. In terms of bwa SA tags, LUMPY is a caller that combines the bwa split-read alignments with read-pair and coverage information for breakpoint calling. If you're not happy to let the caller perform its own split-read analysis, then I would recommend GRIDSS (disclaimer: my caller) due to its lower false discovery rate compared to other callers. Other callers performing decently in my benchmarking results are manta, CREST and DELLY (with Pindel quite good if your focus is sensitivity).
    Last edited by dcameron; 10-30-2016, 08:33 PM.
