
  • Can you tell recombination/conversion from SAM flags?

    Hi all,

    I have tried for some time now to figure out if I can detect recombination or gene conversion using discordantly mapping reads and the FLAG information Bowtie2 outputs.

    It seems like flags of 97, 145, 161 and 81 are discordant because the paired reads map the incorrect distance apart, 65 and 129 mean the reads mapped to different contigs, and 113 and 177 mean inversions. Is this correct?

    Note: my information comes from this handy website: http://picard.sourceforge.net/explain-flags.html

    tl;dr
    If I look for paired reads with flags of 97, 145, 161 or 81 where both reads map to the same contig, is this a reasonable way to look for gene conversion or recombination events?

    thanks everyone!

  • #2
    Originally posted by bioinformer
    ..., 65 and 129 mean the reads mapped to different contigs, .... Is this correct?
    No. The FLAG (on its own) doesn't tell you if the two paired reads map to the same contig or different contigs - just if they are mapped somewhere.


    • #3
      Thanks for the response. But then what is the difference between:

      97
      * read paired
      * mate reverse strand
      * first in pair

      -and-

      65
      * read paired
      * first in pair

      I reasoned that, since they are odd numbers, we know both reads mapped somewhere, so the lack of information about the mate in 65/129 means that it mapped to a different contig, because forward/reverse lose their meaning across non-assembled contigs.

      What do SAM flags of 65 or 129 mean and how are they different from 97/145/161/81?

      thanks so much


      • #4
        Originally posted by bioinformer
        Thanks for the response. But then what is the difference between:

        97
        * read paired
        * mate reverse strand
        * first in pair
        This read is on the forward strand, --->, but its partner is on the reverse strand, <---, which, assuming they are mapped to the same contig, is normal for Illumina and Sanger reads.

        The FLAG does not tell you if the two reads are mapped to the same contig. You could have one mapped to the forward strand of contig123, and one mapped to the reverse strand of contig456.
        Originally posted by bioinformer
        -and-

        65
        * read paired
        * first in pair
        Both this read and its partner are on the forward strand of whatever contig they are mapped to. Mapping on the same strand of the same contig would be normal for Roche 454 paired-end data, except that many pipelines flip things to make this look like Sanger/Illumina, where the partners should be on opposite strands.
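
To make the comparison between 65 and 97 concrete, here is a minimal sketch that decodes a FLAG value into its set bits. The bit values follow the SAM specification; the function name `explain_flag` and the label wording are only illustrative and not part of any tool.

```python
# Bit values of the SAM FLAG field, paired with short human-readable labels.
# The labels are illustrative wording, not official names.
SAM_FLAG_BITS = [
    (0x1, "read paired"),
    (0x2, "read mapped in proper pair"),
    (0x4, "read unmapped"),
    (0x8, "mate unmapped"),
    (0x10, "read reverse strand"),
    (0x20, "mate reverse strand"),
    (0x40, "first in pair"),
    (0x80, "second in pair"),
    (0x100, "not primary alignment"),
    (0x200, "read fails quality checks"),
    (0x400, "PCR or optical duplicate"),
    (0x800, "supplementary alignment"),
]


def explain_flag(flag):
    """Return the labels of every bit that is set in a SAM FLAG value."""
    return [label for bit, label in SAM_FLAG_BITS if flag & bit]


# The flags discussed in this thread.
for flag in (65, 97, 81, 113, 129, 145, 161, 177):
    print(flag, ", ".join(explain_flag(flag)))
```

None of these bits names a contig, which is why the FLAG alone cannot say whether the two reads landed on the same reference sequence. In 65 neither strand bit is set, so both the read and its mate are reported on the forward strand.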


        • #5
          One of the columns of the .sam file (column 7, RNEXT) tells you where the mate mapped. If that column is not "=", and both reads mapped (which you need to determine from the flags), then the mate is on a different chromosome. So you can use samtools view | awk to get those reads.
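
As a rough sketch of that filter (written in Python rather than awk, purely for illustration), the function below checks the FLAG bits for "read unmapped" (0x4) and "mate unmapped" (0x8) and then compares column 7 against "=". The function name and the three example records are made up for this example; they are not real alignments.

```python
def mate_on_other_reference(sam_line):
    """True if both reads of a pair are mapped and the mate is on another reference."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])                # column 2: FLAG
    rname, rnext = fields[2], fields[6]  # column 3: RNAME, column 7: RNEXT
    paired = bool(flag & 0x1)
    read_unmapped = bool(flag & 0x4)
    mate_unmapped = bool(flag & 0x8)
    if not paired or read_unmapped or mate_unmapped:
        return False
    # "=" means same reference as the read; "*" means no information available.
    return rnext not in ("=", "*") and rnext != rname


# Hypothetical records, invented for this example.
example_records = [
    "r1\t65\tcontig123\t100\t42\t50M\tcontig456\t900\t0\t*\t*",  # mate elsewhere
    "r2\t97\tcontig123\t100\t42\t50M\t=\t400\t350\t*\t*",        # same contig
    "r3\t73\tcontig123\t100\t42\t50M\t=\t100\t0\t*\t*",          # mate unmapped
]

for record in example_records:
    if record.startswith("@"):
        continue  # skip header lines
    print(record.split("\t")[0], mate_on_other_reference(record))
```

The same function could be applied line by line to the text output of `samtools view`. For the original question, the complementary set (both reads mapped, column 7 equal to "=") is the one where both reads are on the same contig.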
