  • Searching for paired reads around a location

    Hi folks,

    I'm looking at the 1000 Genomes data using samtools. So far the flag filter has been great in helping me locate regions of interest, but for what I want to do next, I'm not sure it's the easiest or best option. Is there any way to search for paired-end reads that span a specific location, i.e., one read maps to the left of the location in question and its mate maps to the right? Both reads would have to map properly. I suspect that at certain locations individuals are polymorphic for an insert, and the presence or absence of such read pairs would confirm or rule this out.

    Apologies if this has already been addressed - I suspect I may not be searching using the correct terminology as I find it hard to believe I'm the only person interested in this!

    All help gratefully received!

    A

  • #2
    Let's say your center point is 1000.

    samtools view -f 34 my.bam chr:500-1000 > forward.sam
    samtools view -f 18 my.bam chr:1001-1500 > reverse.sam

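    -f 34 keeps reads with both the "properly paired" bit (2) and the "mate reversed" bit (32) set. Since the pair is proper and the mate is reversed, the read itself is forward, so it sits to the left of your center point. -f 18 keeps reads with the "properly paired" bit (2) and the "read reversed" bit (16) set, and those fall after the center point. You want the reads whose names appear on both lists. (The region goes after the BAM file name, and the BAM must be indexed; -r filters by read group, not by region.)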
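The "names on both lists" step can be sketched in a few lines of Python. This is only an illustration, assuming the two inputs are SAM text as written by the commands above; the function names `read_names` and `spanning_pairs` are made up for this example and are not part of samtools.

```python
def read_names(sam_lines):
    """Collect QNAME (first tab-separated column) from SAM records.

    Header lines (starting with '@') and blank lines are skipped.
    """
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        names.add(line.split("\t", 1)[0])
    return names


def spanning_pairs(forward_lines, reverse_lines):
    """Return the sorted read names that occur in both inputs."""
    return sorted(read_names(forward_lines) & read_names(reverse_lines))
```

To use it on the files above, open `forward.sam` and `reverse.sam` and pass the two file objects to `spanning_pairs`; any iterable of lines works.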

    • #3
      Many thanks for your prompt response, swbarnes2. I'd gotten as far as running 2 searches using:

      samtools view -f 3 -F 1564 my.bam chr:500-1000 > forward.sam
      samtools view -f 19 -F 1548 my.bam chr:1001-1500 > reverse.sam

      which I believe does the same as what you've described above(?). I thought there might be a more sophisticated way of doing this that doesn't require comparing 2 files for each position, but I guess not.
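One possible way to avoid the second file, offered as a rough sketch rather than an established recipe: each SAM record already carries its mate's reference (RNEXT, column 7) and start position (PNEXT, column 8), so a single pass over the reads left of the center point can check where the mate starts. The helper names `reference_end` and `pairs_spanning` are hypothetical. The sketch assumes the input comes from something like `samtools view my.bam chr:500-1000` and that a pair "spans" the point when the forward read ends at or before it and the mate starts after it.

```python
import re

# CIGAR operations that consume reference bases, per the SAM specification.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")


def reference_end(pos, cigar):
    """1-based inclusive end coordinate of an alignment that starts at pos."""
    span = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in REF_CONSUMING)
    return pos + span - 1


def pairs_spanning(sam_lines, center):
    """Names of forward reads left of center whose mate starts right of it."""
    names = []
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete SAM record
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        cigar, rnext, pnext = fields[5], fields[6], int(fields[7])
        # Proper pair (0x2), read forward (0x10 unset), mate reversed (0x20).
        if not (flag & 0x2) or (flag & 0x10) or not (flag & 0x20):
            continue
        if rnext not in ("=", rname):
            continue  # mate maps to a different reference sequence
        if reference_end(pos, cigar) <= center and pnext > center:
            names.append(fields[0])
    return names
```

Because only the mate's start is checked, this does not need the mate's own record, which is the part that otherwise forces the two-file comparison.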

      All the best,

      A
