  • Isolate reads from paired-end FASTQ files

    Hi folks,
    I have paired-end FASTQ files (let's call them 01_R1 and 01_R2). I need to make a new pair of FASTQ files containing only the read pairs that meet this requirement: the R2 read must lie within a certain distance (say, no more than 100 bp) of its mate in R1.
    Any suggestions on how to approach this?

  • #2
    Hello,

    I would say that most aligners have this option. Which aligner do you want to use?



    • #3
      Bowtie2 would be great! Thanks for the help!



      • #4
        With Bowtie2, the (very extensive) documentation mentions the following option:

        --al-conc <path>
        Write paired-end reads that align concordantly at least once to file(s) at <path>. These reads correspond to the SAM records with the FLAGS 0x4 bit unset and either the 0x40 or 0x80 bit set (depending on whether it’s mate #1 or #2). .1 and .2 strings are added to the filename to distinguish which file contains mate #1 and mate #2


        So as output you will have two FASTQ files containing the read pairs that align concordantly under the options you set.

        For the allowed range between mates, take a look at the --maxins option.

        Hope this helps,

        aubin
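The two options mentioned in #4 could be combined roughly as follows. This is only a minimal sketch: the index name, file names, and the helper `bowtie2_concordant_cmd` are hypothetical placeholders, and the value passed to --maxins is illustrative. Note that, per the Bowtie2 manual, --maxins limits the whole fragment length (both reads plus the gap between them), not only the gap.

```python
import shlex

def bowtie2_concordant_cmd(index, r1, r2, out_prefix, max_insert=500):
    """Assemble a bowtie2 command (as an argv list) that writes
    concordantly aligned pairs to separate FASTQ files.

    All names here are illustrative; only the bowtie2 flags are real.
    """
    return [
        "bowtie2",
        "-x", index,                          # basename of the Bowtie2 index
        "-1", r1,                             # mate 1 reads
        "-2", r2,                             # mate 2 reads
        "--maxins", str(max_insert),          # maximum fragment length for a valid pair
        "--al-conc", f"{out_prefix}.fastq",   # bowtie2 adds .1 / .2 to tell the mates apart
        "-S", f"{out_prefix}.sam",            # SAM output
    ]

# Hypothetical file names based on the original question.
cmd = bowtie2_concordant_cmd(
    "ref_index", "01_R1.fastq", "01_R2.fastq", "01_concordant", max_insert=300
)
print(shlex.join(cmd))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.run(cmd, check=True)` if Bowtie2 is installed.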



        • #5
          Thanks for the help, good sir. If I understand correctly, --al-conc <path> will produce a new pair of FASTQ files containing the reads that satisfy the --maxins option, and --maxins keeps only pairs whose reads are at most a given number of base pairs apart. The problem is that, if I understand correctly, the second read must have the opposite orientation to the first for the pair to be selected under --maxins. Is there a way to disable this requirement? In other words, I need to isolate reads that are within a certain range of their mate, no matter the orientation (both orientations should be considered valid).
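One possible way to get orientation-independent selection is to align first and then filter the SAM output on distance alone, ignoring the strand bits. The sketch below is an assumption-laden illustration, not a Bowtie2 feature: the function name `pairs_within_distance` is hypothetical, and "distance" is taken to mean the difference between the leftmost mapped positions of the two mates (SAM fields POS and PNEXT), which may not match the definition you need.

```python
def pairs_within_distance(sam_lines, max_dist=100):
    """Return the names of read pairs whose mates map to the same
    reference within max_dist bp of each other, on any strand.

    Assumption: distance = |PNEXT - POS| (leftmost positions).
    """
    keep = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete SAM record
            continue
        name, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, rnext, pnext = int(fields[3]), fields[6], int(fields[7])
        if not flag & 0x1:                # read is not paired
            continue
        if flag & 0x4 or flag & 0x8:      # this read or its mate is unmapped
            continue
        if flag & 0x900:                  # secondary or supplementary alignment
            continue
        if rnext not in ("=", rname):     # mates on different references
            continue
        # Strand bits 0x10 / 0x20 are deliberately not checked.
        if abs(pnext - pos) <= max_dist:
            keep.add(name)
    return keep

# Tiny made-up SAM records for illustration only.
example = [
    "@HD\tVN:1.6",
    "pairA\t65\tchr1\t100\t42\t50M\t=\t150\t0\t*\t*",   # same strand, 50 bp apart
    "pairB\t65\tchr1\t100\t42\t50M\t=\t900\t0\t*\t*",   # 800 bp apart
    "pairC\t73\tchr1\t100\t42\t50M\t=\t100\t0\t*\t*",   # mate unmapped
]
print(sorted(pairs_within_distance(example)))  # → ['pairA']
```

The resulting names could then be used to pull the matching records out of 01_R1 and 01_R2 into the new pair of FASTQ files.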
