
  • Isolate reads from paired-end FASTQ files

    Hi folks,
    I have paired-end FASTQ files (let's call them 01_R1 and 01_R2). I need to make a new pair of FASTQ files containing only the read pairs that meet this requirement: the R2 read must lie within a certain distance (say, no more than 100 bp) of its mate in R1.
    Any suggestions on how to approach this?

  • #2
    Hello,

    I would say that most aligners have an option for this. Which aligner do you want to use?



    • #3
      Bowtie2 would be great! Thanks for the help!



      • #4
        With Bowtie2, the (very extensive) documentation mentions the following option:

        --al-conc <path>
        Write paired-end reads that align concordantly at least once to file(s) at <path>. These reads correspond to the SAM records with the FLAGS 0x4 bit unset and either the 0x40 or 0x80 bit set (depending on whether it’s mate #1 or #2). .1 and .2 strings are added to the filename to distinguish which file contains mate #1 and mate #2


        As output you will get two FASTQ files containing the read pairs that align concordantly under the options you set.

        For the maximum distance between mates, take a look at the --maxins option.
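The two options mentioned above could be combined roughly as follows. This is a minimal sketch, not a tested pipeline: the index prefix `my_index` and all output file names are hypothetical, and the value 100 is taken from the question. Note that `--maxins` (short form `-X`) limits the whole fragment length, including both mates, rather than the gap between them, so the value may need adjusting for the read length.

```python
import shlex


def build_bowtie2_command(index_prefix, reads_1, reads_2,
                          max_fragment, concordant_out, sam_out):
    """Build a bowtie2 argument list that writes concordant pairs to FASTQ.

    All paths are supplied by the caller; nothing is executed here.
    """
    return [
        "bowtie2",
        "-x", index_prefix,               # prefix of an existing Bowtie2 index
        "-1", reads_1,                    # mate #1 FASTQ
        "-2", reads_2,                    # mate #2 FASTQ
        "--maxins", str(max_fragment),    # maximum fragment length for a valid pair
        "--al-conc", concordant_out,      # concordantly aligned pairs go here
        "-S", sam_out,                    # regular SAM output
    ]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    cmd = build_bowtie2_command("my_index", "01_R1.fastq", "01_R2.fastq",
                                100, "01_concordant.fastq", "01.sam")
    # Print the command so it can be inspected before running it in a shell,
    # or pass `cmd` to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```

Building the command as a list keeps each argument separate, so file names with spaces are passed safely if the list is handed to `subprocess.run`.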

        Hope this helps.

        aubin



        • #5
          Thanks for the help, good sir. If I understand correctly, --al-conc <path> will produce a new pair of FASTQ files containing the reads that satisfy the --maxins option, and --maxins keeps only pairs whose mates are separated by at most the given number of base pairs. The problem is that, if I understand correctly, the second read must have the opposite orientation to the first for the pair to be accepted under --maxins. Is there a way to turn off this requirement? In other words, I need to isolate read pairs whose mates are within a certain distance of each other, no matter the orientation (both orientations should be considered valid).
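One possible way to make the selection independent of orientation is to align without relying on the concordance filter, then post-filter the SAM output on mate positions and pull the matching records out of the original FASTQ files. The sketch below assumes several things not stated in the thread: "distance" is taken as the difference between the leftmost mapped positions of the two mates (SAM fields POS and PNEXT), both mates must map to the same reference, and read names in the FASTQ match those in the SAM once an optional trailing `/1` or `/2` is removed. The function names are hypothetical.

```python
def close_pair_names(sam_lines, max_distance=100):
    """Return names of pairs whose mates map within max_distance bp.

    Strand flags are deliberately ignored, so any orientation is accepted.
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        name, flag = fields[0], int(fields[1])
        pos, rnext, pnext = int(fields[3]), fields[6], int(fields[7])
        if flag & 0x4 or flag & 0x8:      # read or its mate is unmapped
            continue
        if flag & 0x900:                  # secondary or supplementary record
            continue
        if rnext != "=":                  # mates on different references
            continue
        if abs(pos - pnext) <= max_distance:
            names.add(name)
    return names


def filter_fastq(fastq_lines, keep_names):
    """Yield the lines of 4-line FASTQ records whose name is in keep_names."""
    record = []
    for line in fastq_lines:
        record.append(line)
        if len(record) == 4:
            name = record[0][1:].split()[0]   # drop '@' and any description
            if name.endswith("/1") or name.endswith("/2"):
                name = name[:-2]              # drop mate suffix if present
            if name in keep_names:
                yield from record
            record = []
```

A usage sketch, with hypothetical file names: open the SAM file and pass the file object to `close_pair_names`, then run `filter_fastq` once on `01_R1.fastq` and once on `01_R2.fastq` with the resulting set, writing each result to a new file. Because the same set of names is applied to both files, the two outputs stay paired.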

