
  • Use SAM file to pull reads from FASTQ

    Hi Folks,

    I have a SAM file with unpaired reads (originally from a FASTQ) and I would like to use it to pull the read and its pair from the FASTQ file - does anyone know if there is a script out there to do this?

    I have used the Picard tools SamToFastq but to my knowledge there is not a script in Picard or SamTools to do exactly what I described here (or maybe there is and I just haven't found it!).

    Thank you!

  • #2
    If the reads are unpaired, how can you pull their mate?

    Just work out the read names you want, then write a short script to fish those reads out of the FASTQ.
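
    A minimal sketch of that kind of script, assuming an uncompressed SAM file, a plain 4-line-per-record FASTQ, and read names that match once any trailing /1 or /2 is stripped. The function names and file names here are made up for illustration.

```python
import itertools


def read_names_from_sam(sam_path):
    """Return the set of read names (QNAME, column 1) found in a SAM file."""
    names = set()
    with open(sam_path) as sam:
        for line in sam:
            if line.startswith("@"):  # SAM header line, not an alignment
                continue
            names.add(line.rstrip("\n").split("\t", 1)[0])
    return names


def fastq_records(fastq_path):
    """Yield (name, record_text) for each 4-line FASTQ record."""
    with open(fastq_path) as fq:
        while True:
            record = list(itertools.islice(fq, 4))
            if not record:
                break
            fields = record[0][1:].split()
            if not fields:  # blank line, e.g. at the end of the file
                continue
            name = fields[0]
            # Assumption: mates are marked with a /1 or /2 suffix, if at all.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            yield name, "".join(record)


def extract_reads(names, fastq_in, fastq_out):
    """Copy records whose name is in `names`; return how many were written."""
    kept = 0
    with open(fastq_out, "w") as out:
        for name, record in fastq_records(fastq_in):
            if name in names:
                out.write(record)
                kept += 1
    return kept
```

    Calling `extract_reads(read_names_from_sam("singletons.sam"), "reads_1.fastq", "picked_1.fastq")` and then the same for the second FASTQ file would give both mates of every named read, provided the two FASTQ files use the same read names.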

    • #3
      ^ Good point... I have a feeling there is some miswording in the question.

      I'd try to trim the file down with some form of filtering (maybe samtools or bamtools) and then use one of the SAM/BAM-to-FASTQ conversion scripts.
      /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
      Salk Institute for Biological Studies, La Jolla, CA, USA */

      • #4
        Yes, sorry for the unclear wording. The SAM file is the result of mapping paired-end reads to a reference. I have one SAM file of pairs in which both mates mapped, and converting that to FASTQ worked great. But I also have a SAM file of reads that mapped while their mate did not. I would like to use this second file to pull those mapped reads, together with their unmapped mates, from the original FASTQ files.

        Ideally the output would be these pairs in a FASTQ file.

        • #5
          You should be able to extract those alignments as long as the aligner you used set the flags correctly. The unmapped mates will have 0x4 set and the mapped mates should have 0x8 set. You might need to name-sort the BAM first, but then you could pull out only those reads with this:
          Code:
          samtools view -f 0xC -b alignments.bam > singletons.bam
          Then convert that BAM file to FASTQ.
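
          One thing worth checking before running the command above: as far as I understand the samtools documentation, -f keeps a read only when all of the listed bits are set, and -F drops a read when any of the listed bits is set. The sketch below mimics that logic in plain Python so the effect of a mask like 0xC can be tried on a few FLAG values. The function name and the example flags are illustrative, not part of samtools.

```python
READ_UNMAPPED = 0x4  # this read did not align
MATE_UNMAPPED = 0x8  # this read's mate did not align


def passes_flag_filter(flag, require=0, exclude=0):
    """Mimic samtools view -f/-F: every `require` bit set, no `exclude` bit set."""
    return (flag & require) == require and (flag & exclude) == 0


# Example FLAG values (0x1 = paired, 0x40 = first in pair, 0x80 = second in pair).
examples = {
    "mapped read, unmapped mate": 0x1 | MATE_UNMAPPED | 0x40,                 # 73
    "unmapped read, mapped mate": 0x1 | READ_UNMAPPED | 0x80,                 # 133
    "both mates unmapped": 0x1 | READ_UNMAPPED | MATE_UNMAPPED | 0x40,        # 77
}

for label, flag in examples.items():
    print(label, flag, passes_flag_filter(flag, require=0xC))
```

          Under this reading, only the "both mates unmapped" example passes a required mask of 0xC. Requiring 0x8 while excluding 0x4 selects the mapped read of a singleton pair, and requiring 0x4 while excluding 0x8 selects its unmapped mate.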

          • #6
            Actually that will also pull out all unaligned reads in addition to your singleton alignments. So more filtering will be necessary. Pairs make this tricky because the SAM annotation of pairs is messy.
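
            A rough sketch of that extra filtering, assuming an uncompressed SAM file whose aligner set the pairing flags as the SAM specification describes: keep a read name only when exactly one of the 0x4 (read unmapped) and 0x8 (mate unmapped) bits is set, which leaves out pairs where neither mate aligned. The function name is hypothetical, and secondary or supplementary alignments are not treated specially.

```python
def singleton_pair_names(sam_path):
    """Return names of paired reads where exactly one mate is mapped."""
    names = set()
    with open(sam_path) as sam:
        for line in sam:
            if line.startswith("@"):  # SAM header line
                continue
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:  # malformed or blank line
                continue
            flag = int(fields[1])
            if not flag & 0x1:  # read is not part of a pair
                continue
            read_unmapped = bool(flag & 0x4)
            mate_unmapped = bool(flag & 0x8)
            # Exactly one of the two bits set: one mate aligned, the other did not.
            if read_unmapped != mate_unmapped:
                names.add(fields[0])
    return names
```

            The resulting set of names could then be used to fish both mates out of the original FASTQ files, as suggested in #2.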