  • Mate reads to fastq

    Hi,

    I am trying to extract mate reads from a SAM file with the following flags: samtools view -b -S -f 8 -F 4 output.sam > mate.bam

    and then with bam2fastq: bam2fastq -o mate#.fastq -f mate.bam

    Unfortunately, I get this error message:


    This looks like paired data from lane 0.
    Output will be in unmapped_1.fastq and unmapped_2.fastq
    1 sequences in the BAM file
    1 sequences exported
    WARNING: 1 reads could not be matched to a mate and were not exported


    Maybe someone can help me out?

    Thanks,
    Tomi

  • #2
    Is there really just one read in the original SAM file, and therefore just 1 in the BAM file?


    • #3
      Hi,

      thank you for your reply.
      No, both reads are in the SAM file (reverse and forward).

      Greetings,
      Tomi


      • #4
        Could you share the (start of the) SAM file then? If you post it here, use the [ code ] tags - e.g. via the # icon on the advanced editor.


        • #5
          BWA Sampe Output

          Code:
          @SQ     SN:gi|157704448|ref|AC_000133.1|        LN:219475005
          @PG     ID:bwa  PN:bwa  VN:0.5.9-r16
          testSample_0_1  73      gi|157704448|ref|AC_000133.1|   1       37      75M     =       1       0       ATTGACAAGGGGAGGGAAAAGAGGAACAGAAATTCTTTTCTAT$
          testSample_0_2  133     gi|157704448|ref|AC_000133.1|   1       0       *       =       1       0       ATACCCAGGATTTTACCTGTAAAAGTACCCTCAGGTCGTGATT$
          testSample_1_1  77      *       0       0       *       *       0       0       ATCGTCAATAGGGTACTACTTCCATAATTTTGTAAATCCGCATGTTCCTCGAATAATAACGTGGAAC$
          testSample_1_2  141     *       0       0       *       *       0       0       AATGGAAGACCAGATGATTCTACTATATGACTCCAGACTAAGACAAATGCTAGGCTTTGAAAACGCA


          • #6
            My guess is that the program is picky about the read names. You'd have to check the source itself to be sure.

            Either the names of read 1 and read 2 should be the same, or read 1 should end in '/1' and read 2 in '/2'. Try those. The Picard suite also has a program to convert BAM files to FASTQ.
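As a rough illustration of the renaming suggested above, the sketch below strips a trailing mate suffix so that both reads of a pair share one template name. The helper `template_name` and the suffix pattern are assumptions made for this example, not part of bam2fastq or Picard, and the pattern is only safe when every read name actually carries a mate suffix.

```python
import re

# Hypothetical pattern: a trailing "/1", "/2", "_1" or "_2" marks the mate.
_MATE_SUFFIX = re.compile(r"[/_][12]$")


def template_name(read_name: str) -> str:
    """Return the read name with one trailing mate suffix removed."""
    return _MATE_SUFFIX.sub("", read_name)


# Names in the style of the SAM excerpt posted above.
names = ["testSample_0_1", "testSample_0_2", "testSample_1_1", "testSample_1_2"]
for name in names:
    print(name, "->", template_name(name))
```

Applying this to a name that has no mate suffix but happens to end in `_1` would trim too much, so it is best run once on the raw FASTQ names before mapping.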


            • #7
              As swbarnes2 points out, for a SAM/BAM file both parts of a pair of reads should be recorded with the same template name in column 1 (the suffix /1 or /2 or whatever can optionally be recorded in the tags). The FLAG in column 2 specifies which is the first read and which is the second.

              Your filtered SAM file has four unique identifiers in column 1, therefore no complete pairs.

              Double-check the filter options you are using with samtools view...
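To make the role of the FLAG column concrete, here is a small sketch that decodes the flag values from the SAM excerpt in post #5 into the bit names defined by the SAM specification. The function name `decode_flag` and the short labels are chosen for this example only.

```python
# Bit meanings as defined in the SAM format specification.
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag: int) -> list[str]:
    """List the names of all bits set in a SAM FLAG value."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


# Flags taken from the posted SAM records.
for flag in (73, 133, 77, 141):
    print(flag, decode_flag(flag))
```

For example, 73 decodes to paired, mate_unmapped and first_in_pair, while 133 decodes to paired, unmapped and second_in_pair, so the two records are meant to be the two ends of one template.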


              • #8
                Hi, thank you very much for your replies.

                Yes, indeed the names were wrong, so I corrected it.
                Basically, I created a sample where I took the first 75 bp of a reference chromosome, then skipped a specific insert size, took the next 75 bp, and made the reverse complement of it.

                Fortunately, I now get the correct flags for the mapped reads, but I am still not able to export the pairs where one read is mapped and the other is not. I mutated the second read at a high mutation rate to make sure that BWA cannot map it.

                Do you have any idea why? Is this not possible? I tried it with samtools like this:
                samtools view -b -S -f 8 -F 4 output.sam > mate.bam

                and then with bam2fastq: bam2fastq -o mate#.fastq -f mate.bam

                bam2fastq reports the correct number of mate reads (in this case one), but I still get the warning I described above.


                Here again is the SAM output:
                Code:
                @SQ     SN:gi|157704448|ref|AC_000133.1|        LN:219475005
                @PG     ID:bwa  PN:bwa  VN:0.5.9-r16
                testSample_0    73      gi|157704448|ref|AC_000133.1|   1       37      75M     =       1       0       ATTGACAAGGGGAGGGAAAAGAGGAACAGAAATTCTTTTCTAT$
                testSample_0    133     gi|157704448|ref|AC_000133.1|   1       0       *       =       1       0       TGGCTCTAACAGGCCACGATGGAATAGTCAATAATCACCTCTT$
                testSample_1    99      gi|157704448|ref|AC_000133.1|   76      60      75M     =       226     225     AAATCCAGTTTGTGCCTACGGACATAATCTTTGAATTTGCTTT$
                testSample_1    147     gi|157704448|ref|AC_000133.1|   226     60      75M     =       76      -225    AATAGATTTTCAAATAAGAAAATGAGAGGACATGAGCTTGAGG$
                testSample_2    99      gi|157704448|ref|AC_000133.1|   301     60      75M     =       451     225     CTGACGACCTCCACGTGATTTCAACAATGATTTCAAATATTTC$
                testSample_2    147     gi|157704448|ref|AC_000133.1|   451     60      75M     =       301     -225    TATAATCTATTGGCCATTCACAGCATAGCGTATAAACCTAGCT$
                Thank you very much
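One way to see what the filter in this post selects is to replay it on the flags listed above. In `samtools view`, `-f` keeps only records with all of the given bits set and `-F` drops records with any of the given bits set. The sketch below mimics that with plain bit tests; `passes` and `one_end_unmapped` are hypothetical helpers for illustration, not samtools functions.

```python
def passes(flag: int, require: int = 0, exclude: int = 0) -> bool:
    """Mimic samtools view -f REQUIRE -F EXCLUDE on a single FLAG value."""
    return (flag & require) == require and (flag & exclude) == 0


def one_end_unmapped(flag: int) -> bool:
    """True when exactly one of 'read unmapped' (4) and 'mate unmapped' (8) is set."""
    return bool(flag & 4) != bool(flag & 8)


# (read name, FLAG) pairs copied from the SAM output in this post.
records = [
    ("testSample_0", 73), ("testSample_0", 133),
    ("testSample_1", 99), ("testSample_1", 147),
    ("testSample_2", 99), ("testSample_2", 147),
]

kept = [(name, flag) for name, flag in records if passes(flag, require=8, exclude=4)]
print(kept)  # [('testSample_0', 73)]

both_ends = [(name, flag) for name, flag in records if one_end_unmapped(flag)]
print(both_ends)  # [('testSample_0', 73), ('testSample_0', 133)]
```

Under this reading, `-f 8 -F 4` keeps only the mapped read of the pair (flag 73) and drops its unmapped mate (flag 133, which has bit 4 set). That would be consistent with bam2fastq seeing a single sequence and warning that it could not be matched to a mate. Keeping both ends would need a selection like `one_end_unmapped`, for instance by combining two filter passes; this is a suggestion under the assumptions above rather than a confirmed fix.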
