  • Illumina Mate Pair prep

    Hello,

    I was writing to ask a question of those who have, seemingly successfully, worked with Illumina mate pair data. I have seen other threads in which people mention that reverse complementing the reads is a necessary prerequisite to ensure the reads are facing the correct direction. Is this true? If so, is there a tool that someone could recommend to easily reverse complement a fastq or sequence.txt file?

    If you have any other suggestions for dealing with mate pair data, or tools to recommend for those tasks, I'd greatly appreciate it.

    Cheers,
    John

  • #2
    Use the FASTX-Toolkit. Its reverse-complement tool (fastx_reverse_complement) works very well. And yes, this step is needed if you want to run a pseudo-paired-end analysis with aligners like BWA. That approach seems to work pretty well. I also tried Novoalign, which doesn't require the reverse-complement step, but it is very slow in comparison and only marginally improves the outcome.
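
For anyone who wants to see what the reverse-complement step does to a FASTQ file, here is a minimal Python sketch. It is not the FASTX-Toolkit implementation; the function names `reverse_complement` and `revcomp_fastq` are made up for illustration, and it assumes plain four-line FASTQ records (no wrapped sequences). The quality string is reversed too, so each score stays attached to its base.

```python
import io

# Complement table for upper- and lower-case bases; N stays N.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def revcomp_fastq(src, dst):
    """Reverse-complement every record read from src and write it to dst.

    Assumption: each record is exactly four lines
    (header, sequence, '+' line, qualities).
    """
    while True:
        header = src.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # skip stray blank lines between records
        seq = src.readline().rstrip("\n")
        plus = src.readline().rstrip("\n")
        qual = src.readline().rstrip("\n")
        dst.write(header.rstrip("\n") + "\n")
        dst.write(reverse_complement(seq) + "\n")
        dst.write(plus + "\n")
        # Reverse the qualities so they line up with the reversed bases.
        dst.write(qual[::-1] + "\n")


if __name__ == "__main__":
    demo = io.StringIO("@r1\nACCTAAAA\n+\nIIIIHHHH\n")
    out = io.StringIO()
    revcomp_fastq(demo, out)
    print(out.getvalue(), end="")
```

For real files, the same function can be called with two handles from `open(...)` instead of the in-memory `StringIO` objects used in the demo.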



    • #3
      I'd like to be sure I understand the orientation of paired-end reads for Illumina.

      Imagine you ignored the middle and wanted to concatenate the paired ends together so that they read 5'-3' from the same strand. You leave PE1 as it is. Do you concatenate the reverse or the reverse complement of PE2?

      Any help would be great. Thanks all.



      • #4
        Originally posted by Protaeus
        I was writing to ask a question of those who have, seemingly successfully, worked with Illumina mate pair data. I have seen other threads in which people mention that reverse complementing the reads is a necessary prerequisite to ensure the reads are facing the correct direction...
        Whether or not you need to rev-comp one of the mate reads would depend on what alignment program you are using. I can tell you that Bowtie can deal with the sequences directly, you just have to tell it what to expect. There are three options you can use with bowtie:

        Code:
        --fr  Appropriate for Illumina paired end reads (default)
        --rf  Appropriate for Illumina mate paired reads
        --ff  Appropriate for SOLiD paired reads (becomes default when you specify colorspace reads using the -C option)
        If you plan to use bowtie to align your mate paired reads you do not have to rev-comp anything, just use the --rf option when you run bowtie.
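
As a rough sketch of what such a run could look like, the Python helper below only builds the command line; it does not run anything. The helper name `bowtie_mate_pair_cmd` and all file names are hypothetical. It assumes bowtie 1 with an already built index, and uses `-1`/`-2` for the two mate files, `-S` for SAM output, and `-X` to raise the maximum insert size, which will probably be needed because mate pair inserts are long.

```python
def bowtie_mate_pair_cmd(index, reads_1, reads_2, out_sam, max_insert=None):
    """Build an argument list for aligning mate pair reads with bowtie.

    All arguments are caller-supplied paths or values; nothing here
    checks that the files or the index actually exist.
    """
    cmd = ["bowtie", "--rf", "-S"]  # --rf: mates point away from each other
    if max_insert is not None:
        cmd += ["-X", str(max_insert)]  # maximum insert size to accept
    cmd += [index, "-1", reads_1, "-2", reads_2, out_sam]
    return cmd


if __name__ == "__main__":
    # Hypothetical index and file names, for illustration only.
    cmd = bowtie_mate_pair_cmd(
        "my_genome", "mates_1.fastq", "mates_2.fastq", "mates.sam",
        max_insert=5000,
    )
    print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real data.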



        • #5
          Any clue whether I would need to do so with BFAST?



          • #6
            Great, the assembly programs know how to handle Illumina PE reads. But how should I understand them when I see two paired end reads on a screen:

            >PE1
            CTTACCCC
            >PE2
            ACCTAAAA

            If one strand of one insert were like a sentence, do I read it from left to right on PE1, skip over the unknown stuff, and finish the insert from right to left on PE2 (like this)?

            CTTACCCC-----------------------AAAATCCA

            or is it actually the rev-com of PE2 (like this)?

            CTTACCCC-----------------------TTTTAGGT

            Thanks!

            Protaeus - I apologize if my question belongs in another post! Thanks for bringing up PE orientation today.
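
The two candidates in this post can be reproduced with a few lines of Python, which at least makes the difference between "reverse" and "reverse complement" concrete. This is only an illustrative sketch using the toy reads from the post. For standard Illumina paired-end libraries (the `--fr` orientation mentioned in #4), read 2 is sequenced from the opposite strand, so the reverse complement is the version that sits on the same strand as PE1; mate pair libraries are a different case.

```python
# Complement table for the four bases used in the toy example.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

pe1 = "CTTACCCC"
pe2 = "ACCTAAAA"

# Candidate 1: PE2 simply read backwards.
reversed_only = pe2[::-1]

# Candidate 2: PE2 complemented base by base, then read backwards.
rev_comp = pe2.translate(_COMPLEMENT)[::-1]

gap = "-" * 23  # stands in for the unsequenced middle of the insert

print(pe1 + gap + reversed_only)
print(pe1 + gap + rev_comp)
```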



            • #7
              Hey Bronnyd,

              No problem, though hopefully the responders make it clear to whom they're responding! The mate pair prep and the paired end prep are quite different. In general, I don't think you have to worry about orientation with paired end data, at least with most alignment tools. The mate pair prep differs significantly, and the resulting reads have orientation issues that I think must be handled by pre-processing before alignment. Guess we'll find out...

              John
