  • Orientation of mate-pairs

    It seems our MiSeq mate-pair data isn't oriented as 5'--3' 3'--5'.

    In a sequence that would be
    5'-------------3'
    We see

    5'----- 3'-----

    Additionally, each FASTQ file contains a mix of both mate-pair ends.
    Is there a tool which will reverse one of these FASTQ files to get

    5'----- -----3'

    ??

    Thanks,
    J

  • #2
    How did you prepare your library? What file are you looking at? What application are these being used for?

    Do you really mean mate pairs, which in the Illumina and Ion Torrent worlds are prepared by circularizing DNA and removing a large portion of it, or paired ends (the more conventional reads generated from Nextera or TruSeq libraries, with no circularization step)?

    Standard FASTQ files from paired end data report each read in its forward orientation. FLASH is an excellent tool for merging them if they overlap. If they don't overlap, I think in general you are better off leaving them separate (not reverse-complementing one & adding a string of N), as most tools can use that more profitably.

    Depending on how you prepared your template, the reads may or may not be expected to be oriented relative to some reference. An aligner (e.g. Bowtie, BWA) will align them in either direction to the reference.


    • #3
      Sorry, yes paired-ends, and directionality was not incorporated into the library prep.

      The paired ends are either end of a short PCR amplicon.
      Reverse complementing the reverse (second) sequence would work; however, because no directionality was employed, both FASTQ files contain 5'--3' and 3'--5' sequences. Essentially, what I need to do is separate sequences based on priming sequence (i.e. forward and reverse). Then I can reverse complement all the reverse (3'--5') sequences.

      Is there a tool to filter out sequences based on a known string of bases?

      Cheers,

      J
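A minimal Python sketch of the splitting step described in this post: label each read by the primer it starts with, then reverse-complement the reverse-primer reads. It assumes the primer sits at the very start of the read and matches exactly (no barcode in front, no sequencing errors). The primer sequences and function names are made up for illustration.

```python
# Hypothetical primers, for illustration only; substitute your own.
FWD_PRIMER = "ACGTACGTAC"
REV_PRIMER = "TTGACCGGTA"

# Translation table mapping each base to its complement (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify(seq, fwd=FWD_PRIMER, rev=REV_PRIMER):
    """Label a read 'fwd', 'rev' or 'unknown' by its leading primer (exact match)."""
    if seq.startswith(fwd):
        return "fwd"
    if seq.startswith(rev):
        return "rev"
    return "unknown"


def to_forward_strand(seq, fwd=FWD_PRIMER, rev=REV_PRIMER):
    """Reverse-complement reads that start with the reverse primer.

    Returns None when neither primer is recognised.
    """
    label = classify(seq, fwd, rev)
    if label == "fwd":
        return seq
    if label == "rev":
        return reverse_complement(seq)
    return None
```

When applying this to whole FASTQ records, the quality string has to be reversed along with the sequence so that each score stays attached to its base.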


      • #4
        Do you think a simple Perl script using grep could do this?


        • #5
          I'm not sure that you need to bother. BWA will just align them; I don't think it will care if some of them are not oriented right.

          But that result is very strange. I'd stop and make sure that there isn't some serious error in your experiment or analysis, because it is not normal to have a whole lot of reads both in the same orientation.


          • #6
            There are a number of tools for dealing with amplicon sequencing, though often with fusion primers (sequencing adapters built into primers, so everything is oriented). E.g. PANDA.

            If it is a short amplicon, then FLASH will merge the reads but won't help you with orienting.

            Trying to use grep/Perl regexps for this is problematic, as your reads will probably contain errors, which aligners tolerate. But since you have two bites at the apple, you might attempt it (i.e. try matching first one end, then the other, to determine orientation). I would just use an aligner to solve the problem.
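The "two bites" idea can be sketched in Python roughly as follows: test each mate against both primers while allowing a few mismatches, and accept the pair if the evidence points one way. This is only an illustration, not a replacement for an aligner. The function names and the `max_mismatches` default are assumptions, and Hamming distance handles substitutions only, not insertions or deletions.

```python
def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def starts_with(read, primer, max_mismatches=2):
    """True if the read begins with the primer, allowing a few mismatches."""
    if len(read) < len(primer):
        return False
    return hamming(read[:len(primer)], primer) <= max_mismatches


def pair_orientation(read1, read2, fwd, rev, max_mismatches=2):
    """Decide which mate carries the forward primer.

    Returns 'fwd_first', 'rev_first' or 'ambiguous'. Either mate alone can
    settle the question, so a pair is ambiguous only when no primer is
    recognised or the two mates contradict each other.
    """
    votes = set()
    if starts_with(read1, fwd, max_mismatches):
        votes.add("fwd_first")
    if starts_with(read1, rev, max_mismatches):
        votes.add("rev_first")
    if starts_with(read2, rev, max_mismatches):
        votes.add("fwd_first")
    if starts_with(read2, fwd, max_mismatches):
        votes.add("rev_first")
    return votes.pop() if len(votes) == 1 else "ambiguous"
```

Because one recognisable mate is enough, a pair is still classified when the primer region of the other mate is too damaged to match.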


            • #7
              Thanks for the input.
              From the same MiSeq run I have RADseq data that produced two FASTQ files: one contained all forward reads, and the second contained reads that were all in the same orientation and needed to be reverse complemented.

              The data that seems to be problematic used the TruSeq amplicon library prep.
              Each FASTQ file shows a mixture of read orientations:

              Fastq 1
              BARCODE--F.PRIMER--NNNN
              BARCODE--R.PRIMER--NNNN

              Fastq 2
              BARCODE--R.PRIMER--NNNN
              BARCODE--F.PRIMER--NNNN

              I'm guessing that the programs designed to join overlapping PE reads will not be able to take into consideration the mixture of Fwd and Rev sequences in each file?
              Is this normal, or should I be seeing all Fwd and Rev primer sequences in separate FASTQ files?

              Sorry if my question is repetitive/obvious to answer. New to Illumina data.

              Cheers,
              J
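One way to untangle files laid out like the two above, sketched in Python: read both FASTQ files in lockstep and swap the mates of a pair whenever the forward primer turns up in the second file, so that output 1 holds only forward-primer reads and output 2 only reverse-primer reads. The sketch assumes four-line FASTQ records, mates stored in the same order in both files, exact primer matches, and a barcode of at most `max_offset` bases before the primer. All names and defaults here are illustrative.

```python
def read_fastq(handle):
    """Yield (header, sequence, quality) from a four-line-per-record FASTQ handle."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header, seq, qual


def has_primer(seq, primer, max_offset=12):
    """True if the primer starts (exact match) within the first max_offset bases."""
    return seq.find(primer, 0, max_offset + len(primer)) != -1


def sort_pairs(in1, in2, out1, out2, fwd, rev, max_offset=12):
    """Write pairs so out1 gets forward-primer mates and out2 reverse-primer mates.

    Pairs whose primers cannot both be identified are skipped.
    Returns (kept, skipped).
    """
    kept = skipped = 0
    for rec1, rec2 in zip(read_fastq(in1), read_fastq(in2)):
        if has_primer(rec1[1], fwd, max_offset) and has_primer(rec2[1], rev, max_offset):
            first, second = rec1, rec2
        elif has_primer(rec1[1], rev, max_offset) and has_primer(rec2[1], fwd, max_offset):
            first, second = rec2, rec1  # mates are swapped in the input
        else:
            skipped += 1
            continue
        for out, (header, seq, qual) in ((out1, first), (out2, second)):
            out.write(f"{header}\n{seq}\n+\n{qual}\n")
        kept += 1
    return kept, skipped
```

The handles can come from `open(...)` on the two input and two output files. Swapped records keep their original headers, so any read-number tag in the header will no longer match the file it ends up in; whether that matters depends on the downstream tool.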
