
  • Orientation of 454 paired end reads split by linker

    Hi,

    I extracted reads from SFF files.

    Then I searched these reads for the Titanium linker.

    1) Why does only a small proportion of my reads contain the linker? My library is 20 kb.

    2) After splitting by the linker, I get a pair of reads. Is the orientation f-><-r or f->f->?

    Thanks

  • #2
    Originally posted by skblazer:
    Hi,

    I extracted reads from SFF files.

    Then I searched these reads for the Titanium linker.

    1) Why does only a small proportion of my reads contain the linker? My library is 20 kb.
    Your circularized DNAs should be ~20 kbp, and they are then shattered into 500-800 bp fragments. This means that far more fragments lack the linker than contain it. The biotin binding is meant to enrich your fragment pool for the linker-containing pieces, but unfortunately the enrichment process is sometimes not very selective. This results in a lot of reads that do not contain the linker and thus are not paired-end reads. I have seen very low percentages of true paired ends in some of our preps as well.
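
    As a rough back-of-the-envelope illustration of the numbers above, the sketch below estimates what fraction of fragments would overlap the linker before any enrichment. It assumes purely random shearing, so a fragment covers the linker with probability of about fragment length divided by circle length, and it ignores the biotin enrichment step entirely. The function name is hypothetical; the 20 kbp and 500-800 bp figures come from the post.

```python
def expected_linker_fraction(fragment_bp, circle_bp):
    """Approximate fraction of randomly sheared fragments that overlap the linker.

    Assumption: shearing is random, so a fragment of length fragment_bp covers
    one given point on a circle of length circle_bp with probability
    fragment_bp / circle_bp (capped at 1).
    """
    return min(1.0, fragment_bp / circle_bp)


# Fragment sizes and circle size taken from the post above.
for fragment_bp in (500, 800):
    fraction = expected_linker_fraction(fragment_bp, 20_000)
    print(f"{fragment_bp} bp fragments: ~{fraction:.1%} overlap the linker")
```

    Under these assumptions only about 2.5-4% of the unenriched fragments carry the linker, which is why the selectivity of the enrichment step matters so much for a 20 kb library.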

    2) After splitting by the linker, I get a pair of reads. Is the orientation f-><-r or f->f->?

    Thanks
    They will be in the f-> f-> orientation but their order relative to their genomic positions will be reversed. To illustrate:

    Within the read, call the two halves of the paired read L and R (left and right):
    Code:
    ================================^^^^^^^^^^^^^^^=======================
    Read-L                             linker      Read-R
    After removing the linker, splitting the read, and aligning (or assembling) the two halves, they should be oriented as follows, with a distance of ~20 kbp between them:

    Code:
    Read-R                                                   Read-L
    -------->                                                -------->
    ==================================================================
    Of course if the reads match the bottom strand of the reference they will be flipped around.
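
    The orientation described above can be reproduced with a toy model. This is only a sketch under stated assumptions: circularization is modeled as joining the tail of a genomic fragment to its head through the linker, the linker is found by exact string matching, and `TOY_LINKER` and `REFERENCE` are made-up placeholder sequences (not the real 454 linker or any real genome). The function names are hypothetical.

```python
def simulate_paired_read(reference, start, span, half_len, linker):
    """Model a read across the linker junction of a circularized fragment.

    Circularization joins the fragment's tail to its head via the linker,
    so the read runs: tail -> linker -> head, all on the same strand.
    """
    fragment = reference[start:start + span]
    head = fragment[:half_len]
    tail = fragment[-half_len:]
    return tail + linker + head


def split_by_linker(read, linker):
    """Split a read at the first exact linker match; None if no linker found."""
    pos = read.find(linker)
    if pos == -1:
        return None
    return read[:pos], read[pos + len(linker):]


# Placeholder sequences for illustration only.
REFERENCE = "AACCGGTTACGATCGTAGCTAGGCTTAACG"
TOY_LINKER = "GATTACAGATTACA"

read = simulate_paired_read(REFERENCE, start=0, span=len(REFERENCE),
                            half_len=5, linker=TOY_LINKER)
read_l, read_r = split_by_linker(read, TOY_LINKER)

# Both halves match the forward strand, but Read-R lies upstream of Read-L.
print("Read-L", read_l, "maps at", REFERENCE.find(read_l))
print("Read-R", read_r, "maps at", REFERENCE.find(read_r))
```

    In this toy case Read-L maps at position 25 and Read-R at position 0, both on the forward strand, matching the diagram. Real reads contain sequencing errors, so an actual pipeline would likely need approximate rather than exact linker matching.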



    • #3
      Many thanks for your kind help, kmcarr.



      • #4
        I have some paired-end data, but I don't know the linker sequence or the insert size. Could you tell me where I can find them? Many thanks.



        • #5
          Have a look at this thread (http://seqanswers.com/forums/showthread.php?t=12940) for linker sequences. You will have to ask the person who constructed the library for insert size information.

          P.S. There is no reason to shout (using large, bold font) in this forum; we can read the normal typeface just fine.



          • #6
            Thanks, kmcarr. I read the thread but didn't find the linker sequence. I guessed that the internal adaptor for 454 sequencing might be the same as the Illumina sequencing adaptor, which is why I asked the question again.
            Maybe I can find it after aligning all the paired-end reads.
            P.S. This is the first time I have asked a question on this site, and I had no idea about the font size. I didn't mean to shout; that is your interpretation.
            Thanks again.



            • #7
              aurora_Jing,

              Are you asking about 454 paired end reads, Illumina paired end or Illumina mate-pair? You asked your question in a thread specifically about 454 paired end reads so naturally I assumed that was the data you were asking about. The thread I pointed you to clearly has the linker sequences for 454 paired end libraries in the first and second posts.

              Please provide more detail about what type of read data you have (sequencing platform and library construction type) so we can better help you.



              • #8
                Yes, I am now working with 454 mate-pair data.
                I found the linker sequences in the posts you kindly pointed me to. I had mistaken the thread you linked for the one I read yesterday.
                Thanks again for your quick and kind reply.



                • #9
                  What is the usual percentage of true PE reads in a 20 kb prep?
                  I've done several 3 kb preps but never 8 or 20 kb. I believe we get 50-60% true PE reads in our 3 kb preps.
