  • Merging illumina V4 paired end reads

    Hi

    I am having difficulty understanding how my merged reads are producing a certain amplicon size.

    Basically, I have 2 x 251bp reads. Each 251bp read includes the primer sequence, which as far as I understand is 20bp.

    When these 251bp reads are merged, they produce an amplicon size of 291bp.

    Here is an example of a merged read with an amplicon length of 291bp (there are two N's in the sequence, as I haven't run screen.seqs yet):

    >M01822_319_000000000-AG4CF_1_1101_16954_1171
    GTGCCAGCCGCCGCGGTAATACATAGGATGCAAGCGTTATCCGGATTTACTGGGCGTAAAGCGAGCGCAGGCGGATTTACAAGTCTGATGTTAAAGACAACTGCTTAACGGTTGTTTGCATTGGAAACTGTAAGTCTAGAGTATAGTAGAGAGTTTTGGAACTCCATGTGGAGCGGTGGAATGCGTAGATATATGGAAGAACACCAGAGGCGAAGGCGAAAACTTAGGCTATAACTGACGCTTAGGCTCGAAAGTGTGGGNAGCAAATAGGATTAGATACCCCGGTAGTCN

    I have looked at the make.contigs report file, and it seems to report the following (if I am understanding correctly):

    Length = 291bp
    Overlap length = 211 bp
    Total primers = 40bp

    Therefore, is the read length 251bp, but merged read length 291bp (as forward and reverse primers included)?

    What I don't understand is that each primer length is 20bp, so should the amplicon not be 271bp?

    I know I have to remove the primers, but just trying to understand this.

    Any help would be greatly appreciated.

    Thank you

  • #2
    What are your primer sequences? Are they on each end of the merged 291bp contig? From what you're saying, it sounds to me like 291 - 20 - 20 = 251...

    Relatedly, you really should only have 2x250, not 2x251. The last cycle is used for quality scoring of the previous cycle.


    • #3
      Hi Fanli

      Thank you for your answer

      The primer seqs are as follows:

      forward
      GTGCCAGCCGCCGCGGTAA

      reverse
      GGACTACACGGGTATCTAAT

      They appear to be on each end of the merged contig, but each read is 251bp including the primer (so ~231bp excluding the primer).

      The overlap, according to the report file after merging, indicates that there is 211bp of overlap.

      So the only way I can make sense of this is that each read (forward and reverse) is 211+20+20 = 251bp, and the two reads have assembled into a 291bp contig.

      i.e. 211+20+20 (F) +20+20 (R) = 291bp?

      What do you think?

      Thank you again
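
A minimal sketch of how one might check that the primers sit at the ends of the merged contig. It uses the two primer sequences exactly as posted above, plus the first and last 20 bases of the example contig from the original post. The helper names (`reverse_complement`, `mismatches`) are made up for illustration and are not part of mothur. The reverse primer is compared as its reverse complement, since that is the orientation in which it would appear at the 3' end of the contig.

```python
# Hypothetical helpers for illustration only; not part of mothur.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    return seq.translate(COMPLEMENT)[::-1]


def mismatches(a: str, b: str) -> int:
    """Count positions where two equal-length sequences differ (N counts as a mismatch)."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Primers exactly as posted in the thread.
FWD_PRIMER = "GTGCCAGCCGCCGCGGTAA"
REV_PRIMER = "GGACTACACGGGTATCTAAT"

# First and last 20 bases of the example merged contig from the original post.
contig_head = "GTGCCAGCCGCCGCGGTAAT"
contig_tail = "ATTAGATACCCCGGTAGTCN"

# Forward primer is expected at the start of the contig.
fwd_mm = mismatches(contig_head[:len(FWD_PRIMER)], FWD_PRIMER)

# Reverse primer is expected, reverse-complemented, at the end of the contig.
rev_mm = mismatches(contig_tail[-len(REV_PRIMER):], reverse_complement(REV_PRIMER))

print(f"forward primer length: {len(FWD_PRIMER)}, mismatches: {fwd_mm}")
print(f"reverse primer length: {len(REV_PRIMER)}, mismatches: {rev_mm}")
```

Note that the forward primer as posted is 19 bases rather than 20, and that the reverse primer does not match the contig end exactly. One possible explanation (an assumption, not something confirmed in this thread) is that the primers actually used contain degenerate positions, in which case a mismatch-tolerant comparison like the one above is more useful than an exact string match.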


      • #4
        I think we're in agreement - does the attached diagram help?

        • #5
          That is perfect, thank you so much for explaining Fanli. That is most helpful

          One more question - do the seqs unique to R1 and R2 not merge in this case, or are they merged regardless?

          Thank you again


          • #6
            They are merged as well - the full 291bp sequence from your original post is what you get. Another way to think about this is that you are sequencing a 251bp amplicon with 20 bases on each end unique to R1 or R2 and 231 bases in the middle covered by both.


            • #7
              Thank you Fanli. In this case, though, is it only 211 bases that are covered by both?


              • #8
                Sorry, yes. 211 bases in the middle - math is hard :/
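
The arithmetic discussed in this thread can be sketched in a few lines, assuming both reads have the same length and the overlap is the figure from the make.contigs report. The function name `merged_layout` is hypothetical and only serves the illustration.

```python
def merged_layout(read_len: int, overlap: int) -> dict:
    """Describe a contig built from two equal-length reads that share `overlap` bases.

    Hypothetical helper for illustration; assumes both reads are `read_len` long.
    """
    if not 0 <= overlap <= read_len:
        raise ValueError("overlap must be between 0 and the read length")
    unique_per_read = read_len - overlap  # bases covered by only one read
    return {
        "contig_len": 2 * read_len - overlap,  # overlap is counted once, not twice
        "overlap": overlap,
        "unique_r1": unique_per_read,
        "unique_r2": unique_per_read,
    }


# Values from the thread: 2 x 251bp reads, 211bp overlap.
layout = merged_layout(read_len=251, overlap=211)
print(layout)
```

With these numbers, each read contributes 40 bases that the other read does not cover (the primer plus roughly 20 further bases, going by the figures quoted above), and 40 + 211 + 40 gives the 291bp contig.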


                • #9
                  Thank you again for your very helpful answers
