  • Why does RAD read depth vary so extremely?

    I have a RAD data set and have assessed it. The depth of the RAD fragments varies enormously.

    Reads with depth 1 (counting identical reads together) make up 40%,
    but other reads have depths anywhere from 2 to 100,000.

    I cannot imagine that a single read has 100,000-fold depth. Is it a repeat in the genome? And are there really so many repeats in the genome?
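
The depth distribution described in the question (how many distinct read sequences occur once, twice, and so on) can be tabulated with a short script. This is a minimal sketch, assuming uncompressed four-line FASTQ records and that "depth" means the number of exactly identical reads; the function names are illustrative and do not come from any RAD toolkit.

```python
from collections import Counter


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd of every 4 lines) of each FASTQ record."""
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:
            yield line.strip()


def depth_histogram(sequences):
    """Return a Counter mapping depth -> number of distinct sequences at that depth."""
    per_sequence = Counter(sequences)
    return Counter(per_sequence.values())


def fraction_of_reads_at_depth(sequences, depth=1):
    """Return the fraction of all reads whose sequence occurs exactly `depth` times."""
    per_sequence = Counter(sequences)
    total = sum(per_sequence.values())
    at_depth = sum(n for n in per_sequence.values() if n == depth)
    return at_depth / total if total else 0.0


if __name__ == "__main__":
    # Toy records standing in for a real FASTQ file (hypothetical data).
    toy = ["@r1", "ACGT", "+", "IIII",
           "@r2", "ACGT", "+", "IIII",
           "@r3", "TTTT", "+", "IIII"]
    seqs = list(read_sequences(toy))
    print(depth_histogram(seqs))
    print(fraction_of_reads_at_depth(seqs, depth=1))
```

For a real file, the lines of an opened FASTQ file can be passed to `read_sequences` in place of the toy list. Sequencing errors turn reads from a genuine locus into new depth-1 sequences, so a large depth-1 class does not by itself mean that many loci are poorly covered.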

  • #2
    Hi Litc,

    Yes, RAD fragments can have extremely high coverage; this is more or less expected. Please check the RAD-related publications to understand the methodology.

    Best regards,
    Douglas



    • #3
      Yes, RAD data varies greatly in read depth, above and beyond what is seen in genomic libraries. There are two main causes.

      Firstly, the high coverage usually arises because duplicates and repeats create very large stacks. Because RAD loci are anchored to particular positions, and the loci cover far fewer positions than is typical for a genomic DNA library, these depths can be much higher than for genomic libraries, where coverage is smeared over many consecutive positions.

      Secondly, the low coverage arises because there is a strong correlation between restriction fragment length and read depth for restriction fragments below 10 Kb in size: the shorter the restriction fragment, the lower the coverage. We believe this is due to incomplete shearing. Short fragments are not sheared as efficiently as long ones, so many fragments in the range 1 Kb - 10 Kb are not cut and are filtered out when you size select Illumina-length fragments (300-700 bp or so). This makes it difficult to separate real RAD loci from sequencing error at the low end.

      We have a paper in revision with Molecular Ecology right now discussing this issue; I hope it will be out within the next few months.

      John Davey



      • #4
        Hi John,

        Thank you for elaborating on this. Please do inform us when your paper is out.

        Best regards,
        Douglas



        • #5
          Thanks DZhang, and thanks also to johnomics for the explanation. The explanation is good, but I doubt that repeats in the genome are the only reason for the very high depth of some RAD loci. Some RAD reads are heavily over-represented, with a depth of 100,000. Does that mean that RAD_A (say, with a depth of 100,000) is repeated 20,000 times more often than RAD_B (with a depth of 5)? Could something have gone wrong in amplification, or in another step of RAD library preparation, so that some fragments are amplified preferentially while others are not?



          • #6
            There could be amplification bias; RAD suffers from the same PCR bias that normal genomic libraries suffer from (see the Mike Quail & Sanger Institute papers for more on this), so it will tend to amplify GC-rich regions more often than AT-rich ones. But which genome are you working with? It's quite possible for a repeat to occur tens of thousands of times; there are hundreds of thousands of LINEs and SINEs in the human genome, for example.



            • #7
              Thanks johnomics. I'm working with peanut. It may be that both amplification bias and repeats influence the depth of the RAD reads.



              • #8
                The Molecular Ecology paper mentioned above is now out in Early View, open access:

                Davey JW, Cezard T, Fuentes-Utrilla P, Eland C, Gharbi K, Blaxter ML (2012). Special features of RAD Sequencing data: implications for genotyping. Molecular Ecology, Genotyping by Sequencing special issue, Early View, doi:10.1111/mec.12084



                • #9
                  Does double digest RAD (ddRAD) not remedy the uneven coverage issue?



                  • #10
                    It should do, but we haven't done any direct tests. Would be very happy to hear from others with ddRAD data.



                    • #11
                      We should have a data set within a couple of weeks.



                      • #12
                        Hi all,

                        We are inspired by the Peterson et al. paper on ddRAD and are going to try out their protocol. Does anyone have experience with this? I guess you have, JackieBadger? Do you have any data yet? How many samples did you pool? And did you use the HiSeq or the MiSeq?

                        Best,
                        Hanne



                        • #13
                          We have done a few runs of ddRAD, but are still fine-tuning.
                          The number of samples depends on your genome size, the number of expected restriction enzyme (RE) cuts and the required depth. This is something you really have to play with and estimate on a case-by-case basis. We are using the MiSeq; in the last run we pooled ~40 individuals with human-sized genomes in one run ... equivalent to one Illumina lane ... I think this is too many samples.
                          When I have more concrete info I will update.
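
The dependence on genome size, number of cuts and required depth can be put into a back-of-the-envelope calculation. The following is a minimal sketch under stated assumptions: cut sites are estimated as genome size divided by 4^k for a k-base recognition site (which assumes equal, independent base frequencies), reads are spread evenly over samples and loci (which the rest of this thread shows is not true in practice), and all function names and example numbers are hypothetical.

```python
def expected_cut_sites(genome_size_bp, site_length):
    """Rough expected number of recognition sites, assuming uniform random bases."""
    return genome_size_bp / 4 ** site_length


def mean_depth_per_locus(total_reads, n_samples, n_loci):
    """Mean reads per locus per sample if reads were spread perfectly evenly."""
    return total_reads / (n_samples * n_loci)


def max_samples(total_reads, n_loci, target_depth):
    """Largest number of samples that still reaches target_depth on average."""
    return int(total_reads // (n_loci * target_depth))


if __name__ == "__main__":
    # Hypothetical inputs for illustration only, not taken from the thread.
    genome_size = 3_000_000_000   # a human-sized genome
    sites = expected_cut_sites(genome_size, site_length=8)
    # For ddRAD the number of loci kept depends on both enzymes and on the
    # size-selection window, so n_loci is left as a free parameter here.
    n_loci = 20_000
    total_reads = 15_000_000
    print(sites)
    print(mean_depth_per_locus(total_reads, n_samples=40, n_loci=n_loci))
    print(max_samples(total_reads, n_loci=n_loci, target_depth=30))
```

Because real coverage is very uneven, an estimate like this is best treated as an upper bound on the number of samples rather than a target.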



                          • #14
                            Thanks

                            I have a lot of questions regarding the protocol:
                            1. What software did you use to perform in silico digestions?
                            2. In the ligation molarity calculator, row 3, it looks like they only use 50 ng (0.05 µg) of DNA in the reaction. Is this correct?
                            3. I’m confused by the procedure described for Dynabeads® M-270 Streptavidin and talked to the manufacturer (DYNAL). They told me to use at least 100 µl of the beads. How much did you use?
                            4. They recommend step 2.1.4, “Immobilization of Nucleic Acids”, with 3 washes and resuspension of the beads in elution buffer. 100 µl (?) of TE buffer? Or water?
                            5. I have learned that the Streptavidin-Coupled Dynabeads® can be used directly in a PCR reaction. Is this what you did?
                            6. DYNAL also told me that it is possible to release immobilized biotinylated molecules: “Separation of two DNA strands can be done by either alkali or temperature treatment. Using alkali, you elute off the non-biotinylated strand with 0.1M NaOH.” Have you tried this?
                            7. Can you please share the PCR program you used? The PCR primers have an annealing temperature of 72 °C…
                            8. In your protocol it says: For each Pippin Prep elution or gel extraction, set up 4-8 PCR reactions in 20 µl total volume: For each PCR, combine ~20 ng of size-selected sample. Do you measure this, or just split the sample, with or without beads, into 4-8 reactions?
                            9. The Norwegian Sequencing Center (http://www.sequencing.uio.no/) informed us that monotonous motifs (i.e. AATTC at the start of every read across the entire flow cell) can be difficult for the MiSeq to handle, so in that case the run parameters would have to be adjusted slightly to avoid a failure to detect this motif (AATTC). However, one can get round this by blending the library with PhiX to 50% and lowering the cluster density to 50% of normal. The result, of course, is that we only get 25% of the possible output of the MiSeq… What did you do?

                            Sorry for all my questions...
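
On question 1, an in silico double digest is simple enough to sketch directly. This is a minimal illustration, not the software used by anyone in this thread. It assumes complete digestion, an uppercase reference sequence, and palindromic recognition sites (so only one strand needs scanning). EcoRI (G^AATTC) is used because it matches the AATTC read start mentioned in question 9; MspI (C^CGG) as the second enzyme and the size window are assumptions chosen for illustration.

```python
import re


def cut_positions(sequence, site, cut_offset):
    """Return top-strand cut positions for every occurrence of `site`.

    A lookahead is used so that overlapping occurrences are also found.
    """
    pattern = f"(?={re.escape(site)})"
    return [m.start() + cut_offset for m in re.finditer(pattern, sequence)]


def double_digest(sequence, enzyme_a, enzyme_b, min_len, max_len):
    """Return (start, end) for fragments flanked by one cut from each enzyme.

    Each enzyme is a (site, cut_offset) tuple. Only fragments between
    adjacent cuts are considered, and only those whose length falls in the
    size-selection window [min_len, max_len] are kept.
    """
    cuts = sorted(
        [(p, "A") for p in cut_positions(sequence, *enzyme_a)]
        + [(p, "B") for p in cut_positions(sequence, *enzyme_b)]
    )
    fragments = []
    for (start, left), (end, right) in zip(cuts, cuts[1:]):
        if left != right and min_len <= end - start <= max_len:
            fragments.append((start, end))
    return fragments


if __name__ == "__main__":
    ECORI = ("GAATTC", 1)  # cuts G^AATTC
    MSPI = ("CCGG", 1)     # cuts C^CGG (assumed second enzyme)
    toy = "AAAGAATTC" + "T" * 10 + "CCGG" + "AAA"  # hypothetical sequence
    print(double_digest(toy, ECORI, MSPI, min_len=10, max_len=20))
```

Running this over each chromosome or scaffold of a reference genome and counting the returned fragments gives a rough number of expected ddRAD loci for a given enzyme pair and size window.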



                            • #15
                              Sorry, I can't answer any of your questions. Our lab tech does all of the library prep; I sit at a computer all day.
                              As for the MiSeq having issues with low-complexity libraries: this shouldn't be a problem for RAD work because low complexity isn't an issue. Just make sure you choose your barcodes wisely so that cluster generation is not hindered.
