  • Indels quantifications with MiSeq

    Dear all,

    I am having trouble analysing CRISPR experiments by sequencing. I use PCR to amplify the target sequence of my sgRNA (a standard PCR product of 400-500 bp) and then sequence it paired-end on the MiSeq platform with 250 bp reads. My goals are to:
    - determine the number of reads containing indels at the target site, to infer the percentage of edited sequences in my sample
    - determine the locations of the indels
    - annotate the variant types (synonymous, ...) and determine their frequency in the pool of reads containing indels

    I am having difficulty deciding which tools are best for this. I plan to apply the standard first steps of read processing (filtering on Phred quality, ...) and then to use BWA-MEM or Bowtie2 to align the reads to the PCR amplicon sequence. Are these aligners suitable for such applications?

    My first idea for indel quantification was to process the BAM files with Picard to remove PCR duplicates and make them compatible with GATK HaplotypeCaller. I have found GATK often used for WGS applications, but is it a good tool for detecting indels in PCR product sequencing? In addition, if it is a good choice, should I trim the reads around the expected zone of NHEJ before the analysis (for CRISPR, let's say the target sequence +/- 10 bp), or should I use the whole reads and perform indel realignment before variant calling? If GATK is not appropriate, does anyone know of other, more suitable tools?

    So, as you can see, it is a little confusing for the moment... I hope my questions are clear; if not, do not hesitate to tell me.
    Thank you for your help.

    Nicolas
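
For readers following along, the first two goals above (counting reads with an indel at the target site and locating the indels) can be sketched directly from the alignments, independently of which variant caller is used. This is only a minimal illustration, not a replacement for the tools discussed in this thread. It assumes the reads are already aligned to the amplicon and that the POS and CIGAR fields of each SAM record have been extracted as `(pos, cigar)` pairs; the function names, the `cut_site` coordinate and the default `window` of 10 bp are hypothetical choices made for the example.

```python
import re

# One CIGAR element: a length followed by an operation code (SAM specification).
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_from_cigar(pos, cigar):
    """Return (kind, ref_start, length) for each indel in one alignment.

    pos is the 1-based leftmost reference position (the SAM POS field).
    """
    events = []
    ref = pos
    for length, op in CIGAR_RE.findall(cigar):
        length = int(length)
        if op == "I":
            # An insertion consumes no reference bases; record where it sits.
            events.append(("ins", ref, length))
        elif op == "D":
            events.append(("del", ref, length))
            ref += length
        elif op in "MN=X":
            ref += length
        # S, H and P consume no reference bases, so ref is unchanged.
    return events


def edited_fraction(alignments, cut_site, window=10):
    """Fraction of reads with at least one indel within `window` bp of cut_site."""
    edited = 0
    total = 0
    for pos, cigar in alignments:
        total += 1
        for kind, start, length in indels_from_cigar(pos, cigar):
            end = start + length - 1 if kind == "del" else start
            if start <= cut_site + window and end >= cut_site - window:
                edited += 1
                break
    return edited / total if total else 0.0


# Toy input: four reads aligned to a hypothetical amplicon, cut site at position 100.
reads = [(1, "250M"), (1, "100M3D147M"), (1, "98M2I150M"), (1, "20M1D229M")]
fraction = edited_fraction(reads, cut_site=100)
```

Note that with paired-end data each mate is counted as a separate read here, so overlapping pairs would need to be merged or counted once per fragment before the fraction is interpreted as an editing rate.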

  • #2
    You may also wish to try out pindel and cortex. I've found both of them to be useful in addition to GATK. Unfortunately I do not have a 'single step' answer.

    • #3
      Dear Westerman,

      Thank you for your answer. I suspected that the answer would not be easy...
      If I understand correctly how pindel and cortex work, they do the same kind of analysis as HaplotypeCaller. However, I found pindel more efficient than GATK for longer indels. So do you think I should run the analyses in parallel and then "merge" the results, or just use pindel and/or cortex as a second confirmation of the results obtained with GATK? Also, I had difficulty finding confirmation that these tools are compatible with indel analysis on amplicon sequencing. Could you confirm whether this is the case?

      Thank you very much for your help.
      Nicolas

      • #4
        I would merge the three of them together. I say this because, at least for the dataset I was recently processing, I could manually look at the alignments (via IGV) and see places where one or the other missed an InDel.
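
As a rough sketch of what such a merge could look like: take the union of the calls and keep track of which caller reported each one, so that calls supported by a single tool can be checked by eye in IGV. The `merge_calls` function and the `(pos, ref, alt)` tuple layout are assumptions made for this example, not the output format of GATK, pindel or cortex. It also assumes the calls have already been normalized (for example left-aligned), since different callers can represent the same indel differently.

```python
from collections import defaultdict


def merge_calls(calls_by_caller):
    """Union indel calls across callers, recording who reported each one.

    calls_by_caller maps a caller name to an iterable of (pos, ref, alt) tuples.
    Returns a dict from each call to the sorted list of supporting callers.
    """
    support = defaultdict(set)
    for caller, calls in calls_by_caller.items():
        for call in calls:
            support[call].add(caller)
    return {call: sorted(names) for call, names in sorted(support.items())}


# Toy input with made-up calls.
calls = {
    "gatk": [(101, "ACGT", "A")],
    "pindel": [(101, "ACGT", "A"), (95, "T", "TGG")],
    "cortex": [],
}
merged = merge_calls(calls)
# Calls reported by only one caller are the ones worth inspecting manually.
single_caller = [call for call, names in merged.items() if len(names) == 1]
```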

        • #5
          Thank you for your help. I will try it this way and come back if I have problems.
          Nicolas

          • #6
            If you are interested in longer indels, I suggest you map with BBMap and not do indel realignment before variant calling. Bowtie2 and bwa-mem will only find short indels.

            • #7
              Originally posted by Brian Bushnell
              If you are interested in longer indels, I suggest you map with BBMap and not do indel realignment before variant calling. Bowtie2 and bwa-mem will only find short indels.
              While I believe that bowtie2/bwa cannot find long indels by themselves, it is my understanding that programs such as cortex and pindel can find long indels by looking at the bowtie/bwa mappings for indel breakpoints.

              That said, I should try BBMap on my dataset. It is a very good program that I should use more often.

              • #8
                I found in publications that indels created by the CRISPR system are often small and centered on the cleavage site (the majority are less than 10 bp). Are indels of that size too large to be tolerated by Bowtie 2 or BWA-MEM? In particular, I found that the thresholds for the number of indels, mismatches and gaps can be set manually in Bowtie 2.
                I will first try Bowtie 2 (to get a first idea of the results) and use IGV to look manually at the indel sizes. I will then compare with BBMap to check whether the results differ between the two.
                Thank you for your comments!
                Nicolas
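
One way to make the Bowtie 2 versus BBMap comparison concrete is to tabulate the indel sizes each aligner reports and see how many exceed the roughly 10 bp range mentioned above. The sketch below assumes the indels have already been extracted from each alignment as `(kind, length)` pairs; the `size_summary` function, the `short_max` threshold and the toy values are all hypothetical.

```python
from collections import Counter


def size_summary(indels, short_max=10):
    """Tabulate indel sizes and the fraction longer than short_max bp.

    indels is an iterable of (kind, length) pairs, with kind "ins" or "del".
    """
    sizes = Counter((kind, length) for kind, length in indels)
    total = sum(sizes.values())
    long_count = sum(n for (kind, length), n in sizes.items() if length > short_max)
    return sizes, (long_count / total if total else 0.0)


# Toy input: three short indels and one long deletion.
observed = [("del", 3), ("del", 3), ("ins", 1), ("del", 120)]
sizes, long_fraction = size_summary(observed)
```

Running the same summary on the output of each aligner and comparing the two tables would show whether one of them is systematically missing the longer events.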

                • #9
                  Perhaps publications describing the length of indels created by CRISPR only describe indels under 10bp because they are detecting them using tools that can only find indels under 10bp. I am not an expert on CRISPR, but when analyzing bacterial RNA-seq data, I found numerous deletions in the several hundred bp range within or adjacent to CRISPR-related genes. They were kind of interesting in that they clearly clustered together, but the boundaries often did not perfectly agree, and I don't know exactly what they were. I don't know if that's relevant to what you're studying, though.

                  • #10
                    Indeed, this could be an explanation. In our case, we are editing the mammalian genome with an adapted CRISPR system (targeting a specific gene). So there are no real CRISPR-related genes as observed in bacteria, where the repeated spacers are present. However, large indels at target sites have been reported previously, though only in rare cases. So I will do both analyses to determine the relative abundance of large and short indels at our target site.
                    Thank you very much for your advice.
                    Nicolas
