  • telling true variants from sequencing errors using the ion torrent technology

    I am trying to detect rare variants in pooled mutated samples. I have barcoded my pools and am analyzing 3 genes in each of 28 pools. Since I am doing this as a test with diluted samples bearing a known mutation, I can confirm whether the technique is working. I can clearly see the expected mutations (at around 2% frequency), but I also see a bunch of sequencing errors, mainly at the ends of the reads, which show frequencies close to that of the real mutation. I have been struggling for the last couple of days with SAMtools, CRISP and even the Ion Torrent variant caller plugin without success (this last one only calls alleles with a minimum frequency of 5%, and rare mutations are lower than that). I know there are other biologist-friendly tools out there, but they are usually expensive, so I am also on the lookout for something free and efficient.

    I would mainly like to find a way to tell those sequencing errors from the real variants. Any advice?

  • #2
    Not ion torrent specific

    This isn't an ion torrent specific problem. As the frequency of the mutation you're looking for falls near and below the sequencing error rate, things get hard. I think you need to consider an approach like this:



    where you introduce random bases into your primers such that you can tell which sequencing reads came from which first-round copies. Basically, you generate a first set of molecules from the template with a random sequence on the end, then amplify and sequence those. Now, rather than 10k reads from your amplicon, you have 100 reads from each of 100 original molecules. The 99 original molecules without the mutation produce mostly WT bases. The 1 original molecule with the mutation produces mostly reads with the variant base.

    You may also see this technique referred to as dogtagging.
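
The grouping step described above can be sketched in a few lines of Python. This is only a minimal illustration, assuming the random tag has already been extracted from each read and that all reads start at the same amplicon position. The function names (`consensus_by_tag`, `variant_fraction`) and the thresholds (`min_family_size`, `min_agreement`) are made up for this example and do not come from any particular tool.

```python
from collections import Counter, defaultdict

def consensus_by_tag(reads, position, min_family_size=3, min_agreement=0.8):
    """Call one consensus base per random tag at a given read position.

    `reads` is an iterable of (tag, sequence) pairs. The thresholds are
    hypothetical defaults chosen for illustration, not recommended values.
    Returns {tag: base or None}; None means the family was too small or
    its reads disagreed too much to make a call.
    """
    families = defaultdict(list)
    for tag, seq in reads:
        if position < len(seq):
            families[tag].append(seq[position])

    calls = {}
    for tag, bases in families.items():
        if len(bases) < min_family_size:
            calls[tag] = None
            continue
        base, count = Counter(bases).most_common(1)[0]
        # A sequencing error shows up in a minority of a family's reads,
        # so it is outvoted; a real variant is shared by most of them.
        calls[tag] = base if count / len(bases) >= min_agreement else None
    return calls

def variant_fraction(calls, ref_base):
    """Fraction of confidently called original molecules that differ from ref_base."""
    called = [b for b in calls.values() if b is not None]
    if not called:
        return 0.0
    return sum(b != ref_base for b in called) / len(called)

if __name__ == "__main__":
    # Toy data: (tag, read) pairs; position 2 is 'G' in the wild type.
    toy_reads = (
        [("AAAC", "ACGT")] * 5 + [("AAAC", "ACTT")]  # WT molecule, one erroneous read
        + [("GGTA", "ACTT")] * 4                      # molecule carrying the variant
        + [("CCCT", "ACGT")] * 2                      # family too small to call
    )
    calls = consensus_by_tag(toy_reads, position=2)
    print(calls, variant_fraction(calls, ref_base="G"))
```

The point of counting molecules instead of reads is that a single erroneous read inside a wild-type family no longer contributes to the variant frequency.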


    • #3
      thanks I will look into this


      • #4
        A couple suggestions that might help:

        First - depending on your application, you may also want to look into a viral quasispecies analysis tool such as ShoRAH or QuRe - the viral quasispecies problem is somewhat similar to rare variant calling.

        Also - I may be wrong, but I believe the Ion Torrent Variant Caller tool is tuned for human SNP calling. In my own experience it is not the best tool for use with microbial or viral sequences. I generally don't use it.

        You may want to aggressively pre-filter your FASTQ from the torrent server before read mapping. Filter your reads to only Q20 or higher. Remove all duplicate reads (I'm assuming this is PCR amplicon sequencing data?). Then trim the ends of all reads by 3-5 bases and discard any read that is 10-20 nt or shorter in length. You will end up throwing out a fair amount of data, but your mapping and variant calling will improve. Then use tmap3 or some other ion torrent friendly read-mapper. We use CLCbio's software for this - but there are some open source solutions out there as well.
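
For anyone who wants to try the pre-filtering steps above without commercial software, here is a rough pure-Python sketch. It rests on several assumptions that are not in the post: "Q20 or higher" is read as mean read quality, qualities are Phred+33 encoded, duplicates are reads with identical sequence, and both ends are trimmed. The names `parse_fastq`, `mean_phred` and `prefilter`, and the default values, are illustrative only.

```python
def parse_fastq(lines):
    """Yield (header, sequence, quality) from an iterable of FASTQ lines.

    Assumes plain 4-line records (no wrapped sequences).
    """
    it = iter(lines)
    for header in it:
        header = header.rstrip("\n")
        if not header:
            continue  # skip stray blank lines between records
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line
        qual = next(it).rstrip("\n")
        yield header, seq, qual

def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 is an assumption)."""
    return sum(ord(c) - offset for c in qual) / len(qual)

def prefilter(records, min_mean_q=20, trim=5, min_len=20):
    """Quality-filter, de-duplicate, end-trim and length-filter reads.

    Steps follow the order given in the post; the parameter defaults are
    hypothetical picks from the ranges mentioned there.
    """
    seen = set()
    for header, seq, qual in records:
        if not seq or mean_phred(qual) < min_mean_q:
            continue  # drop empty or low-quality reads
        if seq in seen:
            continue  # drop exact duplicate sequences
        seen.add(seq)
        # Trim `trim` bases from both ends of the read and its qualities.
        seq = seq[trim:len(seq) - trim]
        qual = qual[trim:len(qual) - trim]
        if len(seq) <= min_len:
            continue  # drop reads that are min_len nt or shorter
        yield header, seq, qual

if __name__ == "__main__":
    import sys
    # Usage: python prefilter.py < reads.fastq > filtered.fastq
    for header, seq, qual in prefilter(parse_fastq(sys.stdin)):
        print(header, seq, "+", qual, sep="\n")
```

Filtering on mean quality is the simplest reading of the advice. A stricter variant would require every base to be Q20 or higher, which discards more reads.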

        hope that helps.
        @bioinformer
        http://www.linkedin.com/in/jonathanjacobs


        • #5
          Thanks Jonathan,

          The quasispecies tools might not apply to this study, but it is still good to know about them since I will be working with retrotransposons in another area.

          I think you are right about the quality filtering. I also use CLC bio, and it seems that with the new quality-based variant caller I am able to distinguish the mutations I was expecting.

          thanks

          Leo.
