
  • Aligning/Mapping Illumina reads to reference in Geneious problem

    I am about to give up on this because Geneious just doesn't seem to give me what I am looking for, and all the other software seems to have a steep learning curve.

    I have 4 samples. Each sample has a varying amount of DNA from a pure culture of a known bacterium. These samples were run on a MiSeq and I have all my read data. Here is the problem. When I map the paired-end reads to the 16S rRNA gene of, say, a Vibrio, I get >99% of the sequences reported as Vibrio, even though I know the Vibrio DNA made up only 1/3 of the pooled sample. When I do the same map-to-reference against another bacterium I get similar results: Geneious will say >99% of the sequences are Pseudomonas, even though only 1/3 of the sample DNA came from that genus. How can this be possible?

    I am using medium sensitivity. Also, just as a test, I aligned the 16S rRNA genes of the two bacteria and they showed only about 78% similarity. What is going on here? Should I use other parameters or maybe switch to another program?

  • #2
    How similar are these species? How much variation do you anticipate in the 16S region? Standard alignment tools may not be the best approach for this type of project.



    • #3
      Well, based on just the 16S rRNA gene, approximately 78%, which is what I expected, but that is just a small part of their genomes. Do you have any other suggestions in terms of tools?



      • #4
        Yeah, you shouldn't be seeing that. Vibrio and Pseudomonas are from different orders (Vibrionales and Pseudomonadales, both in the Gammaproteobacteria).

        If you paid for Geneious, I would definitely use their support. I did the free trial and it was good, but I couldn't justify the price; I would rather spend the money on other things, like sequencing.

        If you want a user-friendly free tool, you might try Galaxy.



        • #5
          Did you sequence total DNA or just 16S? The fact that >99% of your reads map to a 16S reference would suggest the latter. In that case, which region of the gene did you sequence? Also, since 16S is highly conserved, is it really that surprising that basically all your reads map to some bacterial 16S reference, especially considering that you're using "medium sensitivity"? What does that setting actually mean? How similar does a read have to be to the reference in order to map with the "medium sensitivity" setting? 50%? 75%?
          savetherhino.org



          • #6
            For starters, it looks like you're doing amplicon sequencing and not whole genome. If that's the case, then it makes perfect sense that a read aligner would map all of your reads to any given 16S, because the 16S gene is so highly conserved between species compared to protein-coding genes. Geneious especially tries really hard to map as many reads as possible to the reference, because it assumes that they belong there, and if you're just taking blanket values at face value then you're going to see results like this. There are advanced options in Geneious that you can use to change this behaviour, but that still isn't what you probably should be doing.

            So, what should you do? As I said, if you really are doing amplicon sequencing, then use a package such as QIIME or mothur for your analyses. First, they're designed for 16S amplicons, and second, what you were doing would be immediately rejected by reviewers if you tried to publish it (or at least I'd reject it based on your description so far).

            Also, bioinformatics is very difficult, and even more so to do properly. My best advice would be to read the literature to see how other people are doing what you want to do, and then delve into the manuals/tutorials for those software packages so you know what they do, how they do it, and why you get the results that you do. If you don't put forth that effort, not only will you not be able to tell if you're getting the correct results, but you'll never be able to figure out on your own what might have gone wrong.
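
            To make the point about per-reference mapping concrete, here is a toy sketch in Python. It is only an illustration under stated assumptions: the two 40-nt "references" are made up and set to be 80% identical (close to, but not the same as, the 78% mentioned in the thread), the reads are error-free 20-mers taken half from each reference, matching is ungapped and forward-strand only, and the 0.75 cutoff is a hypothetical stand-in for a permissive sensitivity setting. It does not reproduce what Geneious, QIIME, or mothur do internally.

```python
# Toy illustration only: ungapped, forward-strand matching on made-up sequences.
# All names, sequences, and thresholds here are hypothetical.

def best_identity(read, ref):
    """Highest fraction of matching bases over all ungapped placements of read on ref."""
    n = len(read)
    return max(
        sum(a == b for a, b in zip(read, ref[i:i + n])) / n
        for i in range(len(ref) - n + 1)
    )

def mapped_fraction(reads, ref, min_identity):
    """Fraction of reads that 'map' to a single reference at the given cutoff."""
    hits = sum(best_identity(r, ref) >= min_identity for r in reads)
    return hits / len(reads)

def assign_competitively(reads, refs, min_identity):
    """Give each read to its single best-scoring reference, or leave it unassigned."""
    counts = {name: 0 for name in refs}
    counts["unassigned"] = 0
    for r in reads:
        scores = {name: best_identity(r, seq) for name, seq in refs.items()}
        best = max(scores, key=scores.get)
        if scores[best] >= min_identity:
            counts[best] += 1
        else:
            counts["unassigned"] += 1
    return counts

# Made-up 40-nt reference A, and reference B with every 5th base changed (80% identity).
REF_A = "ACGGTCATTG" "CAAGCTTAGG" "CCATACGTTG" "ACCTAGGATC"
SWAP = {"A": "C", "C": "G", "G": "T", "T": "A"}
REF_B = "".join(SWAP[b] if i % 5 == 4 else b for i, b in enumerate(REF_A))

# Five error-free 20-nt reads from each reference, so the true mix is 50/50.
READS = [ref[i:i + 20] for ref in (REF_A, REF_B) for i in range(0, 21, 5)]

# Mapping to one reference at a time with a permissive cutoff.
print("mapped to A alone:", mapped_fraction(READS, REF_A, 0.75))
print("mapped to B alone:", mapped_fraction(READS, REF_B, 0.75))

# Mapping against both references at once and keeping only the best hit.
print("best-hit counts:", assign_competitively(READS, {"ref_a": REF_A, "ref_b": REF_B}, 0.75))
```

            With this toy data, each single-reference run reports every read as mapped, even though only half the reads came from that reference, while the best-hit assignment against both references splits the reads 5 and 5. That is the same pattern as the >99% Vibrio and >99% Pseudomonas results: the percentage mapped to one reference at a permissive cutoff says little about composition.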
