
  • Looking for a few NGS-ers willing to share a bad experience about NGS data analysis

    Hi, everyone! I'm looking for a few people who would be willing to share a bad experience regarding NGS data analysis. Any takers?

    Thanks!
    -Carlton

  • #2
    I'll bite, one of the less potentially embarrassing ones:

    We've had several successful little projects doing de novo assembly of phage genomes with 454, but in one case all we got was host contaminant and what looked like human mitochondria. Moral: do more QC on the sample before sequencing. Otherwise you can waste your sequencing money & some analysis time.

    Semi-anonymous user names may discourage posts though. I'm sure people here could share horror stories of colleagues coming to them with "We've just done some sequencing, could you assemble it for us please" with no idea of the scale of the problem nor how much analysis time they should have budgeted for. Probably the best warnings would be saved for off the record conversations at the pub/bar at conferences!

    • #3
      Another one for you (not first hand): We updated tool X and repeated the analysis and now all the results have changed almost beyond recognition. I can think of some threads here along those lines discussing differential gene expression from RNA-Seq data.

      e.g. http://seqanswers.com/forums/showthread.php?t=15896

      Edit: To make my point more explicit (thanks Simon), the point is you should be diligent in your record keeping (electronic lab book or whatever works for you) and include the version number of key packages and datasets/databases since this can sometimes make a surprising difference to the results. This goes beyond high throughput sequencing, and applies to Bioinformatics as a whole.
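As a rough illustration of the record-keeping advice above, here is a minimal Python sketch that snapshots the interpreter version and the installed versions of a list of packages, so the snapshot can be saved next to the results. The function name `record_versions` and the package list are hypothetical; non-Python tools and databases would need their versions recorded some other way.

```python
import json
import sys
from importlib import metadata


def record_versions(packages):
    """Return the Python version and the installed version of each named package."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Record the absence explicitly rather than silently skipping it.
            versions[name] = "not installed"
    return {"python": sys.version.split()[0], "packages": versions}


if __name__ == "__main__":
    # Hypothetical package list; replace with what your analysis actually uses.
    snapshot = record_versions(["numpy", "biopython"])
    print(json.dumps(snapshot, indent=2))
```

Writing the JSON output into the same directory as the analysis results is one simple way to keep the two together.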
      Last edited by maubp; 12-08-2011, 02:32 AM.

      • #4
        My favorite one of all time:
        High-throughput sequencing technologies promise to transform the fields of genetics and comparative biology by delivering tens of thousands of genomes in the near future. Although it is feasible to construct de novo genome assemblies in a few months, there has been relatively little attention to wha …


        Check out supplementary table 1

        • #5
          Originally posted by maubp
          Another one for you (not first hand): We updated tool X and repeated the analysis and now all the results have changed almost beyond recognition. I can think of some threads here along those lines discussing differential gene expression from RNA-Seq data.
          To try to make a wider point - this is why we advocate getting our users to visualise and explore their data. Running a tool, however good it may be, tends to make people too trusting of the results it produces. If you can actually view those results in a number of different ways then you get a much better feel for how much confidence you can have in the hits you see.

          For example - you might find that changing an analysis threshold by a small amount can hugely change the number of hits you get, but if you can see a scatterplot of your data with the threshold you're using on the edge of a huge cloud of points then you can see exactly why this happens.
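To make the threshold example concrete, here is a small Python sketch with made-up scores: a dense cloud of values packed around 2.0 plus a few clear outliers. The data, the thresholds, and the function name `count_hits` are all invented for illustration.

```python
def count_hits(scores, threshold):
    """Count how many scores are at or above the threshold."""
    return sum(1 for s in scores if s >= threshold)


# Made-up data: 200 scores packed between 1.9 and 2.1, plus 5 clear outliers.
cloud = [1.9 + 0.001 * i for i in range(200)]
outliers = [5.0] * 5
scores = cloud + outliers

# Two thresholds that differ by only 0.1, both inside the cloud.
for threshold in (1.9505, 2.0505):
    print(threshold, count_hits(scores, threshold))
```

With these invented numbers the hit count falls from 154 to 54 for a shift of 0.1 in the threshold, while the 5 outliers are called either way. A scatterplot of the same scores would show the threshold cutting through the middle of the cloud.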

          • #6
            Originally posted by simonandrews
            For example - you might find that changing an analysis threshold by a small amount can hugely change the number of hits you get, but if you can see a scatterplot of your data with the threshold you're using on the edge of a huge cloud of points then you can see exactly why this happens.
            Excellent advice. Another related point is to avoid pre-determined e-value thresholds, because e-values change radically with things like dataset size (e.g. for BLAST matches, whereas the bitscore is stable). i.e. an e-value that discriminates well on one dataset can be quite inappropriate on another.
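A quick Python sketch of why this happens, assuming the standard relation between bit score and e-value, E = m * n * 2^(-S'), where S' is the bit score, m the query length and n the database length. Real BLAST uses effective lengths, so treat the numbers as illustrative only; the function name and the sizes below are invented.

```python
def expected_value(bit_score, query_length, database_length):
    """Approximate e-value for a bit score: E = m * n * 2**(-S')."""
    return query_length * database_length * 2.0 ** (-bit_score)


# Same hit (bit score 40, 300-residue query) against two hypothetical databases.
small_db = expected_value(40, 300, 10**6)
large_db = expected_value(40, 300, 10**9)

print(f"small database: E = {small_db:.2e}")
print(f"large database: E = {large_db:.2e}")
```

Under these assumptions the same alignment passes a fixed cutoff of 1e-3 against the small database and fails it against the large one, even though its bit score is identical in both cases.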

            • #7
              Originally posted by genericforms
              My favorite one of all time:
              High-throughput sequencing technologies promise to transform the fields of genetics and comparative biology by delivering tens of thousands of genomes in the near future. Although it is feasible to construct de novo genome assemblies in a few months, there has been relatively little attention to wha …


              Check out supplementary table 1
              Buccal swab?

              • #8
                Originally posted by polyatail
                Buccal swab?
                LOL! Must have been!

                • #9
                  Originally posted by maubp
                  Excellent advice. Another related point is to avoid pre-determined e-value thresholds, because e-values change radically with things like dataset size (e.g. for BLAST matches, whereas the bitscore is stable). i.e. an e-value that discriminates well on one dataset can be quite inappropriate on another.
                  As if to prove a point, I saw this tweet this morning.
