  • #31
    Originally posted by vebaev
    no, each 2 are different plant, 1 control and 1 treatment, no info of transcriptome size (probably only one of them will be sequenced with ref genome - tomato)

    1) If you want to find differentially expressed genes, you need more than 1 replicate per condition.

    2) If the reference genome is Tomato, which has had its genome sequenced and at least a version one annotation, then you should be able to estimate the transcriptome size. Unless I have misunderstood and this is not Tomato.



    • #32
      Yes, one of the plants is tomato, but the other samples are plants with no reference genome or annotation.
      ------------
      SMART - bioinfo.uni-plovdiv.bg



      • #33
        And what do you plan to do with these other samples? Have you guys given any thought to how you will analyze the data? Will you attempt to construct a transcriptome de novo? Or are these other species closely related to tomato or another sequenced genome, so that you could map to that and obtain reasonable data?

        It doesn't make sense to blindly sequence without thinking the experiment through and knowing what you need up front. It's also very hard for people to give advice when you are very vague about the details of your experiment and end goals.



        • #34
          Hi Y'all,

          this thread seems pretty appropriate for my question:

          I would like to compare rates of synonymous and nonsynonymous substitutions among sequences of orthologous genes from ~8 species of fish in the family Embiotocidae (the closest related sequenced genome would be one of the cichlids) to test for positive Darwinian selection (genome size is ~1 Gb). I am relatively poor and would love to "goldilocks" it on my first run. That is, how many fishes can I multiplex into a single perfect-firmness bed, err, I mean Illumina HiSeq lane, and still get an acceptable amount of coverage to call polymorphisms between orthologs?

          thanks very much!



          • #35
            Unfortunately, calling SNPs requires quite a bit more coverage than the typical gene expression experiment. The reason is that you'll want to put some type of minimum count requirement on the candidate sites, similar to how you would put a minimum read count on genes in a differential expression test. In a DE test we might cut out genes with fewer than 10 hits, or maybe even those with less than 1 hit per million mapped reads (edgeR advice). If you consider potential aligner error, there's possibly a 5 to 10% chance that any read is incorrectly aligned (depending on your aligner), and with the high variance of low count levels you're going to want to apply that minimum of 10 hits at every candidate site. Of course there's sequencing error on top of that and god knows how much other noise in the data. Maybe you can see where I'm going with this: for gene expression, which typically deals with features thousands of bases in length, we might accept 10 hits as a cutoff, which is clearly well below having a lot of coverage of a gene. In my experience 20M reads typically yields ~14,000 testable genes in the mouse genome at such a cutoff.
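
To make the two gene-level cutoffs concrete, here is a minimal Python sketch. It is not edgeR's API; the function names, the toy counts, and the 20M library size are assumptions chosen only to mirror the numbers mentioned above.

```python
def cpm(count, library_size):
    """Counts per million mapped reads for one gene."""
    return count * 1_000_000 / library_size


def filter_by_count(counts, min_count=10):
    """Keep genes with at least `min_count` mapped reads."""
    return sorted(g for g, c in counts.items() if c >= min_count)


def filter_by_cpm(counts, library_size, min_cpm=1.0):
    """Keep genes with at least `min_cpm` hits per million mapped reads."""
    return sorted(g for g, c in counts.items() if cpm(c, library_size) >= min_cpm)


# Hypothetical toy counts; a real table would have thousands of genes.
toy_counts = {"geneA": 500, "geneB": 9, "geneC": 10, "geneD": 0}
library_size = 20_000_000  # assumed total mapped reads (the 20M example)

print(filter_by_count(toy_counts))              # genes passing the 10-hit cutoff
print(filter_by_cpm(toy_counts, library_size))  # genes passing the 1 CPM cutoff
```

With 20M mapped reads, 1 hit per million corresponds to 20 reads, so in this toy case the per-million cutoff is the stricter of the two.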

            For SNPs you want that kind of depth at single-base resolution. So what's going to happen, if you only have 20 million reads or so, is that you're going to limit your results to a small percentage of expressed genes. That's not to say you won't have some results, but they may be limited to more highly expressed features. Since nobody can prove that lowly expressed genes don't matter, that's gotta make you at least a little uncomfortable.
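
A back-of-the-envelope sketch of why a gene-level cutoff is far from per-base depth. It uses the standard mean-coverage estimate (reads × read length / feature length) and assumes uniform coverage, which real RNA-seq data do not have. The 2,000-base transcript and 100-base reads are hypothetical values, not figures from this thread's data.

```python
import math


def mean_depth(n_reads, read_length, transcript_length):
    """Average per-base depth if reads were spread evenly over the transcript."""
    return n_reads * read_length / transcript_length


def reads_for_depth(target_depth, read_length, transcript_length):
    """Reads needed on one transcript to reach `target_depth` on average."""
    return math.ceil(target_depth * transcript_length / read_length)


# Hypothetical transcript: 2,000 bases long, sequenced with 100-base reads.
print(mean_depth(10, 100, 2000))       # depth given by a 10-hit gene
print(reads_for_depth(10, 100, 2000))  # hits needed for 10x average depth
```

Under these assumptions a gene that just passes a 10-hit cutoff averages 0.5x per base, and it would need about 200 hits to average 10x.
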
            /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
            Salk Institute for Biological Studies, La Jolla, CA, USA */



            • #36
              Originally posted by Glongo
              Hi Y'all,

              this thread seems pretty appropriate for my question:

              I would like to compare rates of synonymous and nonsynonymous substitutions among sequences of orthologous genes from ~8 species of fish in the family Embiotocidae (the closest related sequenced genome would be one of the cichlids) to test for positive Darwinian selection (genome size is ~1 Gb). I am relatively poor and would love to "goldilocks" it on my first run. That is, how many fishes can I multiplex into a single perfect-firmness bed, err, I mean Illumina HiSeq lane, and still get an acceptable amount of coverage to call polymorphisms between orthologs?

              thanks very much!

              Goldilocks tried all 3 beds before finding one of a firmness she was satisfied with. And your situation is insanely more complex than Goldilocks's.

              Questions of this sort are nearly impossible to answer with any accuracy. You do understand that any organism/tissue will have vastly variable ranges of expression diversity? Best case, you only need 10 genes for your study and those 10 genes happen to be 90% of the expression in the organism/tissue you happen to choose. I won't go worst case, because that would involve something like you falling into the tank with your fish and being eaten alive.

              Can you get by with 6 samples (including reps)? If so, try that in one lane and see how it goes. Oh, and don't discount the informatics chops it takes to process 1 lane of Illumina sequence (30-50 billion bases of sequence if you choose the 2x100 PE option).
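
For a rough sense of scale, the lane figures above can be turned into read pairs per sample. This is only a sketch: it assumes samples are pooled in exactly equal amounts and that every base in the lane is usable, neither of which holds in practice. The function name is hypothetical.

```python
def read_pairs_per_sample(lane_bases, read_length, n_samples):
    """Read pairs each sample gets from one paired-end lane, split evenly."""
    bases_per_pair = 2 * read_length       # 2x100 PE means 200 bases per pair
    total_pairs = lane_bases // bases_per_pair
    return total_pairs // n_samples


# Lane yield range quoted above: 30-50 billion bases with 2x100 PE.
for n_samples in (6, 8):
    low = read_pairs_per_sample(30_000_000_000, 100, n_samples)
    high = read_pairs_per_sample(50_000_000_000, 100, n_samples)
    print(f"{n_samples} samples: {low:,} to {high:,} read pairs each")
```

Under these assumptions, 6 samples get roughly 25 to 42 million read pairs each, and 8 samples get roughly 19 to 31 million.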

              Shouldn't you do something like RADseq for an experiment of this nature rather than RNAseq?

              --
              Phillip



              • #37
                I was so inspired by your ideas
                "My take is that if you are so deep into a process that you have lost track of what it is you are ultimately trying to achieve, you need to step back and re-think what it is you are doing and why you are doing it."

                Thank you so much!
                --
                huilili
