
  • Differential transcript expression between different varieties of the same species

    Dear all,

    We have sequenced the transcriptomes of different varieties of the same species (which lacks complete genome information) using different read types, and obtained contigs with a respectable N50 for each variety.
    Now we want to use the reads to perform a differential transcript expression analysis, and I can think of two possible strategies.

    1) Map each variety's reads to that variety's own contigs, and then group the contigs across varieties via some homology procedure. Then perform the analysis by comparing the counts within these groups.
    Issue 1: the same transcript is sometimes fragmented to different degrees across the assemblies (e.g. three fragments in one, the full sequence in another), making a direct 1-to-1 comparison inappropriate.
    Issue 2: robust orthology assignment methods (OrthoMCL, InParanoid etc.) are tuned for, and as far as I know work principally on, proteins.
    Issue 3: all differential expression tools that I know of (e.g. edgeR) assume that the contigs receiving the read counts have identical lengths across samples.

    2) An alternative is to build a single joint assembly from the reads of all varieties, and then map each variety's reads separately to these contigs, thereby avoiding all of the previous issues. However, it feels like a dirty solution, and the joint assembly is very fragmented compared to the variety-specific ones.
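For strategy 1, one way to sidestep the fragmentation problem (Issue 1) is to sum read counts over all contigs in a homology group before comparing varieties. The following is only a minimal sketch under assumptions: the `contig_to_group` mapping is hypothetical and would have to come from whatever homology procedure is used, and the function names are invented for illustration.

```python
from collections import defaultdict


def aggregate_by_group(contig_counts, contig_to_group):
    """Sum per-contig read counts into per-group counts for one variety.

    contig_counts:   dict contig_id -> read count
    contig_to_group: dict contig_id -> homology group id (hypothetical mapping)
    Contigs without a group are kept as singleton groups under their own id.
    """
    group_counts = defaultdict(int)
    for contig, count in contig_counts.items():
        group = contig_to_group.get(contig, contig)
        group_counts[group] += count
    return dict(group_counts)


def build_count_matrix(per_variety_group_counts):
    """Build a group-by-variety table, filling missing groups with zero.

    per_variety_group_counts: dict variety -> dict group_id -> count
    Returns (sorted variety names, dict group_id -> list of counts).
    """
    varieties = sorted(per_variety_group_counts)
    groups = sorted({g for counts in per_variety_group_counts.values() for g in counts})
    matrix = {
        g: [per_variety_group_counts[v].get(g, 0) for v in varieties]
        for g in groups
    }
    return varieties, matrix
```

The resulting table has one row per homology group and one column per variety, which is the shape count-based tools expect. It does not address Issue 3: the total contig length behind each group still differs between varieties.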

    How would you tackle a case like this? Would you favour one approach over the other? I may be missing some major strategy (and perhaps duplicating another post on the issue), but forgive me, I'm a newcomer.

    Thank you!

    Federico

  • #2
    Dear Federico,

    In my experience, I have always used the second solution. You may be able to increase the completeness of the reference transcriptome used for mapping by adding public ESTs, if they are available for your species.



    • #3
      It's much easier to work with a single transcriptome for DE comparisons.

      In theory, a combined assembly sounds like an easy way to get that. Unfortunately, with typical de Bruijn graph assemblers it is not easy to prevent differences between the varieties from fragmenting the assembly.

      Perhaps it might be possible to merge the assemblies somehow, combining very similar transcripts into one.



      • #4
        There seem to be two strategies for a pooled assembly:
        1) pool all transcriptome reads from all varieties, then assemble
        2) pool only consensus contig sequences from your variety-specific assemblies and assemble those
        Not sure which option you used already but if it was (1), you might see better results with strategy (2).



        • #5
          Dear all, thanks for your replies!

          I adopted (my) second solution. This way I obtained valuable information on, e.g., variety-specific genes. Since differential expression analysis works well even in many-vs-zero cases (I'm using DESeq), I encountered no apparent problems. However, I still haven't ruled out problems arising from reads that align to nearly identical contigs: my pipeline discards them for having two identical hits in different contigs. In this respect, the approach suggested by Tony and by greigite (option 2, i.e. merging the contigs after variety-specific assembly) would be optimal.
          In that case, I would do an all-vs-all alignment of the contigs, group them into high-similarity clusters, multi-align each cluster, merge the alignment with something like consambig, and then use the results as merged contigs. However, this way the variety-specific information would be partially lost; SNPs, for example, would be completely ignored.
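The grouping step of that plan (all-vs-all hits into high-similarity clusters) can be sketched as single-linkage clustering with a union-find structure. This is an illustration only: the hit tuple layout, the function name and the thresholds (`min_identity`, `min_coverage`) are assumptions, not values from this thread, and every contig named in `hits` is assumed to also appear in `contig_ids`.

```python
def cluster_contigs(contig_ids, hits, min_identity=95.0, min_coverage=0.8):
    """Single-linkage clustering of contigs from all-vs-all alignment hits.

    contig_ids: iterable of contig names
    hits:       iterable of (query, subject, percent_identity, coverage_fraction)
    Thresholds are illustrative defaults, not recommendations.
    Returns a sorted list of clusters, each a sorted list of contig names.
    """
    contig_ids = list(contig_ids)
    parent = {c: c for c in contig_ids}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for query, subject, identity, coverage in hits:
        if query == subject:
            continue  # ignore self-hits
        if identity >= min_identity and coverage >= min_coverage:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q  # merge the two clusters

    clusters = {}
    for contig in contig_ids:
        clusters.setdefault(find(contig), []).append(contig)
    return sorted(sorted(members) for members in clusters.values())
```

Single linkage chains clusters together: if A matches B and B matches C, all three end up in one cluster even when A and C do not match directly. With fragmented contigs that may be desirable, but it can also merge paralogs, so the thresholds would need checking on real data.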

          So, everything considered, even a merged consensus contig solution does not seem to be the best one...
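An intermediate option for the discarded multi-hit reads would be to keep the contigs unmerged (so SNP information is retained) but count reads per cluster: a read that hits several contigs is kept whenever all of its hits fall in the same cluster. A minimal sketch, assuming a hypothetical `contig_to_cluster` mapping and a per-read list of hit contigs already extracted from the alignments:

```python
from collections import Counter


def count_reads_per_cluster(read_hits, contig_to_cluster):
    """Count reads per contig cluster, rescuing within-cluster multi-hits.

    read_hits:         dict read_id -> list of contig ids the read aligned to
    contig_to_cluster: dict contig_id -> cluster id (hypothetical mapping);
                       unclustered contigs count as their own cluster
    Returns (dict cluster_id -> read count, number of reads left uncounted).
    """
    counts = Counter()
    uncounted = 0
    for read_id, contigs in read_hits.items():
        clusters = {contig_to_cluster.get(c, c) for c in contigs}
        if len(clusters) == 1:
            # All hits point to one cluster: the read is unambiguous at cluster level.
            counts[next(iter(clusters))] += 1
        else:
            # Hits span several clusters (or there are no hits): leave the read out.
            uncounted += 1
    return dict(counts), uncounted
```

Reads whose hits span different clusters are still dropped here, so this reduces rather than eliminates the loss.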



          • #6
            Hi, you might be somewhere else entirely after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what your best strategy was in the end, and whether it worked well.



            • #7
              Originally posted by Birdman
              Hi, you might be somewhere else entirely after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what your best strategy was in the end, and whether it worked well.
              Ah, we ended up using the second approach and then validated selected differential expression calls. Classic. 80% of our findings had a significant match with RT-PCR results.



              • #8
                An 80% match with RT-PCR is great! So, in summary, you merged your assemblies and then aligned your samples separately to this super-assembly? How did you merge the similar contigs within this super-assembly? CD-HIT-EST or something else?



                • #9
                  Dear giorgifm,
                  have you already published a paper describing your method? I would like to take a closer look at how you solved this issue. Kind regards
