  • Differential transcript expression between different varieties of the same species

    Dear all,

    We have sequenced the transcriptomes of different varieties of the same species (which lacks complete genome information) using different read types, and obtained contigs with a respectable N50 for each variety.
    Now we want to use the reads for a differential transcript expression analysis, and I can think of two possible strategies.

    1) Map each variety's reads onto that variety's own contigs, then group the contigs via some homology procedure, and perform the analysis by comparing the counts within these groups.
    Issue 1: the same transcript is sometimes fragmented to different degrees across the assemblies (e.g. three fragments in one, the full sequence in another), making a direct 1-to-1 comparison inappropriate.
    Issue 2: robust orthology assignment methods (OrthoMCL, InParanoid etc.) are tuned for, and work principally on, proteins (as far as I know).
    Issue 3: all the differential expression tools that I know (e.g. edgeR) assume identical lengths for the contigs targeted by the read match counts.

    2) An alternative is to build a single assembly from the reads of all varieties, and then map each variety's reads separately onto these contigs, thereby solving all the previous issues. However, it sounds dirty, and the joint assembly is very fragmented compared to the variety-specific ones.
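To make Issues 1 and 3 of strategy 1 concrete, here is a minimal Python sketch of how counts could be compared at the level of homology groups rather than single contigs. It is only an illustration under stated assumptions: the function name `summarise_groups`, the contig and group identifiers, and the toy numbers are all hypothetical, and the fragments of one transcript are assumed not to overlap, so their lengths can simply be added. The length-normalised values (TPM-style) are meant for inspection; count-based tools such as edgeR or DESeq expect raw counts, so the summed raw counts per group are returned as well.

```python
from collections import defaultdict


def summarise_groups(counts, lengths, contig_to_group):
    """Sum read counts and contig lengths per homology group.

    Returns (raw counts per group, TPM-style values per group).
    Assumes fragments within a group do not overlap (hypothetical setup).
    """
    group_counts = defaultdict(int)
    group_length = defaultdict(int)
    for contig, n_reads in counts.items():
        group = contig_to_group[contig]
        group_counts[group] += n_reads
        group_length[group] += lengths[contig]

    # Reads per base for each group, then rescaled to sum to one million.
    rates = {g: group_counts[g] / group_length[g] for g in group_counts}
    total = sum(rates.values())
    tpm = {g: (r / total * 1e6 if total else 0.0) for g, r in rates.items()}
    return dict(group_counts), tpm


# Toy example: in this variety, group g1 was assembled as three fragments,
# while group g2 is a single full-length contig.
raw, tpm = summarise_groups(
    counts={"a1": 30, "a2": 50, "a3": 20, "a4": 100},
    lengths={"a1": 300, "a2": 500, "a3": 200, "a4": 1000},
    contig_to_group={"a1": "g1", "a2": "g1", "a3": "g1", "a4": "g2"},
)
print(raw)
print(tpm)
```

Running the same function on each variety's own assembly gives one value per group and per variety, which sidesteps the different fragmentation levels, provided the homology grouping itself (Issue 2) can be trusted.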

    How would you tackle a case like this? Would you favour one approach over the other? Possibly I'm missing some major strategy (and perhaps I'm duplicating another post on the issue), but forgive me, I'm a fresher.

    Thank you!

    Federico

  • #2
    Dear Federico,

    In my experience, I have always adopted the second solution. Maybe you can increase the completeness of the reference transcriptome used for mapping with some public ESTs, if they are available for your species.


    • #3
      It's much easier to work with a single transcriptome for DE comparisons.

      In theory, a combined assembly sounds like an easy way to get that. Unfortunately, it's not easy to prevent differences between varieties from fragmenting the assembly when using typical de Bruijn assemblers.

      Perhaps it might be possible to merge the assemblies somehow, combining very similar transcripts into one.


      • #4
        There seem to be two strategies for a pooled assembly:
        1) pool all transcriptome reads from all varieties, then assemble
        2) pool only consensus contig sequences from your variety-specific assemblies and assemble those
        I'm not sure which option you have used already, but if it was (1), you might see better results with strategy (2).


        • #5
          Dear all, thanks for your replies!

          I adopted (my) second solution. This way I obtained valuable information on, e.g., variety-specific genes. Since differential expression analysis works well even in many-vs-zero cases (I'm using DESeq), I encountered no apparent problems. However, I still haven't ruled out problems arising from reads that align to nearly identical contigs: my pipeline discards them for having two identical hits in different contigs. In this respect, the approach suggested by Tony and by greigite (option 2, i.e. merging the contigs after variety-specific assembly) would be optimal.
          But in that case, I would do an all-vs-all alignment of the contigs, group them into high-similarity clusters, multi-align them, merge them with something like consambig, and then use the results as merged contigs. However, this way the variety-specific information would be partially lost; SNPs, for example, would be completely ignored.

          So, everything considered, even a merged consensus contig solution does not seem to be the best one...
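The grouping step described in this post (all-vs-all alignment, then high-similarity clusters) can be sketched in Python as follows. This is a rough illustration, not the poster's pipeline: single-linkage clustering with a union-find structure is swapped in as one simple way to form the clusters, the 95% identity threshold is an arbitrary assumption, and the hit tuples `(query, subject, percent_identity)` are a hypothetical format loosely modelled on the first three columns of tabular BLAST output. The helper `assign_read` shows how a read hitting several nearly identical contigs could be kept when all its hits fall in one cluster, instead of being discarded. The multi-alignment and consensus (consambig) steps are not covered.

```python
def cluster_contigs(contigs, hits, min_identity=95.0):
    """Single-linkage clustering of contigs from pairwise alignment hits.

    hits: iterable of (query, subject, percent_identity) tuples (hypothetical format).
    Returns a dict mapping each contig to a cluster representative.
    """
    parent = {c: c for c in contigs}

    def find(c):
        # Follow parent links to the root, shortening the path on the way.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    for query, subject, identity in hits:
        if query != subject and identity >= min_identity:
            root_q, root_s = find(query), find(subject)
            if root_q != root_s:
                parent[root_s] = root_q
    return {c: find(c) for c in contigs}


def assign_read(hit_contigs, contig_to_cluster):
    """Return the cluster a read belongs to, or None if its hits span clusters."""
    clusters = {contig_to_cluster[c] for c in hit_contigs}
    return clusters.pop() if len(clusters) == 1 else None


# Toy example with contigs from two varieties, A and B.
contigs = ["A_c1", "B_c1", "A_c2", "B_c2"]
hits = [
    ("A_c1", "B_c1", 99.1),   # near-identical across varieties: merged
    ("A_c2", "B_c2", 80.0),   # below threshold: kept apart
    ("A_c1", "A_c1", 100.0),  # self-hit: ignored
]
clusters = cluster_contigs(contigs, hits)
print(clusters)
print(assign_read(["A_c1", "B_c1"], clusters))  # both hits in one cluster
print(assign_read(["A_c1", "A_c2"], clusters))  # hits span two clusters
```

Counting reads per cluster in this way keeps the per-variety contigs (and their SNPs) intact, since nothing is collapsed into a consensus sequence; whether single linkage is too permissive for large gene families is something that would need checking on real data.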


          • #6
            Hi, you might be completely somewhere else, after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what was your best strategy at the end, and if it worked well.


            • #7
              Originally posted by Birdman:
              Hi, you might be completely somewhere else, after nearly 3 years, but I'm really curious about how this story ended... I'm facing the same kind of problem and I would like to know what was your best strategy at the end, and if it worked well.
              Ah, we ended up using the second approach and then validated selected differential expression calls. Classic. 80% of our findings had a significant match with RT-PCR results.


              • #8
                An 80% match with RT-PCR is great! So, in summary, you merged your assemblies together and then aligned your samples separately to this super-assembly? How did you merge the similar contigs within this super-assembly? CD-HIT-EST or something else?


                • #9
                  Dear giorgifm,
                  Have you already published a paper describing your method? I would like to have a closer look at how you solved this issue. Kind regards
