  • How can I identify homologous genes between these two datasets?

    Hi. Thanks for taking the time to read my question. I am a PhD student and need some help getting over a bump on a project I'm working on.

    I have an RNA-seq dataset. I aligned the reads to the reference genome with bowtie2, and I have a BAM file for this.

    I assembled a genome from the same reads using Trinity, then aligned the reads to the assembly using bowtie2. I have a BAM file for this as well. I have also ordered the contigs based on the reference genome using Mauve, and did some gene finding using RAST. It's not a perfect assembly by any means.

    I want to check gene expression levels between these two cases, but that means I have to identify the homologous genes. I need to be able to say, "In the first case gene A is expressed this much, and in the second case gene A is expressed that much." I just am not sure how to get there from where I'm at now. I was thinking maybe I somehow have to blast the data and parse out position values or something, but I'm not sure. I feel like people must have seen this problem before.

    I really appreciate any advice anyone can offer. Thanks very much in advance!

  • #2
    It is time to use an R package such as edgeR, DESeq2, etc.
    It will do the differential expression analysis for you.
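
Packages such as edgeR and DESeq2 start from a table of raw read counts per gene. As a rough sketch only, assuming you already have per-gene read counts for each alignment and a list of matched gene IDs, the two sets of counts could be lined up side by side like this. The function names, gene IDs, and counts below are all made up for illustration.

```python
import csv
import io


def build_count_table(ref_counts, asm_counts, homolog_pairs):
    """Line up raw counts for each matched (reference gene, assembly gene) pair.

    ref_counts / asm_counts: dicts of gene ID -> raw read count (hypothetical inputs).
    homolog_pairs: iterable of (reference gene ID, assembly gene ID) tuples.
    Pairs with a gene missing from either count dict are skipped.
    """
    rows = []
    for ref_gene, asm_gene in homolog_pairs:
        if ref_gene in ref_counts and asm_gene in asm_counts:
            rows.append((ref_gene, asm_gene, ref_counts[ref_gene], asm_counts[asm_gene]))
    return rows


def to_tsv(rows):
    """Serialise the table as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["ref_gene", "asm_gene", "ref_count", "asm_count"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up example values, purely to show the shape of the data.
example_ref_counts = {"refA": 120, "refB": 30, "refC": 5}
example_asm_counts = {"asmA": 110, "asmB": 42}
example_pairs = [("refA", "asmA"), ("refB", "asmB")]

example_rows = build_count_table(example_ref_counts, example_asm_counts, example_pairs)
print(to_tsv(example_rows))
```

The resulting tab-separated table can then be read into R for whichever package you choose.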


    • #3
      I would suggest a Reciprocal Best Blast Hit (RBBH) analysis as your first step in finding candidate homologues. If you have or expect to have lots of gene duplication in either species, then more sophisticated methods/analysis may be needed.

      e.g. You could use my script & Galaxy wrapper:


      See also the reference suggested in the help,

      Punta and Ofran (2008) The Rough Guide to In Silico Function Prediction, or How To Use Sequence and Structure Information To Predict Protein Function. PLoS Comput Biol 4(10): e1000160.
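
To make the RBBH idea concrete, here is a minimal sketch. It is not the script mentioned in this post. It assumes you have run BLAST in both directions (reference genes against assembly genes, and the reverse) and saved the results as BLAST+ tabular output (`-outfmt 6`) with the default 12 columns, where the last column is the bit score. The function names and the demo gene IDs are invented for illustration.

```python
import csv
import io


def best_hits(blast_tabular):
    """Return {query_id: (subject_id, bitscore)} with the top-scoring hit per query.

    blast_tabular: an iterable of tab-separated lines (e.g. an open file).
    Column 1 is the query ID, column 2 the subject ID, column 12 the bit score.
    On a tie, the hit that appears first in the file is kept.
    """
    best = {}
    for row in csv.reader(blast_tabular, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject, bitscore = row[0], row[1], float(row[11])
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return best


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a_id, b_id) pairs where each is the other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    pairs = []
    for gene_a, (gene_b, _score) in best_ab.items():
        if gene_b in best_ba and best_ba[gene_b][0] == gene_a:
            pairs.append((gene_a, gene_b))
    return sorted(pairs)


def demo_row(query, subject, bitscore):
    """Build one 12-column line; columns this sketch ignores get placeholder zeros."""
    return "\t".join([query, subject] + ["0"] * 9 + [str(bitscore)])


# Invented hits: refC's best hit is asmB, but asmB's best hit is refB,
# so refC has no reciprocal partner.
ref_vs_asm = io.StringIO("\n".join([
    demo_row("refA", "asmA", 550),
    demo_row("refA", "asmB", 200),
    demo_row("refB", "asmB", 360),
    demo_row("refC", "asmB", 90),
]) + "\n")
asm_vs_ref = io.StringIO("\n".join([
    demo_row("asmA", "refA", 550),
    demo_row("asmB", "refB", 360),
    demo_row("asmB", "refC", 90),
]) + "\n")

demo_pairs = reciprocal_best_hits(ref_vs_asm, asm_vs_ref)
print(demo_pairs)
```

As noted above, this simple one-to-one matching will miss or mis-pair genes when there is a lot of duplication, so treat the pairs as candidates only.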


      • #4
        @aprice67: If a reference genome is available what was the reason to do a trinity assembly? Were you expecting to improve on the annotation available?

        What exactly do you mean by this?
        "I want to check gene expression levels between these two cases"


        • #5
          Originally posted by maubp
          I would suggest a Reciprocal Best Blast Hit (RBBH) analysis as your first step in finding candidate homologues. If you have or expect to have lots of gene duplication in either species, then more sophisticated methods/analysis may be needed.

          e.g. You could use my script & Galaxy wrapper:


          See also the reference suggested in the help,

          Punta and Ofran (2008) The Rough Guide to In Silico Function Prediction, or How To Use Sequence and Structure Information To Predict Protein Function. PLoS Comput Biol 4(10): e1000160.
          http://dx.doi.org/10.1371/journal.pcbi.1000160
          @maubp: Thanks very much! I'm going to give this a try and see where it leads.
