  • mhadidi2002
    Member
    • Jun 2011
    • 24

    RNA-Seq annotation

    Hello,

    Is there any software that takes a reference genome and RNA-Seq reads as input and outputs a genomic annotation derived from those reads?

    Thanks
  • dpryan
    Devon Ryan
    • Jul 2011
    • 3478

    #2
    By that, do you mean a program that aligns RNA-seq reads against a reference genome (returning the aligned position and any mismatches), or do you want to add a particular annotation (presumably the associated gene or transcript) to already aligned reads? I assume you want the former, in which case you can use tophat/bowtie, novoalign, bwa, etc.


    • jjohnson
      Member
      • Aug 2009
      • 20

      #3
      If you are new to NGS analysis, I would also recommend a commercial platform such as CLC or DNAnexus for RNA-Seq analysis. CLC is a local workbench with yearly license fees (very reasonable for academics or non-profits), while DNAnexus is cloud-based with a pay-per-GB model.

      Service providers are also an option if you want the work done as consulting.

      If you have the informatics chops, DESeq, the Tuxedo suite, and several other packages can align your reads and help you annotate and evaluate expression data.
      Justin H. Johnson | Twitter: @BioInfo | LinkedIn: http://bit.ly/LIJHJ | EdgeBio


      • mhadidi2002
        Member
        • Jun 2011
        • 24

        #4
        Thanks, dpryan, for the reply.
        Actually, I already have the reads aligned. I need to add annotations to the aligned reads, including ORFs, splice variants, SNPs, and so on, and to see how all of these affect the protein.

        That's it..

        Thanks


        • mhadidi2002
          Member
          • Jun 2011
          • 24

          #5
          Hi jjohnson,

          Thanks for the reply.
          Can these programs annotate aligned reads? The information I need is ORFs, SNPs, splice variants, etc.

          Thanks again


          • Simon Anders
            Senior Member
            • Feb 2010
            • 995

            #6
            So, you want to know for every single one of your millions of reads whether it sits on a SNP, an ORF, a splice variant, etc., giving you a huge list with millions of lines. Are you sure you know what you want to do next with that? There is a reason why this is not how it is usually done.


            • mhadidi2002
              Member
              • Jun 2011
              • 24

              #7
              Simon, thanks for joining the conversation.

              The main aim is to find differences among 4 species, each of which has its own RNA reads aligned to the genome. These 4 species are subjected to different stresses, and I need to know the effect of these stresses on expression. Is that the right way? My assumption is that by knowing which gene each RNA was expressed from, I can figure out which genes are involved in tolerating the stress.

              Thanks


              • dpryan
                Devon Ryan
                • Jul 2011
                • 3478

                #8
                Sounds like what you actually want to do is look at differential expression between the groups. A forum search should turn up plenty of threads on that subject.
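To make "differential expression" concrete, here is a minimal sketch of the underlying idea: scale per-gene read counts by library size, then compare conditions as a log2 ratio. The gene names and counts are made up for illustration, and the function names are hypothetical. This is only the arithmetic, not a statistical test; packages such as DESeq additionally model replicates and variance, which this toy ignores.

```python
import math


def counts_per_million(counts):
    """Scale raw per-gene read counts by library size (reads per million)."""
    total = sum(counts.values())
    return {gene: 1e6 * n / total for gene, n in counts.items()}


def log2_fold_changes(control, stressed, pseudocount=1.0):
    """Return log2(stressed / control) per gene after CPM scaling.

    The pseudocount avoids division by zero for unexpressed genes.
    """
    cpm_control = counts_per_million(control)
    cpm_stressed = counts_per_million(stressed)
    return {
        gene: math.log2(
            (cpm_stressed.get(gene, 0.0) + pseudocount)
            / (cpm_control[gene] + pseudocount)
        )
        for gene in cpm_control
    }


# Hypothetical counts, invented for this example only.
control = {"geneA": 100, "geneB": 300, "geneC": 600}
stressed = {"geneA": 400, "geneB": 300, "geneC": 300}

lfc = log2_fold_changes(control, stressed)
for gene, value in sorted(lfc.items()):
    print(f"{gene}\t{value:+.2f}")
```

A positive value means the gene has relatively more reads under stress, a negative value fewer. With a single sample per condition there is no way to tell a real change from noise, which is why replicates and a dedicated package matter.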


                • chadn737
                  Senior Member
                  • Jan 2009
                  • 392

                  #9
                  Could you give some more details about the species and how you did the alignments? You say you aligned them to the genome. Do all four have a sequenced genome, or was there just a single genome you aligned them to? Also, do you know if there are annotation files for this genome (or these genomes), such as a GFF or GTF file?

                  I have done a similar experiment with 3 different plant species, for which we did not have an annotated genome. To get at the question of differential expression, we aligned to the sequenced genome of a closely related species and then used that as our reference.
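For anyone unsure what such an annotation file contains, here is a minimal sketch of pulling gene coordinates out of GFF lines. It assumes GFF3-style input: nine tab-separated columns with `key=value` attributes separated by semicolons. Older GFF versions and GTF format the last column differently, so the attribute parsing would need adjusting. The function name and the example records are hypothetical.

```python
def parse_gff_genes(lines, feature_type="gene"):
    """Collect features of one type from GFF3-style lines.

    Columns: seqid, source, type, start, end, score, strand, phase, attributes.
    Coordinates in GFF are 1-based and inclusive.
    """
    genes = []
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.split("\t")
        if len(cols) != 9 or cols[2] != feature_type:
            continue
        # Assumes GFF3 attributes such as "ID=geneA;Name=foo".
        attrs = dict(
            field.split("=", 1) for field in cols[8].split(";") if "=" in field
        )
        genes.append(
            {
                "id": attrs.get("ID", "unknown"),
                "chrom": cols[0],
                "start": int(cols[3]),
                "end": int(cols[4]),
                "strand": cols[6],
            }
        )
    return genes


# Hypothetical records, invented for this example only.
example = [
    "##gff-version 3",
    "chr1\texample\tgene\t1000\t2000\t.\t+\t.\tID=geneA;Name=hypotheticalA",
    "chr1\texample\texon\t1000\t1200\t.\t+\t.\tID=exonA1;Parent=geneA",
    "chr2\texample\tgene\t500\t900\t.\t-\t.\tID=geneB",
]

genes = parse_gff_genes(example)
for gene in genes:
    print(gene)
```

Having the annotation as plain records like this is what makes it possible to ask which gene a given aligned read falls in.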


                  • mhadidi2002
                    Member
                    • Jun 2011
                    • 24

                    #10
                    Hello Chadn737,

                    My work is similar to yours. I have 4 species of the same plant. Their genomes aren't sequenced, but there is a genome sequence for 1 closely related plant, which serves as the reference.

                    I aligned the 4 different species against the closely related species. The output is in BED and BAM files. I need to make a re-annotation of that reference genome based on these 4 species. I have an annotation for the reference genome, but it is in GFF format.

                    Do you have any ideas?

                    Thanks for the discussion.


                    • chadn737
                      Senior Member
                      • Jan 2009
                      • 392

                      #11
                      This is exactly the situation we have worked with.

                      Your problem of "annotation" is fairly straightforward. What you really want is a list of gene names and the number of reads mapping to each, for downstream analysis like differential expression, right?

                      There are a couple of options for this. The approach I prefer is htseq-count, which was written by Simon Anders, but there are other approaches using bedtools and the like.

                      If you don't mind me asking, what genome did you align to? If it is Arabidopsis, maize, or another well-annotated plant species, then there will be a lot more tools available for downstream analysis once you find your differentially expressed genes.
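To illustrate what "number of reads mapping to each gene" means, here is a minimal sketch of overlap-based counting. It is not htseq-count's actual implementation, only a simplified stand-in: a read is counted for a gene when it overlaps exactly one gene, and is otherwise tallied as `no_feature` or `ambiguous`. It assumes gene coordinates are 1-based inclusive (as in GFF) and read coordinates are 0-based half-open (as in BED). The function name and all data are hypothetical.

```python
from collections import Counter


def count_reads_per_gene(genes, reads):
    """Count reads overlapping exactly one gene.

    genes: (gene_id, chrom, start, end), 1-based inclusive (GFF style).
    reads: (chrom, start, end), 0-based half-open (BED style).
    """
    counts = Counter({gene_id: 0 for gene_id, _, _, _ in genes})
    for chrom, r_start, r_end in reads:
        hits = [
            gene_id
            for gene_id, g_chrom, g_start, g_end in genes
            # The gene spans [g_start - 1, g_end) in 0-based half-open terms.
            if g_chrom == chrom and r_start < g_end and r_end > g_start - 1
        ]
        if len(hits) == 1:
            counts[hits[0]] += 1
        elif not hits:
            counts["no_feature"] += 1
        else:
            counts["ambiguous"] += 1
    return counts


# Hypothetical genes and reads, invented for this example only.
genes = [
    ("geneA", "chr1", 1000, 2000),
    ("geneB", "chr1", 1900, 2500),
    ("geneC", "chr2", 500, 900),
]
reads = [
    ("chr1", 1100, 1150),  # inside geneA only
    ("chr1", 1200, 1250),  # inside geneA only
    ("chr1", 1950, 2000),  # overlaps geneA and geneB
    ("chr2", 499, 550),    # overlaps the first base of geneC
    ("chr2", 100, 150),    # overlaps no gene
]

counts = count_reads_per_gene(genes, reads)
for name, n in sorted(counts.items()):
    print(f"{name}\t{n}")
```

The linear scan over all genes for every read is fine for a toy but far too slow for millions of reads; real tools use indexed interval lookups and typically count over exons grouped by gene rather than whole gene spans.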
