
  • #16
    Hi Azazel, you might want to check out tools from the Salzberg lab, in particular cufflinks, cuffdiff and cuffcompare:



    I'm new to RNASeq data analysis as well, but those three tools do the kind of thing you seem to have in mind.

    Comment


    • #17
      Hi Azazel,

      I agree with you that the "differential expression of whole genes taken from UCSC" approach does ignore some important information provided by RNA-seq. However, I do not agree that this means the approach is without value.

      The alternative you suggest is to use the data itself to produce an annotation, against which your analysis can proceed. There will be circumstances where this is indeed a superior approach, but this will not always be the case. Firstly, the ability to annotate a gene depends on the level of coverage, which for RNA-seq depends on the level of transcription. By relying on de novo annotation routines you will ignore, or at the very least bias against, lowly expressed genes. There are certainly situations where the accurate identification of differential expression in lowly expressed genes is vital to the biological question driving the experiment. The example I am familiar with is knock-down experiments of polycomb group proteins, where small expression changes in lowly expressed genes are important, but I am sure there are many others. A related point is that smaller experiments may lack the depth for accurate de novo annotation, even for highly expressed genes.

      Furthermore, there is obviously extra information that is still ignored by doing differential expression against de novo annotated genes, such as differential splicing or allele-specific expression.

      Ultimately, there are many different things that can be done with RNA-seq data and it seems to me that different analysis techniques offer complementary information rather than opposing, mutually exclusive viewpoints. How important different aspects of RNA-seq analysis are will depend on the biological question you are trying to answer.

      The point of this guide is not to be a "one size fits all" guide to analyzing RNA-seq, but to provide a step-by-step introduction to one of the simpler (and possibly better understood) analysis methods available. It was my original hope (and it still is) that this guide could form the basis for some kind of "RNA-seq analysis wiki", where people with more expertise in other areas of the analysis could add to it. For example, I think a section describing how to create an annotation using a reference genome and some RNA-seq data would be extremely useful, but I don't have the expertise to write it myself.

      Comment


      • #18
        Hi Matt,

        Really really useful. Thank you.

        I was wondering if you, or anyone else reading this, knows of a guide/workflow on novel transcripts/RNA editing at a level of excellence similar to this guide.

        Comment


        • #19
          Been checking out the tutorial that was posted on this thread. Can anyone comment on what the command:

          new_read_chr_names=gsub("(.*)[T]*\\..*","chr\\1",rname(reads))

          is doing? I get that it makes a new list of chromosome names and that the example uses gsub to do the substitutions, but I don't understand what's going on in the first two fields of that command:

          "(.*)[T]*\\..*"

          and

          "chr\\1"

          In other words, I have no idea how to make sure that my chromosomal names will match up with ones in the genome (NCBI headers put in a bunch of noise) because of my naivete about syntax. Any thoughts? Thanks.

          Comment


          • #20
            Hi,

            Thanks a lot

            This is really helpful for a fresher in bioinformatics like me.

            I hope our seniors will take MDY's idea of developing this further.
            I know many are working hard to develop such material but there is limited access to it.

            Please keep us informed of any developments.

            Nyine

            Comment


            • #21
              Hi,
              This seems to be a very good walkthrough. One thing though: why did you use the R package to check for differential expression as opposed to using cufflinks? In my experience, cufflinks is very easy to use, since you are producing sam files from bowtie and both programs are developed by UMD.

              Just wondering what you thought!

              Thanks!

              Andrew

              Comment


              • #22
                Gunzip - I unfortunately know of no such guide. Sorry.

                JueFish - This command uses regular expressions to rename chromosomes. There are many tutorials on the internet on how to use regular expressions (see http://www.regular-expressions.info/ for example) and the R help for gsub has some useful info too. The point is basically that I have chromosomes named like this:
                10.1-129993255
                11.1-121843856
                12.1-121257530
                13.1-120284312
                14.1-125194864
                15.1-103494974
                16.1-98319150
                17.1-95272651
                18.1-90772031
                19.1-61342430
                2.1-181748087
                3.1-159599783
                4.1-155630120
                5.1-152537259
                6.1-149517037
                7.1-152524553
                8.1-131738871
                9.1-124076172
                MT.1-16299
                X.1-166650296
                Y.1-15902555
                1.1-197195432
                But my annotation files have them named like this:
                chr1
                chr10
                chr11
                chr12
                chr13
                chr14
                chr15
                chr16
                chr17
                chr18
                chr19
                chr2
                chr3
                chr4
                chr5
                chr6
                chr7
                chr8
                chr9
                chrM
                chrX
                chrY
                The regular expression just converts the names. You don't have to use a regular expression; it's just a time-saving tool.
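
As a rough illustration of the renaming described above, here is a minimal R sketch. The vectors `read_chr_names` and `annotation_chr_names` are made-up stand-ins for `rname(reads)` and the annotation's chromosome names, and the pattern is a simplified variant, not the exact one from the guide. Because a greedy `(.*)` can capture `MT` in full, the sketch renames the mitochondrial chromosome explicitly and then checks that every converted name is present in the annotation.

```r
# Read names as they appear in the alignment (examples taken from the list above).
read_chr_names <- c("10.1-129993255", "2.1-181748087", "MT.1-16299", "X.1-166650296")

# Annotation-style names the reads need to match (a subset, for illustration only).
annotation_chr_names <- c("chr10", "chr2", "chrM", "chrX")

# "^([^.]+)" captures everything before the first literal dot,
# "\\..*$" matches that dot and the rest of the name,
# and "chr\\1" rebuilds the name as "chr" plus the captured text.
new_names <- sub("^([^.]+)\\..*$", "chr\\1", read_chr_names)

# Handle the mitochondrial chromosome explicitly: the capture yields "chrMT",
# while the annotation above uses "chrM".
new_names[new_names == "chrMT"] <- "chrM"

# Sanity check: every converted name should exist in the annotation.
unmatched <- setdiff(new_names, annotation_chr_names)
stopifnot(length(unmatched) == 0)

print(new_names)
```

If your headers carry different noise (NCBI-style headers, for instance), the same idea applies: capture the part you want in parentheses, match the rest, and always finish with a check like the `setdiff` call so mismatches are caught early.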


                plassaaw - cufflinks/cuffdiff does not do the same thing as an edgeR analysis. Both are useful in different circumstances (see my response to Azazel's post).

                Comment


                • #23
                  Very nice guide!

                  I do have one point- the maketranscriptDb function in GenomicFeatures seems like it is only useful for model organisms (at least, I am not smart enough to make it work for my organism.)

                  I have managed to use the girafe package to do something similar using just the gff. I can write up something if people think it would be useful.

                  Comment


                  • #24
                    Excellent guide! Looking forward to a ChIP-seq analysis guide as well...

                    Comment


                    • #25
                      Thanks for the useful resource!
                      --
                      bioinfosm

                      Comment


                      • #26
                        Hi Matt,

                        Really really useful. Thank you

                        Comment


                        • #27
                          non-model organisms & GenomicFeatures

                          Originally posted by ge_SF
                          Very nice guide!

                          I do have one point- the maketranscriptDb function in GenomicFeatures seems like it is only useful for model organisms (at least, I am not smart enough to make it work for my organism.)

                          I have managed to use the girafe package to do something similar using just the gff. I can write up something if people think it would be useful.
                          Yes, this would be useful! Please post, it would be much appreciated. I'm working on libraries from maize, which is not on UCSC. Not sure exactly how I would change the maketranscriptDb function to direct it to the maize database, so any way around this would be helpful.

                          Great thread by the way, very helpful!

                          karl

                          Comment


                          • #28
                            My method for non-model organisms

                            This will get you to the point of having a count table you can use in edgeR. Of course, you can also use tophat/cufflinks with your gff file. I think ideally using both methods will allow for a good comparison.

                            Please let me know if I left out anything or it is unclear. I did assume some knowledge of R, so it may not be suitable for pure beginners.

                            Note: I have revised the pdf to eliminate some typos and (hopefully) make it clearer. Please let me know if there are still errors!
                            Attached Files
                            Last edited by ge_SF; 04-05-2011, 08:56 AM. Reason: Revised pdf

                            Comment


                            • #29
                              I have girafe installed but it cannot find the function 'agiFromBam'. Other functions from the girafe library do work though. I am new to this kind of analysis and not an expert in R, so I apologize if this is something very simple. When I run library(help = girafe) to get the list of functions, 'agiFromBam' is not listed. Where should I be getting it from? Any tips?

                              Thanks!

                              Comment


                              • #30
                                Hmm, not sure what the problem is; it is definitely there in my function list. Is it possible it exists but isn't being listed for some reason? Try typing ?agiFromBam to see if the help page comes up (remember, R is case-sensitive, so if you accidentally type AgifromBam or something it won't work).

                                What version of girafe are you using? This was done with 1.2.0.
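
One way to check whether a function is actually exported by the installed version of a package is sketched below. `has_exported_function` is a hypothetical helper, not part of girafe or base R, and the `stats`/`median` pair is only a stand-in so the snippet runs without girafe installed; substitute "girafe" and "agiFromBam" to test the case above.

```r
# Hypothetical helper: TRUE if package `pkg` is installed and exports `fn`.
has_exported_function <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) && fn %in% getNamespaceExports(pkg)
}

# Stand-in example using a package that ships with R.
has_exported_function("stats", "median")

# The installed version, useful when comparing against the version a guide used.
packageVersion("stats")
```

If the helper returns FALSE for a function a guide relies on, a version mismatch between the installed package and the one the guide was written against is a likely cause.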

                                Comment
