
  • Inquiry: edgeR, DESeq, and baySeq algorithms explained?

    Hi,

    I have read the original papers for edgeR, DESeq, and baySeq. Can anyone please point me toward additional reading that explains how these algorithms/methods work at a high level? I have also read their manuals and vignettes on Bioconductor. I am okay with discussing some mathematical details, but I felt the original papers weren't as clear or detailed as I would have liked about how these methods work.

    If you do not know of a resource, but would be willing to spend the time to explain it in your own words, that would also be appreciated. However, I really would like to read some more resources on how these methods actually work.

    Thanks,
    GeekyOmega

  • #2
    I can recommend going through the code, that'll give you all of the detail you likely want (at least the DESeq2 code is nicely commented, making it relatively easy to follow along).


    • #3
      Hey guys,
      I have never used the edgeR, DESeq, or baySeq packages before, but now I need to use them to analyze my RNA-seq data. My question is: can I run these packages on my laptop to analyze differential gene expression in my data, or should I run them on a server?
      My data has more than 50,000 assembled transcripts and more than 80,000,000 raw reads.


      • #4
        At this point in time you will have the data mapped and summarized in a raw number (counts) matrix (50000 x some number of samples). You should be able to run DE analysis on your laptop (assuming you have enough RAM available).
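
As a rough sanity check on the RAM question, the raw size of a count matrix can be estimated with simple arithmetic. The sketch below is only an illustration: the function name `count_matrix_bytes` and the sample numbers are hypothetical, and it assumes 8 bytes per value (a double; R integers take 4). Actual memory use during a DE analysis is likely to be noticeably higher, since the packages keep additional working objects alongside the counts.

```python
def count_matrix_bytes(n_features: int, n_samples: int, bytes_per_value: int = 8) -> int:
    """Return the raw size in bytes of a dense n_features x n_samples matrix.

    bytes_per_value is an assumption: 8 for doubles, 4 for 32-bit integers.
    """
    return n_features * n_samples * bytes_per_value


def to_mib(n_bytes: int) -> float:
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 ** 2)


if __name__ == "__main__":
    # 50,000 features as in the thread; the sample counts are made-up examples.
    for n_samples in (6, 12, 48):
        size = count_matrix_bytes(50_000, n_samples)
        print(f"{n_samples} samples: {to_mib(size):.1f} MiB")
```

For 50,000 features and 12 samples this comes to 4,800,000 bytes, roughly 4.6 MiB, so the count matrix itself is small; the memory-hungry steps are the assembly and mapping that come before it.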


        • #5
          Originally posted by GenoMax
          At this point in time you will have the data mapped and summarized in a raw number (counts) matrix (50000 x some number of samples). You should be able to run DE analysis on your laptop (assuming you have enough RAM available).
          thanks @GenoMax

          My questions might be ridiculously easy, but there is no one in my lab who can give me tips on these analyses, so I am going to ask them here anyway. Please bear with me!
          Could you also recommend a tool or some manual links for producing the "data mapped and summarized in a raw number (counts) matrix (50000 x some number of samples)" you mentioned above?
          Also, my laptop has 4 GB of RAM. Is that enough for DE analysis of this data?
          thanks.


          • #6
            If you're starting from fastq files rather than a matrix of counts (what GenoMax mentioned) then you'll want to just use a different computer. One option would be the public Galaxy instance, where you could presumably use tophat2 or maybe STAR for the mapping step and featureCounts or htseq-count for the counting part. Having said that, you're probably best off trying to find a collaborator that you can offload this onto. You're going to get a better quality analysis if you work with someone that's already familiar with some of the nuances of NGS data analysis.
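
Once the counting step has produced one file per sample, those files still need to be combined into the single counts matrix mentioned earlier in the thread. A minimal sketch of that merge is below. It assumes htseq-count-style output (two tab-separated columns, feature ID and count, with summary lines whose ID starts with `__`); the function names and sample labels are hypothetical, and other counters use different layouts.

```python
import csv


def read_counts(path):
    """Read one per-sample file of 'feature<TAB>count' lines into a dict.

    Lines whose ID starts with '__' are treated as summary counters
    (an assumption about the file format) and skipped.
    """
    counts = {}
    with open(path, newline="") as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if len(row) < 2 or row[0].startswith("__"):
                continue
            counts[row[0]] = int(row[1])
    return counts


def build_matrix(sample_files):
    """Combine {sample_name: path} into a header and rows of raw counts.

    Features missing from a sample's file are filled with 0.
    """
    per_sample = {name: read_counts(path) for name, path in sample_files.items()}
    features = sorted(set().union(*per_sample.values()))
    header = ["feature"] + list(per_sample)
    rows = [[f] + [per_sample[s].get(f, 0) for s in per_sample] for f in features]
    return header, rows


def write_matrix(header, rows, out_path):
    """Write the matrix as a tab-delimited text file."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(header)
        writer.writerows(rows)
```

The resulting tab-delimited file has one row per feature and one column per sample, which is the shape of raw-count table that the DE packages discussed here expect as input.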


            • #7
              Originally posted by kurban910
              Also, my laptop has 4 GB of RAM. Is that enough for DE analysis of this data?
              thanks.
              In that case you may just want to stick with the server you were using before. DE analysis may work with 4 GB of RAM, but you will find out quickly if it does not.

              If I recall correctly, your mapping is already done, right?


              • #8
                Not yet, I just assembled the reads and got the trinity.fasta file.


                • #9
                  Originally posted by kurban910
                  Not yet, I just assembled the reads and got the trinity.fasta file.
                  Then definitely work on the server for the alignments and the rest. Use the laptop as a terminal to access the server.


                  • #10
                    Hi @dpryan
                    Originally posted by dpryan
                    If you're starting from fastq files rather than a matrix of counts (what GenoMax mentioned) then you'll want to just use a different computer. One option would be the public Galaxy instance, where you could presumably use tophat2 or maybe STAR for the mapping step and featureCounts or htseq-count for the counting part. Having said that, you're probably best off trying to find a collaborator that you can offload this onto. You're going to get a better quality analysis if you work with someone that's already familiar with some of the nuances of NGS data analysis.
                    Yes, I am starting with a FASTA file. I have no problem with read alignment and counting mapped reads, but R is new to me. Today I read the edgeR user guide and I still do not know how to get my data into edgeR.
                    Last edited by kurban910; 06-10-2015, 09:36 AM.
