  • DEG analysis without gff/gtf file

    My goal is to see differentially expressed genes across different time points.
    However, I want to map all reads based solely on the genome sequence, not on annotated features, because I am not certain that my annotation of the genome is correct or complete. So I do not want to use an annotation.

    In this case, after running TopHat without the "-G" option (that is, without supplying an annotation),
    what approaches could be used in the next step other than HTSeq or Cufflinks/Cuffdiff?

    I have been told that Cufflinks/Cuffdiff is not very powerful for detecting DEGs, and have been advised to use HTSeq with edgeR or DESeq instead. However, HTSeq requires a GFF file as input, so I need to take another approach. Could you please give me tips on what other programs could be used in my case?

    Thanks in advance.
    Last edited by syintel87; 01-07-2013, 09:06 AM.

  • #2
    Hi syintel87,

    I have recently been looking for a pipeline for RNA-Seq analysis and had the same doubt as you. As far as I know, in all cases (whether de novo assembly or reference-based mapping) you are going to need a GFF3/GTF file.

    Bernardo



    • #3
      How to get DEGs without a GTF/GFF file?

      Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a GFF/GTF file?

      If I use the annotation file, reads will only be assigned to annotated genes. This will exclude any reads that map to genes that have yet to be annotated.



      • #4
        Originally posted by syintel87
        Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a GFF/GTF file?

        If I use the annotation file, reads will only be assigned to annotated genes. This will exclude any reads that map to genes that have yet to be annotated.

        Well, at some point programs like rQuant, rDiff, DESeq or Cuffdiff are going to need a file with transcripts in order to quantify them in the *.bam files.

        Maybe there are other GTF/GFF3-independent tools that I don't know about yet.


        Bernardo



        • #5
          Even if you do not use Cuffdiff for the DE analysis, you can run Cufflinks on your samples to get sample-specific .gtf files. These annotations (which can contain novel transcripts/genes) can be merged afterwards with a reference .gtf file that you prefer (e.g. Ensembl's) using Cuffmerge, and you can use the resulting merged .gtf file for the DESeq/edgeR analyses.
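The workflow described above could be sketched roughly as follows. This is only an illustration that builds the command lines without running them. The sample names, file paths, and the `build_pipeline` helper are hypothetical, and the options shown (`cufflinks -o`, `cuffmerge -g/-s/-o`, `htseq-count -f/-r/-s`) should be checked against the versions you have installed. In particular, `-s no` assumes an unstranded library and `-r pos` assumes coordinate-sorted BAM files.

```python
from pathlib import Path


def build_pipeline(bams, reference_gtf, genome_fasta, workdir="work"):
    """Build (but do not run) the commands for Cufflinks -> Cuffmerge -> htseq-count.

    bams: dict mapping a sample name to its TopHat BAM file (hypothetical paths).
    Returns the list of commands and the text of the Cuffmerge manifest file.
    """
    work = Path(workdir)
    commands = []
    assemblies = []

    # 1. Assemble transcripts per sample; Cufflinks writes transcripts.gtf
    #    into the directory given with -o.
    for name, bam in bams.items():
        outdir = work / f"cufflinks_{name}"
        commands.append(["cufflinks", "-o", str(outdir), str(bam)])
        assemblies.append(outdir / "transcripts.gtf")

    # 2. Cuffmerge takes a text file listing one assembly GTF per line.
    manifest = work / "assemblies.txt"
    manifest_text = "\n".join(str(p) for p in assemblies) + "\n"
    merged_dir = work / "merged"
    commands.append([
        "cuffmerge",
        "-g", str(reference_gtf),   # reference annotation to merge in
        "-s", str(genome_fasta),    # genome sequence (optional for cuffmerge)
        "-o", str(merged_dir),
        str(manifest),
    ])
    merged_gtf = merged_dir / "merged.gtf"

    # 3. Count reads per gene of the merged annotation for each sample.
    #    htseq-count prints the counts to standard output.
    for name, bam in bams.items():
        commands.append([
            "htseq-count", "-f", "bam", "-r", "pos", "-s", "no",
            str(bam), str(merged_gtf),
        ])

    return commands, manifest_text


if __name__ == "__main__":
    cmds, manifest = build_pipeline(
        {"t0": "t0/accepted_hits.bam", "t1": "t1/accepted_hits.bam"},
        "reference.gtf",
        "genome.fa",
    )
    for cmd in cmds:
        print(" ".join(cmd))
```

The per-sample count tables produced in the last step would then be the input for DESeq or edgeR.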



          • #6
            Originally posted by adumitri
            Even if you do not use Cuffdiff for the DE analysis, you can run Cufflinks on your samples to get sample-specific .gtf files. These annotations (which can contain novel transcripts/genes) can be merged afterwards with a reference .gtf file that you prefer (e.g. Ensembl's) using Cuffmerge, and you can use the resulting merged .gtf file for the DESeq/edgeR analyses.

            Oh!!! How helpful this is!!!
            Thank you so much!!!
            That GTF file is exactly what I want to have!!!



            • #7
              Originally posted by syintel87
              Is there a way to achieve my goal, which is to see differentially expressed genes across different time points, without a GFF/GTF file?

              Well... it may sound silly, but to identify *differentially expressed* genes you need to identify *genes*.
              Either you provide them as known data in the form of an annotation file (GTF, GFF, BED, etc.), or you have to infer them from the reads, which is very challenging if you expect complete gene models. What you typically get are differentially expressed "genomic regions", also known as "transcribed fragments" (transfrags) or "transcriptionally active regions" (TARs), rather than complete "genes".

              As adumitri indicated, you can use Cufflinks (or BEDTools) to extract those transcribed regions from the mapped reads and merge them with some reference annotation, so that you can probe both known and unknown regions.
              I would recommend merging the reads from all the samples together, along with the reference annotation, so that the statistical method you choose next considers the exact same set of regions across samples/conditions. You should then find differentially expressed regions. Deciding whether two transcribed regions belong to the same gene/transcript is another question.
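To illustrate why a single shared set of regions matters, here is a minimal sketch in plain Python. It is not what Cufflinks or BEDTools do internally. It simply takes the union of overlapping (or directly adjacent) transcribed intervals across samples and then counts reads in each resulting region for every sample. The sample names, coordinates, and the functions `merge_regions` and `count_table` are made up for the example, and intervals are assumed to be half-open `(chrom, start, end)` tuples.

```python
def merge_regions(regions_by_sample):
    """Pool the intervals of all samples and merge those that overlap or touch.

    regions_by_sample: dict sample -> list of (chrom, start, end), half-open.
    Returns a sorted list of merged (chrom, start, end) tuples.
    """
    pooled = sorted(r for regions in regions_by_sample.values() for r in regions)
    merged = []
    for chrom, start, end in pooled:
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start <= last[2]:
            # Same chromosome and overlapping/adjacent: extend the last region.
            last[2] = max(last[2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(r) for r in merged]


def count_table(regions, reads_by_sample):
    """Count read positions falling in each region, one column per sample.

    reads_by_sample: dict sample -> list of (chrom, position).
    Returns (sample_names, {region: [count per sample]}).
    """
    samples = sorted(reads_by_sample)
    table = {}
    for chrom, start, end in regions:
        table[(chrom, start, end)] = [
            sum(1 for c, pos in reads_by_sample[s] if c == chrom and start <= pos < end)
            for s in samples
        ]
    return samples, table


if __name__ == "__main__":
    regions = merge_regions({
        "t0": [("chr1", 100, 200), ("chr1", 500, 600)],
        "t1": [("chr1", 150, 300), ("chr2", 10, 50)],
    })
    samples, table = count_table(regions, {
        "t0": [("chr1", 120), ("chr1", 250), ("chr1", 550)],
        "t1": [("chr1", 299), ("chr2", 10), ("chr2", 50)],
    })
    print(samples)
    for region, counts in table.items():
        print(region, counts)
```

Because every sample is counted against the same rows, the resulting table has the shape that count-based tools such as DESeq or edgeR expect, with regions in place of genes.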
