  • Generation of BED/GTF/GFF-files

    Dear All,

    I am relatively new to RNA-seq data analysis. I have done my share of reading, but theory and practice are two different things, so I keep running into small problems and questions for which I cannot find straightforward answers. Here is one; I hope you can help.

    I have aligned my reads to the reference genome (hg19) with TopHat2, and now I want to use DESeq(2) to identify differentially expressed genes. Obviously a genome annotation file (in this case GFF3) is needed, and I wonder what the best solution is. Should I get the annotation file from a source such as UCSC, or generate one from the BAM files I have, i.e. convert BAM to BED to GFF3 (to GTF)? In the latter case, how do I deal with the fact that I have multiple BAM files, one per sample? I expect differential gene expression, so I guess that a GFF3 generated from a BAM file of condition 1 will differ from one generated for condition 2.

    Thanks for your help.

    Steven

  • #2
    Where did you get your hg19 genome from? It would be best to get the GTF file from the same source.

    One solution is to get the GTF file from iGenomes. The downloads are large, but the files (sequences, annotation, indexes) are all in sync as far as the names etc. go, and this will save you time down the road by avoiding problems with mismatched annotations.
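
    If the genome and the annotation do come from different sources, one quick sanity check is to compare the sequence names in the two files before counting. The snippet below is only a rough sketch of that idea, not part of any tool mentioned in this thread: the function names are made up for illustration, and it assumes a tab-separated GTF/GFF file plus a FASTA index in the `.fai` layout written by `samtools faidx` (sequence name in the first column). The toy lines are invented data.

```python
def first_column_names(lines):
    """Collect the first tab-separated field of each non-empty, non-comment line."""
    names = set()
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and GTF/GFF header comments
        names.add(line.split("\t", 1)[0])
    return names


def compare_seqnames(gtf_lines, fai_lines):
    """Report which sequence names are shared or unique to either file."""
    gtf = first_column_names(gtf_lines)
    ref = first_column_names(fai_lines)
    return {
        "shared": sorted(gtf & ref),
        "gtf_only": sorted(gtf - ref),
        "reference_only": sorted(ref - gtf),
    }


# Toy example: the annotation uses "1" while the genome index uses "chr1".
toy_gtf = ['1\tsource\texon\t100\t200\t.\t+\t.\tgene_id "geneA";']
toy_fai = ["chr1\t1000\t6\t60\t61"]
report = compare_seqnames(toy_gtf, toy_fai)
print(report)
```

    With real data you would pass open file handles instead of lists, e.g. `compare_seqnames(open("genes.gtf"), open("genome.fa.fai"))` (both file names are placeholders). An empty "shared" list usually means a naming mismatch such as "chr1" versus "1", which is the kind of problem a matched set of files avoids.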
