  • GFF file formatting

    Hello,

    I downloaded some GFF files from the Ensembl website and only need a small subset of the rows in these files. So I opened them in Excel and kept only the rows I wanted.

    I saved them as tab-delimited text files and everything looked OK, but I am having trouble with downstream analysis.

    My question is: is there any software to check the formatting of my edited GFF files? If so, I would be really happy if you could share it with me.

    Also, is there any better way to edit GFF files than opening them in Excel? I heard line endings could also cause some problems between Mac and Linux.

    Thank you,
    Neel

  • #2
    There are some tools for manipulating GFF files.
    If you have a little experience with Perl, you could use them.



    • #3
      Originally posted by naluru:
      Also, is there any better way to edit GFF files than opening them in Excel.
      If you are on Linux, you can edit text files using grep, gawk, sed, perl, etc.
      There are good Linux tutorials online.

      Originally posted by naluru:
      I heard line endings could also cause some problems between Mac and Linux.
      I tried this:
      http://www.google.com/search?hl=en&s...l=&oq=dos2unix
      http://www.google.com/search?hl=en&s...ql=&oq=mac2lin
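
If no conversion tool is at hand, the same kind of line-ending clean-up can be sketched in a few lines of Python. This is only an illustration, not what any particular tool does internally; the function names `normalize_line_endings` and `convert_file` are made up for this example, and it assumes the file is small enough to read into memory.

```python
def normalize_line_endings(data: bytes) -> bytes:
    """Convert Windows (CRLF) and classic Mac (CR) line endings to Unix (LF)."""
    # Replace CRLF first so that its CR is not converted a second time.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write an LF-only copy to dst_path."""
    with open(src_path, "rb") as src:
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(normalize_line_endings(data))
```

Working on bytes rather than text avoids Python's own newline translation, so the only change to the file is the line endings.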



      • #4
        I've tried to do something similar with a GFF file and it worked fine for me. You just need to make sure that the information in each column of the Excel spreadsheet is OK. I copied it into 010 Editor and saved it as a GFF file (I did this only because I had too many rows to fit into one spreadsheet).
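
Checking each column by eye gets tedious for large files, so here is a rough Python sketch of an automated check. It assumes the common nine-column, tab-separated GFF layout (seqid, source, type, start, end, score, strand, phase, attributes) and only tests a few basic rules; it is not a full validator, and the function names `check_gff_line` and `check_gff` are invented for this example.

```python
from typing import Iterable, List, Tuple


def check_gff_line(line: str) -> List[str]:
    """Return a list of problems found in one GFF feature line."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        return [f"expected 9 tab-separated columns, found {len(fields)}"]
    problems = []
    try:
        start, end = int(fields[3]), int(fields[4])
        if start > end:
            problems.append("start is greater than end")
    except ValueError:
        problems.append("start and end must be integers")
    if fields[6] not in {"+", "-", ".", "?"}:
        problems.append(f"unexpected strand value {fields[6]!r}")
    if fields[7] not in {"0", "1", "2", "."}:
        problems.append(f"unexpected phase value {fields[7]!r}")
    return problems


def check_gff(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Check every feature line; return (line number, problem) pairs."""
    report = []
    for number, line in enumerate(lines, start=1):
        # Skip blank lines and comment/directive lines starting with '#'.
        if not line.strip() or line.startswith("#"):
            continue
        for problem in check_gff_line(line):
            report.append((number, problem))
    return report
```

Calling `check_gff(open("my_file.gff"))` on an edited file (the file name is a placeholder) returns an empty list when none of these checks fail.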



        • #5
          Another reason why opening an annotation file with Excel should be avoided: gene names can be changed automatically (for example, converted to dates).



          • #6
            A quick way to tell whether it is an end-of-line issue: if you type "more my_file" at the command line, you will see the funky EOL characters that you won't necessarily see when just opening the text file.
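
If the terminal does not make those characters obvious, counting them directly is another option. The following is a minimal Python sketch; the function name `count_line_endings` is made up for illustration, and the file is assumed to fit in memory.

```python
def count_line_endings(data: bytes) -> dict:
    """Count Windows (CRLF), classic Mac (CR) and Unix (LF) line endings."""
    crlf = data.count(b"\r\n")
    # Every CRLF pair also contains one CR and one LF, so subtract the pairs.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"CRLF": crlf, "CR": cr, "LF": lf}
```

For example, `count_line_endings(open("my_file", "rb").read())` on a file written purely with Unix line endings reports zero for both `CRLF` and `CR`.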

            There are methods to fix the problem, but I agree with steven that the best thing is to do the editing in Unix. Most likely grep is what you need (to select certain rows based on whether they contain the pattern you are looking for). cut may also be useful; it selects by column. There are plenty of online guides for these.
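
For anyone without a Unix shell available, the row and column selection described above can be approximated in Python. This is a sketch of the idea rather than a reimplementation of grep or cut; `select_rows` and its parameters are hypothetical names chosen for this example.

```python
import re
from typing import Iterable, Iterator, Optional, Sequence


def select_rows(
    lines: Iterable[str],
    pattern: str,
    columns: Optional[Sequence[int]] = None,
) -> Iterator[str]:
    """Yield lines matching a regular expression, optionally keeping
    only the given 1-based tab-separated columns."""
    regex = re.compile(pattern)
    for line in lines:
        line = line.rstrip("\r\n")
        # Keep the row only if the pattern occurs somewhere in it.
        if not regex.search(line):
            continue
        if columns is None:
            yield line
        else:
            fields = line.split("\t")
            # Column numbers are 1-based, matching the convention used by cut.
            yield "\t".join(fields[c - 1] for c in columns)
```

A call such as `select_rows(open("my_file.gff"), r"\tgene\t", columns=[1, 4, 5])` (the file name is a placeholder) would keep only rows whose feature type is gene and output their sequence name, start, and end.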
