SEQanswers

Old 03-22-2011, 12:39 PM   #1
naluru
Member
 
Location: Woods Hole, Massachusetts

Join Date: Jul 2010
Posts: 16
GFF file formatting

Hello,

I downloaded some GFF files from the Ensembl website and need only a small subset of the rows in these files. So I opened them in Excel and picked the rows I wanted.

I saved them as tab-delimited text files and everything looked OK, but I am having trouble with downstream analysis.

My question is: is there any software to check the formatting of my edited GFF files? If so, I would be really happy if you could share it with me.

Also, is there any better way to edit GFF files than opening them in Excel? I heard line endings could also cause some problems between Mac and Linux.

Thank you,
Neel
Old 03-22-2011, 12:57 PM   #2
joscarhuguet
Member
 
Location: USA

Join Date: Feb 2010
Posts: 18

There are some tools for manipulating GFF files.
If you have a little experience with Perl, you could use them:

http://biowiki.org/GffTools#GFF_tools
Old 03-23-2011, 07:05 AM   #3
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Quote:
Originally Posted by naluru View Post
Also, is there any better way to edit GFF files than opening them in Excel?
If you are on Linux, you can edit text files using grep, gawk, sed, perl, etc.
There are nice Linux tutorials around.

Quote:
Originally Posted by naluru View Post
I heard line endings could also cause some problems between Mac and Linux.
I tried this:
http://www.google.com/search?hl=en&s...l=&oq=dos2unix
http://www.google.com/search?hl=en&s...ql=&oq=mac2lin
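
For readers who want to see what such a line-ending conversion does, here is a minimal Python sketch. It is not how dos2unix is implemented; the function names `detect_line_endings` and `to_unix_line_endings` are made up for illustration. It counts Windows (CRLF), old Mac (bare CR) and Unix (LF) line endings in raw bytes and rewrites them all as LF.

```python
def detect_line_endings(data: bytes) -> dict:
    """Count each line-ending style found in raw file bytes."""
    crlf = data.count(b"\r\n")
    # Bare CR (old Mac) and bare LF (Unix) exclude the ones inside CRLF pairs.
    cr = data.count(b"\r") - crlf
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "cr": cr, "lf": lf}


def to_unix_line_endings(data: bytes) -> bytes:
    """Convert CRLF and bare CR line endings to LF."""
    # Replace CRLF first so each pair becomes one LF rather than two.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


# Small in-memory demo; for a real file, read it with open(path, "rb").
sample = b"chr1\tgene\r\nchr2\tgene\rchr3\tgene\n"
print(detect_line_endings(sample))
print(to_unix_line_endings(sample))
```

Reading the file in binary mode matters here, because Python's text mode translates line endings on its own and would hide the problem.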
Old 03-29-2011, 08:04 AM   #4
rudi283
Member
 
Location: europe

Join Date: Sep 2010
Posts: 27

I've tried to do something similar with a GFF file and it worked fine for me. You just need to make sure that the information in each column of the Excel spreadsheet is OK. I copied it into 010 Editor and saved it as a GFF file (I did it only because I had too many rows to put them all into one spreadsheet).
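
Checking "the information in each column" can be partly automated. The sketch below is a minimal, non-authoritative check and not a full validator. It assumes the usual nine tab-separated GFF columns (sequence name, source, feature, start, end, score, strand, frame, attributes) and does not parse the attributes column, whose syntax differs between GFF2 and GFF3. The function names and the Excel-quoting heuristic are illustrative assumptions.

```python
import re


def check_gff_line(line: str) -> list:
    """Return a list of problems found in one GFF data line (empty if none)."""
    problems = []
    if line.endswith("\r"):
        problems.append("line ends with a carriage return (Windows line ending)")
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 9:
        problems.append(f"expected 9 tab-separated columns, found {len(fields)}")
        return problems
    seqid, source, feature, start, end, score, strand, frame, attributes = fields
    if not (re.fullmatch(r"[0-9]+", start) and re.fullmatch(r"[0-9]+", end)):
        problems.append("start and end must be integers")
    elif int(start) < 1 or int(start) > int(end):
        problems.append("start must be >= 1 and <= end")
    if score != ".":
        try:
            float(score)
        except ValueError:
            problems.append("score must be a number or '.'")
    if strand not in {"+", "-", ".", "?"}:
        problems.append("strand must be '+', '-', '.' or '?'")
    if frame not in {"0", "1", "2", "."}:
        problems.append("frame must be 0, 1, 2 or '.'")
    # Heuristic (assumption): a spreadsheet export may wrap the whole attributes
    # field in quotes and double any quotes inside it.
    if attributes.startswith('"') and attributes.endswith('"') and '""' in attributes:
        problems.append("attributes look quoted by a spreadsheet export")
    return problems


def check_gff_text(text: str) -> dict:
    """Map 1-based line numbers to problems, skipping comments and blank lines."""
    report = {}
    for number, line in enumerate(text.split("\n"), start=1):
        if not line.strip() or line.startswith("#"):
            continue
        problems = check_gff_line(line)
        if problems:
            report[number] = problems
    return report
```

To use it on a file, read the file in binary mode, decode it, and pass the text to `check_gff_text`; an empty dictionary means none of these particular checks failed, not that the file is guaranteed to be valid.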
Old 03-29-2011, 08:33 AM   #5
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269

Another reason why opening an annotation file with Excel should be avoided: gene names can be automatically changed (for example, names such as SEPT2 or MARCH1 may be converted to dates).
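
One rough way to spot this kind of damage after the fact is to scan the file for values that look like spreadsheet dates. The sketch below is a heuristic only. It assumes the converted names appear in a form such as "2-Sep" or "1-Mar", which may not cover every Excel locale or date format, and `find_date_like_values` is a hypothetical helper name.

```python
import re

# Assumed shape of a converted gene name: day number, hyphen, month abbreviation.
DATE_LIKE = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_date_like_values(text: str) -> list:
    """Return (line_number, value) pairs for values that look like dates."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if line.startswith("#"):
            continue
        # Split on tabs and on the separators commonly used inside attributes.
        for value in re.split(r'[\t;= "]+', line):
            if DATE_LIKE.match(value):
                hits.append((number, value))
    return hits
```

A hit does not prove that Excel changed the value, and the original name cannot be recovered from the date, so the safest fix is to go back to the unedited download.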
Old 03-29-2011, 12:21 PM   #6
ge_SF
Member
 
Location: SF Bay Area

Join Date: Mar 2011
Posts: 11

A quick way to tell whether it is an end-of-line issue: if you type "more my_file" on the command line, you will see the funky EOL characters that you won't necessarily see when just opening the txt file.

There are methods to fix the problem, but I agree with steven that the best thing is to do it in Unix. Most likely grep is what you need (to select certain rows, based on whether they contain the pattern you are looking for). cut may also be useful; it selects by column. There are plenty of online guides for these.
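
For anyone more comfortable with a script than with grep and cut, the same two ideas (select rows, then select columns) can be sketched in Python. This is only an illustration under the assumption of tab-separated GFF lines; `select_rows` and `select_columns` are made-up names, and unlike grep the row filter matches whole column values rather than patterns anywhere in the line.

```python
def select_rows(lines, feature=None, seqid=None):
    """Yield GFF data lines whose feature type and/or sequence name match."""
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip comment and blank lines
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 3:
            continue  # too short to hold a feature column
        if feature is not None and fields[2] != feature:
            continue
        if seqid is not None and fields[0] != seqid:
            continue
        yield line


def select_columns(lines, columns):
    """Yield tab-joined subsets of columns (1-based, similar to cut -f)."""
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        # Assumes every line has at least max(columns) fields.
        yield "\t".join(fields[i - 1] for i in columns)


# In-memory demo; with a real file, pass the open file handle as `lines`.
rows = [
    "##gff-version 3\n",
    "1\tensembl\tgene\t100\t200\t.\t+\t.\tID=g1\n",
    "1\tensembl\texon\t100\t150\t.\t+\t.\tParent=g1\n",
]
for row in select_columns(select_rows(rows, feature="gene"), [1, 4, 5]):
    print(row)
```

Because the selected rows are yielded unchanged, writing them straight back to a new file keeps the original tabs and attribute text intact, which is the main advantage over a round trip through a spreadsheet.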
Tags
gff, gff2, gff3
