SEQanswers

Old 01-30-2012, 05:43 PM   #1
ohsu
Junior Member
 
Location: oregon

Join Date: Jan 2012
Posts: 3
From Excel file to genomic features

Hi, everyone,

I'm totally new to RNA-seq. I collaborated with another lab to sequence one ChIP-seq library. Now I have some half-analyzed tag data in an Excel file, about 1000 rows in total, with column titles like this:
Chr CenterMeter SummedHeight ChrStart ChrEnd

How do I start from these data to generate the genomic features of these tags?

1. How do I convert xls files to BED or other formats?
2. How can I go from the file of tag chromosomal locations to the distribution of the tags across genomic features, like transcription units, CpG islands, repeats, etc.?

I can use a little bit of R, but I'm not an expert.

Thanks a lot for your help!
Old 01-31-2012, 12:08 AM   #2
arvid
Senior Member
 
Location: Berlin

Join Date: Jul 2011
Posts: 156

1. The BED format is simple:
http://genome.ucsc.edu/FAQ/FAQformat.html#format1
You can write that with write.table in R. You can read an Excel sheet with e.g. read.xls from the gdata package.

2. Could you re-write this question more clearly? If you mean that you want to know how to compare the genomic locations with known genomic features, you can use e.g. intersectBed from BEDtools, if you have the features in a GFF or BED file.
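For anyone who would rather script the conversion than do it by hand, here is a minimal Python sketch of the same idea (it is not the R route mentioned above). It assumes the sheet has been saved from Excel as CSV and that the column names are the ones listed in the first post. The function name `table_to_bed`, the `one_based` switch, and the CenterMeter/SummedHeight values in the sample are made up for illustration. BED starts are 0-based, and the thread does not say which convention the spreadsheet uses, so check that before relying on the output.

```python
import csv
import io


def table_to_bed(rows, chrom_col="Chr", start_col="ChrStart",
                 end_col="ChrEnd", one_based=False):
    """Yield tab-separated BED lines (chrom, start, end) from dict rows.

    one_based is an assumption switch: set it to True if the spreadsheet
    start coordinates are 1-based, because BED starts are 0-based.
    """
    for row in rows:
        start = int(row[start_col])
        end = int(row[end_col])
        if one_based:
            start -= 1
        yield f"{row[chrom_col]}\t{start}\t{end}"


# Hypothetical CSV export of the Excel sheet. The coordinates are taken
# from the thread; the CenterMeter and SummedHeight values are invented.
csv_text = (
    "Chr,CenterMeter,SummedHeight,ChrStart,ChrEnd\n"
    "chr13,38022413,12,38021582,38023245\n"
    "chr11,22920559,8,22920116,22921002\n"
)

bed_lines = list(table_to_bed(csv.DictReader(io.StringIO(csv_text))))

# No header row is written; the output holds only the three BED columns.
print("\n".join(bed_lines))
```

With a real file, `csv.DictReader(open("tags.csv", newline=""))` would replace the in-memory sample (`tags.csv` being a placeholder name), and the printed lines can be redirected into a `.bed` file.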
Old 01-31-2012, 11:55 AM   #3
ohsu
Junior Member
 
Location: oregon

Join Date: Jan 2012
Posts: 3
Hi Arvid, thanks a lot for your help.

I only have the tags' genomic locations in Excel files, like chr3, 1398765. I want to get the genomic features at these locations. For example, do these tags prefer to cluster at CpG islands, transcription units, or other regions?

Thanks a lot!
Old 01-31-2012, 12:05 PM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,081

You should be able to use the "table browser" from UCSC (http://genome.ucsc.edu) to get the genomic features info.

See the help page here: http://genome.ucsc.edu/goldenPath/he...ablesHelp.html

Tutorials available here: http://www.openhelix.com/cgi/tutorialInfo.cgi?id=28
Old 01-31-2012, 02:13 PM   #5
ohsu
Junior Member
 
Location: oregon

Join Date: Jan 2012
Posts: 3
Hi, I guess I didn't give enough information. I have a file like this:

chrom chromStart chromEnd
chr13 38021582 38023245
chr11 22920116 22921002
chr15 91737068 91740707
chr18 48206683 48207725
chr9 78326184 78327354
chr10 57693494 57697038
chrX 71688149 71689220
chrX 130684504 130685256
chr4 149618312 149620001
chr11 3031766 3033033
chr1 193808626 193809612
chr13 94710571 94711334
chr9 64025568 64026410
chr17 45705704 45706771
......
......

My question is: how can I get the genomic features at these locations?

I'm pretty new to bioinformatics. A little more detail would be much appreciated.

Thanks a lot!

PS: Thanks GenoMax, I tried your link, but I don't know where to upload the CSV file in the genome browser. It looks like it needs a table of identifiers (names/accessions).
Old 01-31-2012, 11:33 PM   #6
arvid
Senior Member
 
Location: Berlin

Join Date: Jul 2011
Posts: 156

Quote:
Originally Posted by ohsu View Post
hi, I guess I didn't give enough information. I have a file like this:

chrom chromStart chromEnd
chr13 38021582 38023245
chr11 22920116 22921002
......

My question is how I can get the genomic features of these locations?

Re-format that into a BED file (a text file containing the same data, with the columns separated by tabs). Then get a GFF file with the genomic features you would like to compare against, making sure that the chromosome naming is the same in both files. Then get the overlap of the two files with intersectBed from BEDTools, using the following command:

Code:
intersectBed -a your_chr_loc.bed -b genomic_features.gff > overlapping_features.gff
You can then load the resulting file into a spreadsheet or into a genome browser like IGV.
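To make it clearer what the overlap step computes, and to get at the "which features do the tags prefer" question, here is a rough pure-Python sketch. It is not how BEDTools is implemented, only an illustration of the interval logic; for real data BEDTools is the better tool. All function names and the GFF lines below are invented for the example. Two points worth knowing: GFF coordinates are 1-based and inclusive while BED is 0-based and half-open, and intersectBed has `-wa` and `-wb` options that report the original entries from both files if you want the matching feature on each output line.

```python
from collections import Counter


def parse_bed(lines):
    """Yield (chrom, start, end) from BED lines (0-based, half-open)."""
    for line in lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.rstrip("\n").split("\t")[:3]
        yield chrom, int(start), int(end)


def parse_gff(lines):
    """Yield (chrom, feature_type, start, end) from GFF lines.

    GFF is 1-based and inclusive, so the start is shifted by one to
    match the 0-based, half-open BED convention.
    """
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        yield fields[0], fields[2], int(fields[3]) - 1, int(fields[4])


def count_overlaps_by_type(regions, features):
    """Count regions per overlapped feature type; 'none' means no hit."""
    features = list(features)
    counts = Counter()
    for chrom, start, end in regions:
        hit_types = {
            ftype
            for fchrom, ftype, fstart, fend in features
            if fchrom == chrom and fstart < end and start < fend
        }
        counts.update(hit_types)
        if not hit_types:
            counts["none"] += 1
    return counts


# Regions from the thread; the GFF features are hypothetical.
bed = [
    "chr13\t38021582\t38023245",
    "chr11\t22920116\t22921002",
    "chrX\t71688149\t71689220",
]
gff = [
    "chr13\texample\tCpG_island\t38022000\t38022500\t.\t+\t.\tID=cpg1",
    "chr13\texample\tgene\t38010000\t38030000\t.\t+\t.\tID=gene1",
    # Ends exactly where the chr11 region begins, so it does not overlap.
    "chr11\texample\tgene\t22900000\t22920116\t.\t-\t.\tID=gene2",
    # Named "11" rather than "chr11", so it never matches.
    "11\texample\tgene\t22920000\t22921000\t.\t-\t.\tID=gene3",
]

counts = count_overlaps_by_type(parse_bed(bed), parse_gff(gff))
for feature_type, n in sorted(counts.items()):
    print(feature_type, n)
```

The last GFF line shows why the chromosome naming matters: a feature on "11" is silently ignored when the regions use "chr11". The nested scan is fine for about 1000 rows but would be slow against a whole-genome annotation.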

Tags
chip-seq, excel file, genomic feature, integration sites
