
**TheSeqGeek** · 02-10-2015, 12:14 PM

I tried using Excel and quickly realized I would need to run loops.

**colindaven** · 02-12-2015, 06:28 AM

This sounds like a problem you can use Galaxy for.

**sarvidsson** · 02-12-2015, 07:07 AM

So you have the chromosome and start/stop for your ~400 positions? Put them in BED format (tab-separated lines with "chromosome start stop"), get a GFF/GTF file with the genes for your genome (possibly filtered with grep for the features you are interested in), and use BEDTools (a Swiss Army knife for all annotation comparison needs), e.g. the "closest" command:

closest — bedtools 2.31.0 documentation

http://bedtools.readthedocs.org/en/latest/content/tools/closest.html
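To make the idea concrete, here is a minimal pure-Python sketch of what a "closest feature" lookup does. It is not how BEDTools is implemented, and the function names and toy coordinates are hypothetical. It assumes BED-style half-open intervals and defines the distance as the number of bases between two non-overlapping intervals, which may differ by one from the distance BEDTools itself reports.

```python
def gap(a_start, a_end, b_start, b_end):
    """Bases separating two half-open intervals; 0 if they overlap or touch."""
    return max(0, a_start - b_end, b_start - a_end)


def nearest_features(sites, genes):
    """Map each site name to (distance, gene name) of the nearest gene
    on the same chromosome, or None if that chromosome has no genes."""
    result = {}
    for chrom, s_start, s_end, s_name in sites:
        candidates = [
            (gap(s_start, s_end, g_start, g_end), g_name)
            for g_chrom, g_start, g_end, g_name in genes
            if g_chrom == chrom
        ]
        result[s_name] = min(candidates) if candidates else None
    return result


# Toy records (chromosome, start, end, name); illustrative values only.
sites = [
    ("Chromosome", 2985, 2998, "Site1"),
    ("Chromosome", 6738, 6751, "Site2"),
]
genes = [
    ("Chromosome", 351, 1724, "Gene1"),
    ("Chromosome", 1828, 2946, "Gene2"),
]

nearest = nearest_features(sites, genes)
for site_name, hit in nearest.items():
    print(site_name, hit)
```

This brute-force loop compares every site against every gene, which is fine for ~400 positions; BEDTools remains the practical choice for anything larger.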

**TheSeqGeek** · 02-13-2015, 12:57 PM

I did as "sarvidsson" suggested.

Both files contain the chromosome name, start position, stop position, and name of the feature/gene, without headings.

Here is an example.

My list of 400 positions is in a file called "toanno.bed", in the following format:
Chromosome 2985 2998 Site1
Chromosome 6738 6751 Site2

The list of genes I want to match them with is in a file called "genome.bed", in the following format:
Chromosome 351 1724 Gene1
Chromosome 1828 2946 Gene2

When I use the command
closestBed -a toanno.bed -b genome.bed > features.bed

I get a concatenated file containing both files head to tail... basically a long concatenate command...

I figured out that I was not actually producing a valid .bed file. Basically, the problem is with the Unicode encoding.

If you save your data with Excel, which only does UTF-16 ("Unicode 16"), you then have to re-save it as UTF-8. Wow, ridiculous.
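For anyone hitting the same issue, the re-encoding step can also be scripted. The following is a minimal sketch, assuming the export is UTF-16 text with a byte-order mark and Windows line endings. The function name and file names are hypothetical, and the demo writes its own small UTF-16 file rather than using real data.

```python
import os
import tempfile


def utf16_to_utf8(src_path, dst_path):
    """Re-encode a UTF-16 text file as UTF-8 with Unix line endings."""
    # The "utf-16" codec reads the byte-order mark and drops it from the text.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    with open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.write(text)


# Demo: create a small UTF-16 file like a spreadsheet export, then convert it.
workdir = tempfile.mkdtemp()
src = os.path.join(workdir, "toanno_utf16.txt")
dst = os.path.join(workdir, "toanno.bed")
with open(src, "w", encoding="utf-16", newline="") as handle:
    handle.write("Chromosome\t2985\t2998\tSite1\r\n"
                 "Chromosome\t6738\t6751\tSite2\r\n")

utf16_to_utf8(src, dst)
with open(dst, "rb") as handle:
    converted = handle.read()
print(converted)
```

The converted file is plain tab-separated UTF-8 without a byte-order mark, which is the form BED-reading tools generally expect.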

**AlliCox** · 02-18-2015, 08:57 PM

You could probably annotate the base pair positions using a tool that annotates lists of variants from NGS - if the position is near a gene, it would get annotated as upstream, downstream, intronic, etc. That would probably work for some of the positions. You could also align the bp positions to annotation information from 1000 genomes to find out if the site is in or near a gene.

**TheSeqGeek** · 02-19-2015, 05:52 AM

Originally posted by AlliCox

You could probably annotate the base pair positions using a tool that annotates lists of variants from NGS.

So what's the tool?

**sarvidsson** · 02-19-2015, 06:05 AM

Originally posted by TheSeqGeek

So what's the tool?

You could use SnpEff, but then you'd need to fake some VCF to get there. BEDTools is the tool for the job.

**TheSeqGeek** · 02-19-2015, 06:07 AM

Originally posted by sarvidsson

You could use SnpEff, but then you'd need to fake some VCF to get there. BEDTools is the tool for the job.

Yeah, I already got it to work with the BEDTools closestBed command. The only issue was the type of text editor I was using to generate the .bed file, as I described above for anyone else having similar issues.


Promoter Analysis
