Old 05-18-2013, 03:42 PM   #1
TCGA cancer data and bioinformatics design questions for SNP/miRNA analysis

I'm looking for some help either developing a pipeline or finding the proper tools and the correct data sets for the goal described below.

My languages of choice would be Python and R.

Goal: I'm looking to create a disease-specific profile of just SNPs, including SNPs in miRNAs and in miRNA target sites. Ideally I would get the chromosome, the location, and how the various SNPs mentioned above interact, to build a disease profile.

My first problem is using the TCGA data, which lists a ton of aberrant mutations in an LOH .txt format. I'd like to be able to map those mutations to SNPs, genes, or miRNAs (whatever entities they belong to). The TCGA datasheet is here. Example data is here for breast cancer. I guess I can use the miRNA and mRNA data from there as well.

Questions here:
  1. How do I decipher the LOH data to figure out whether it's meaningful and where it maps?
  2. Which tools should I use for mapping, and what format should the final data be in? FASTA?
  3. miRNA/targets and SNPs: next up is getting cancer-specific miRNAs and mRNAs and mapping SNPs to them. I'm assuming I would use dbSNP or the Sanger miRNA databases to get miRNAs/targets and seed sequences.

Part 2:
I'm a bit lost as to how to combine all these pieces of information, what formats to use for output (linked to the individual pieces), and which tools, if any, to use to gather all this data with Python. I think this tool, mirdsnp, is useful as well.

I'd appreciate any help with how to combine all this data, best practices for mapping SNPs and miRNAs, and any Biopython/Bioconductor tools or approaches. I'm having trouble with where to start, how to parse the LOH files to get meaningful data out, and how to combine that with the other tools.
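To make the mapping step concrete, here is a minimal Python sketch of the kind of thing I have in mind. Everything in it is an assumption for illustration: the column names (`sample`, `chrom`, `start`, `end`) are hypothetical and are not the actual TCGA LOH layout, and the feature list is made up rather than taken from dbSNP or a miRNA database. It reads tab-delimited segments and reports which annotated features each segment overlaps.

```python
import csv
import io
from collections import namedtuple

# One annotated region (gene, miRNA, target site, ...) with closed coordinates.
Feature = namedtuple("Feature", "chrom start end name kind")


def parse_segments(handle):
    """Yield (sample, chrom, start, end) from a tab-delimited file.

    The column names are hypothetical; adjust them to the real file header.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        yield row["sample"], row["chrom"], int(row["start"]), int(row["end"])


def overlapping_features(chrom, start, end, features):
    """Return features on the same chromosome that overlap [start, end]."""
    return [
        f for f in features
        if f.chrom == chrom and f.start <= end and start <= f.end
    ]


def annotate(handle, features):
    """Return one row per (segment, overlapping feature) pair."""
    rows = []
    for sample, chrom, start, end in parse_segments(handle):
        for f in overlapping_features(chrom, start, end, features):
            rows.append((sample, chrom, start, end, f.kind, f.name))
    return rows


# Made-up annotations and input, for illustration only.
EXAMPLE_FEATURES = [
    Feature("chr1", 100, 200, "geneA", "gene"),
    Feature("chr1", 150, 172, "mir-X", "miRNA"),
    Feature("chr2", 500, 600, "geneB", "gene"),
]
EXAMPLE_TEXT = (
    "sample\tchrom\tstart\tend\n"
    "S1\tchr1\t160\t180\n"
    "S2\tchr2\t50\t90\n"
)

if __name__ == "__main__":
    for row in annotate(io.StringIO(EXAMPLE_TEXT), EXAMPLE_FEATURES):
        print("\t".join(str(value) for value in row))
```

The linear scan over features is only meant to show the logic. For genome-scale annotation sets, an interval index or an existing overlap tool would presumably be the better choice, and that is part of what I'm asking about.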

I'm doing this in an exploratory way so that I can use this information to design and understand experiments later on.
prussiap

bioinformatic analysis, mirna identification, mirna library, snp chip 6.0
