  • prussiap
    Junior Member
    • May 2012
    • 9

    TCGA cancer data and bioinformatics design questions for SNP/miRNA analysis

    I'm looking for help either developing a pipeline or choosing the right tools and data sets for the goal below.

    My languages of choice are Python and R.

    Goal: I'm looking to create a disease-specific profile of SNPs, including SNPs in miRNAs and miRNA target sites. Ideally I would get the chromosome and location of each SNP, and how these SNPs interact, to build the disease profile.

    PART 1: TCGA
    My first problem is using the TCGA data, which lists a ton of aberrant mutations in an LOH .txt format. I'd like to map those mutations to SNPs, genes, or miRNAs (whatever entities they belong to). The TCGA datasheet is here. Example data is here for breast cancer. I guess I can use the miRNA and mRNA data from there as well.

    Questions here:
    1. How do I decipher the LOH data to figure out whether it's meaningful and where it maps?
    2. Which tools should I use for mapping, and what format should the final data be in? FASTA?
    3. miRNA/targets and SNPs: next up is getting cancer-specific miRNAs and mRNAs and mapping SNPs to them. I'm assuming I would use dbSNP or the Sanger miRNA databases to get miRNAs, targets, and seed sequences.
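For questions 1 and 2, the mapping step comes down to interval overlap: each LOH segment has a chromosome, start, and end, and so does each SNP or miRNA. Below is a minimal sketch of that idea in plain Python. It is not a parser for the real TCGA files. The column names (`sample`, `name`, `chrom`, `start`, `end`), the toy coordinates, and the feature names are all made up for illustration, and it assumes both tables use the same genome build and closed intervals.

```python
import csv
import io
from collections import defaultdict


def read_intervals(handle, name_field):
    """Read tab-delimited rows with chrom/start/end columns.

    Returns {chrom: [(start, end, name), ...]}. The column names are
    assumptions for this sketch, not the actual TCGA LOH layout.
    """
    by_chrom = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        by_chrom[row["chrom"]].append(
            (int(row["start"]), int(row["end"]), row[name_field])
        )
    return by_chrom


def find_overlaps(segments, features):
    """Return (segment_name, feature_name) for every overlapping pair.

    Two closed intervals overlap when each one starts no later than
    the other one ends.
    """
    hits = []
    for chrom, segs in segments.items():
        for s_start, s_end, s_name in segs:
            for f_start, f_end, f_name in features.get(chrom, []):
                if s_start <= f_end and f_start <= s_end:
                    hits.append((s_name, f_name))
    return hits


# Toy data only: hypothetical samples, features, and coordinates.
LOH_TEXT = (
    "sample\tchrom\tstart\tend\n"
    "S1\tchr1\t100\t500\n"
    "S2\tchr2\t1000\t2000\n"
)
FEATURE_TEXT = (
    "name\tchrom\tstart\tend\n"
    "snp_a\tchr1\t150\t150\n"
    "mirna_b\tchr1\t480\t560\n"
    "snp_c\tchr2\t2500\t2500\n"
)

segments = read_intervals(io.StringIO(LOH_TEXT), "sample")
features = read_intervals(io.StringIO(FEATURE_TEXT), "name")
hits = find_overlaps(segments, features)
print(hits)
```

The nested loop is quadratic per chromosome, which is fine for exploring but slow on genome-wide SNP tables. For real data, an interval-aware tool (for example the Bioconductor GenomicRanges package in R) is likely a better fit.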


    Part 2:
    I'm a bit lost as to how to combine all these pieces of information, what formats to use for output (linked to the individual pieces), and which tools, if any, to use to gather all this data using Python. I think this tool, mirdsnp, is useful as well.

    Any help is appreciated on how to combine all this data, best practices for mapping SNPs and miRNAs, and any Biopython/Bioconductor tools or approaches. I'm having trouble with where to start, how to parse the LOH files to get meaningful data out, and how to combine it with the other tools.

    I'm doing this in an exploratory way so that I can use this information to design and understand experiments later on.
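One possible shape for the combined output is a single tab-delimited table with one row per feature (SNP or miRNA), its type, its location, and the number of samples in which it is affected. The sketch below assumes the mapping step has already produced (sample, feature) pairs and that an annotation lookup is available. The function names, field names, and toy values are hypothetical and do not come from any TCGA, dbSNP, or mirdsnp format.

```python
import csv
import io
from collections import defaultdict


def build_profile(hits, annotations):
    """Collapse (sample, feature) pairs into one row per feature.

    annotations maps feature name -> (type, chrom, position); this
    structure is an assumption made for the sketch.
    """
    samples_by_feature = defaultdict(set)
    for sample, feature in hits:
        samples_by_feature[feature].add(sample)  # set ignores repeat hits

    rows = []
    for feature, samples in samples_by_feature.items():
        kind, chrom, pos = annotations[feature]
        rows.append({
            "feature": feature,
            "type": kind,
            "chrom": chrom,
            "pos": pos,
            "n_samples": len(samples),
        })
    # Most frequently affected features first, ties broken by name.
    rows.sort(key=lambda r: (-r["n_samples"], r["feature"]))
    return rows


def write_tsv(rows, handle):
    """Write the profile rows as a tab-delimited table with a header."""
    fields = ["feature", "type", "chrom", "pos", "n_samples"]
    writer = csv.DictWriter(
        handle, fieldnames=fields, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Toy inputs only: hypothetical samples and features.
toy_hits = [("S1", "snp_a"), ("S2", "snp_a"), ("S1", "mirna_b"), ("S1", "snp_a")]
toy_annotations = {
    "snp_a": ("SNP", "chr1", 150),
    "mirna_b": ("miRNA", "chr1", 480),
}

profile = build_profile(toy_hits, toy_annotations)
out = io.StringIO()
write_tsv(profile, out)
print(out.getvalue())
```

A flat table like this loads directly into R (`read.delim`) or pandas, so the same file can feed either side of a Python/R workflow.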
