  • TCGA cancer data and bioinformatics design questions for SNP/miRNA analysis

    I'm looking for help either developing a pipeline or choosing the right tools and data sets for the goal below.

    My languages of choice are Python and R.

    Goal: I'd like to create a disease-specific profile of SNPs, including SNPs in miRNAs and in miRNA target sites. Ideally I would get the chromosome and location of each SNP, and how these SNPs interact, to build the disease profile.

    PART 1: TCGA
    My first problem is using TCGA data, which lists a large number of aberrant mutations in an LOH .txt format. I'd like to map those mutations to SNPs, genes or miRNAs (whatever entities they belong to). The TCGA datasheet is here. Example data is here for breast cancer. I guess I can use the miRNA and mRNA data from there as well.

    Questions here:
    1. How do I decipher the LOH data to figure out whether it's meaningful and where it maps?
    2. Which tools should I use for mapping, and what format should the final data be in? FASTA?
    3. miRNA/targets and SNPs: next up is getting cancer-specific miRNAs and mRNAs and mapping SNPs to them. I'm assuming I'd use dbSNP or the Sanger miRNA database to get miRNAs/targets and seed sequences.
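
For question 1, one possible starting point is a plain coordinate-overlap check between LOH segments and annotated features. The sketch below is only an illustration: the `Interval` type, the function names and all coordinates are made up, and it assumes 1-based inclusive positions on a shared genome build, which would need to be confirmed for the real TCGA and annotation files.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical record for a genomic region (1-based, inclusive)."""
    chrom: str
    start: int
    end: int
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if the two regions share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def map_segments_to_features(segments, features):
    """Return {segment name: [names of features overlapping that segment]}."""
    # Group features by chromosome so each segment is only compared
    # against features on its own chromosome.
    by_chrom = defaultdict(list)
    for feature in features:
        by_chrom[feature.chrom].append(feature)

    hits = {}
    for seg in segments:
        hits[seg.name] = [f.name for f in by_chrom[seg.chrom] if overlaps(seg, f)]
    return hits


# Made-up example data: two LOH segments and three annotated features.
segments = [
    Interval("chr1", 1000, 5000, "LOH_1"),
    Interval("chr2", 200, 300, "LOH_2"),
]
features = [
    Interval("chr1", 4500, 4580, "miRNA_A"),
    Interval("chr1", 6000, 6000, "SNP_X"),
    Interval("chr2", 250, 250, "SNP_Y"),
]

result = map_segments_to_features(segments, features)
print(result)
```

This compares every segment against every feature on the same chromosome, which is fine for exploration; for genome-scale annotation an interval tree or a dedicated overlap tool would scale better.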


    PART 2:
    I'm a bit lost as to how to combine all these pieces of information, what output formats to use (linked to the individual pieces), and which tools, if any, to use to gather all this data with Python. I think this tool, mirdsnp, is useful as well.

    I'd appreciate any help on how to combine all this data, best practices for mapping SNPs and miRNAs, and any Biopython/Bioconductor tools or approaches. I'm having trouble knowing where to start, how to parse LOH files to get meaningful data out, and how to combine that with the other tools.
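
As a rough idea of what the combining step could look like, here is a minimal sketch that reads two tab-delimited tables with Python's standard `csv` module and writes one merged table of SNPs that fall inside LOH segments. The column names (`sample`, `chrom`, `start`, `end`, `snp_id`, `pos`, `feature`) and every value are hypothetical placeholders; the real TCGA LOH files and the SNP/miRNA annotation will have their own layouts.

```python
import csv
import io

# Hypothetical tab-delimited inputs; real files will use different columns.
LOH_TEXT = (
    "sample\tchrom\tstart\tend\n"
    "S1\tchr1\t1000\t5000\n"
    "S1\tchr2\t200\t300\n"
)
SNP_TEXT = (
    "snp_id\tchrom\tpos\tfeature\n"
    "SNP_A\tchr1\t4520\tmiRNA_seed\n"
    "SNP_B\tchr1\t9000\ttarget_site\n"
    "SNP_C\tchr2\t250\ttarget_site\n"
)

OUT_FIELDS = ["sample", "chrom", "pos", "snp_id", "feature"]


def read_tsv(text):
    """Parse tab-delimited text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def combine(loh_rows, snp_rows):
    """Keep each SNP that lies inside an LOH segment, tagged with the sample."""
    combined = []
    for seg in loh_rows:
        for snp in snp_rows:
            same_chrom = snp["chrom"] == seg["chrom"]
            inside = int(seg["start"]) <= int(snp["pos"]) <= int(seg["end"])
            if same_chrom and inside:
                combined.append({
                    "sample": seg["sample"],
                    "chrom": snp["chrom"],
                    "pos": snp["pos"],
                    "snp_id": snp["snp_id"],
                    "feature": snp["feature"],
                })
    return combined


def to_tsv(rows):
    """Serialize the combined rows back to tab-delimited text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=OUT_FIELDS, delimiter="\t",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


combined = combine(read_tsv(LOH_TEXT), read_tsv(SNP_TEXT))
print(to_tsv(combined))
```

A flat table with one row per sample/SNP pair keeps each record linked back to its source pieces and loads directly into R or pandas for later analysis.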

    I'm doing this in an exploratory way so that I can use the information to design and understand experiments later on.
