#1 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
Hello,
I have a project that requires access to an "online cancer related DNA database resource." Unfortunately, it doesn't look like I can get access to CGHub. Would someone mind helping me, perhaps by suggesting one that I can access? Thank you!
#2 |
Member
Location: Cordoba, Spain Join Date: Feb 2013
Posts: 21
www.cartagenia.com
You don't mention whether it needs to be free to access.
#3 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
AntonioRFranco - Thanks for the reply. Yes, free.
#4 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
http://www.cbioportal.org/public-portal/
There are at least a couple of companies that offer different views of the TCGA data via the web. Search on SeqAnswers.
#5 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
#6 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
GenoMax - Thanks for the reply. I probably should have mentioned this before: select ten (10) DNA sequence strings of length at least 1Mb related to a cancer gene from ten different individuals. Make sure the sequence data is in FASTQ format and stored in one file “DNA.fas”.
#7 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
Now you say.
You are not going to find DNA sequences in FASTQ format that are 1 Mb in length unless you assemble them yourself while preserving the Q-scores (assembly programs do not include Q-scores in the final sequence).
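For anyone unsure what the Q-scores are: a FASTQ record is four lines (name, bases, a `+` separator, and one quality character per base). A minimal sketch of reading one, assuming Phred+33 encoding and strictly four-line records; `read_fastq` and the example read are made up for illustration:

```python
import io


def read_fastq(handle):
    """Yield (name, sequence, qualities) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            break  # end of file
        seq = handle.readline().rstrip("\n")
        handle.readline()  # '+' separator line, ignored
        qual = handle.readline().rstrip("\n")
        if not header.startswith("@") or len(seq) != len(qual):
            raise ValueError(f"malformed FASTQ record: {header!r}")
        # Assumption: Phred+33 encoding, so quality = ASCII code - 33
        yield header[1:], seq, [ord(c) - 33 for c in qual]


# Hypothetical 4-base read, only to show the layout
example = io.StringIO("@read1\nACGT\n+\nII5!\n")
records = list(read_fastq(example))
```

Every base carries its own score, which is why a consensus sequence written out by an assembler (plain FASTA) cannot simply be renamed to FASTQ.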
#8 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
I have a few days to work on this project. I'm lost.
#9 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
Tell us more about what the entire project is about. Is the DNA sequence just one part of it?
#10 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
GenoMax - Thanks again for your suggestions and assistance. I'm just a little reluctant to say more about this project right now. I hope you understand.
#11 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
Though I don't feel I was doing anything unethical, others may disagree. Therefore, I deleted the complete requirements to avoid any conflict. Thank you.
Last edited by cambridge101; 12-19-2014 at 08:08 AM. Reason: Information not necessary. |
#12 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
I think you should delete the description from the post above. This appears to be a class project/homework?
Perhaps you should ask whoever assigned the project if they are certain about the FASTQ requirement.
#13 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
Thanks for your suggestion. I don't think I'll delete it [Did decide to delete]. I don't feel that I'm doing anything unethical. I hope that it's clear that I'm not asking for anyone to complete the project for me. I'm just having a hard time finding the FASTQ data I need.
Last edited by cambridge101; 12-19-2014 at 08:10 AM. Reason: Inconsistent with edit to earlier post.
#14 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
One place to look for long sequences (in FASTQ format) is here: http://www.pacificbiosciences.com/ne.../publications/ I see only a couple of cancer-related publications, and they are from 2012 (when the reads were not as long as they are today).
If you drop the cancer requirement, you can get some really long reads here: http://blog.pacificbiosciences.com/2...d-shotgun.html They are not going to be 1 Mb, so you would still need to do some assembly.
#15 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
I'm open to other suggestions if anyone has any. |
#16 |
Registered Vendor
Location: Eugene, OR Join Date: May 2013
Posts: 521
I only vaguely remember the details of the project from your deleted post, but if I were to guess what a reasonable assignment would be, it would be to select a cancer gene, then extract 1 Mb of sequence around the gene from ten different individual genomes, then analyze those 1 Mb regions for the various things asked for in the post.
You aren't going to find 1 Mb FASTQ reads, but you can find different individual genomes, or even different "cancer" genomes. You can definitely identify genes related to cancer. As others have said, I'd check back with the assigner of this project for clarification.
edit: I teach an upper level course in genomic methods and analysis, so am definitely curious what this assignment is about!
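As a rough illustration of the "1 Mb around the gene" idea, here is a small sketch that picks the window coordinates. The function name `window_around` and all coordinates below are hypothetical, and it assumes 1-based inclusive positions on a single chromosome:

```python
def window_around(gene_start, gene_end, chrom_length, size=1_000_000):
    """Return a 1-based inclusive (start, end) window of `size` bp centred on a gene."""
    if size > chrom_length:
        raise ValueError("window is longer than the chromosome")
    centre = (gene_start + gene_end) // 2
    start = centre - size // 2
    # Shift the window back inside the chromosome if it runs off either end
    start = max(1, min(start, chrom_length - size + 1))
    return start, start + size - 1


# Made-up gene coordinates and chromosome length, for illustration only
region = window_around(15_400_000, 15_600_000, 135_000_000)
```

The same window would then be pulled from each of the ten genomes, so the regions stay comparable between individuals.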
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com |
#17 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
I like SNPsaurus' interpretation. 1 Mb (total amount of data or length of region covered) worth of fastq reads in and/or around a cancer gene makes sense.
@SNPsaurus: Main input for the assignment is still in post #6. The rest of the assignment was informatics goals. This will entail a significant amount of work (data collection part) and I hope the assignment has an appropriate amount of credit (unless it is a PhD qualifier exam). |
#18 |
Member
Location: Boston Join Date: Dec 2014
Posts: 10
Alright... Let's say my oncogene of interest is in the region of 11:15000000..16000000. Therefore, all I need is that region from 10 different people.
Problem: Where do I find that data??? Any assistance is appreciated. |
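A region written as `11:15000000..16000000` can be turned into numbers, and into the `chrom:start-end` form that tools such as samtools accept, with a few lines. This is only a sketch; `parse_region` is a made-up helper and it assumes both ends are inclusive:

```python
import re

# Accepts "chrom:start..end" or "chrom:start-end", with optional thousands commas
REGION = re.compile(r"^(?P<chrom>[^:]+):(?P<start>[\d,]+)(?:\.\.|-)(?P<end>[\d,]+)$")


def parse_region(text):
    """Split a region string into (chrom, start, end) with integer coordinates."""
    m = REGION.match(text.strip())
    if m is None:
        raise ValueError(f"cannot parse region: {text!r}")
    start = int(m["start"].replace(",", ""))
    end = int(m["end"].replace(",", ""))
    if start > end:
        raise ValueError("region start is after region end")
    return m["chrom"], start, end


chrom, start, end = parse_region("11:15000000..16000000")
length = end - start + 1  # inclusive ends, so one base more than 1,000,000
dash_style = f"{chrom}:{start}-{end}"
```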
#19 |
Senior Member
Location: East Coast USA Join Date: Feb 2008
Posts: 7,083
Look for studies that have at least 10 samples (since you need 10) or take 10 samples from different cancer types.
http://sra.dnanexus.com/?result_type...q=tumor+exome+
http://sra.dnanexus.com/?result_type...q=cancer+exome
Last edited by GenoMax; 12-20-2014 at 02:07 PM.
#20 |
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
The comments here have links to sequences for PUBLIC human cancers ...
http://www.homolog.us/blogs/blog/201...able-from-bgi/
BGI liver cancer
Seoul Genomic Medicine Institute lung cancer
Changhai Hospital prostate cancer
MD Anderson Asian Gastric cancer
I think the data is in NCBI's SRA.
You'll need a lot of disk space and, if you're relatively new, a lot of patience. Sadly, a "bam slicer" that cuts out the reads for a region isn't available, though they say some folks are working on it.
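To make "cutting out the reads for a region" concrete, here is a toy sketch that works on SAM-formatted text lines that are already downloaded and aligned. It is not the remote slicer mentioned above; `slice_sam`, `reference_span`, and the example reads are made up for illustration:

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def reference_span(cigar):
    """Number of reference bases covered by an alignment, from its CIGAR string."""
    # Only M, D, N, = and X consume reference positions
    return sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")


def slice_sam(lines, chrom, start, end):
    """Yield SAM alignment lines overlapping chrom:start-end (1-based, inclusive)."""
    for line in lines:
        if line.startswith("@"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        rname, pos, cigar = fields[2], int(fields[3]), fields[5]
        if rname != chrom or cigar == "*":
            continue  # other chromosome, or no alignment recorded
        read_end = pos + reference_span(cigar) - 1
        if pos <= end and read_end >= start:
            yield line


# Hypothetical alignments: r1 overlaps the region, r2 ends before it, r3 is on chr 12
sam = [
    "@HD\tVN:1.6",
    "r1\t0\t11\t14999990\t60\t20M\t*\t0\t0\t*\t*",
    "r2\t0\t11\t14999900\t60\t20M\t*\t0\t0\t*\t*",
    "r3\t0\t12\t15000100\t60\t20M\t*\t0\t0\t*\t*",
]
kept = [l.split("\t")[0] for l in slice_sam(sam, "11", 15_000_000, 16_000_000)]
```

A linear scan like this still needs the whole file on disk, which is the disk space and patience problem all over again.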