#1 |
Junior Member
Location: Hungary Join Date: Mar 2011
Posts: 7
Hi all!
I would really like to try a certain program (http://www.cbcb.umd.edu/software/BRCA-diagnostic/), because in some ways it's similar to a project I'm working on. My problem is that I can't find the right data to try it on. It needs the genome of an individual (so not the reference genome) as raw short-read data. I'm not really familiar with these things, so I would really appreciate it if someone could tell me where to find suitable DNA sequences of this kind. The program uses the bowtie short-read aligner, so in other words, I need short reads of a human genome that bowtie can align, I guess.
Yours, Attila
Last edited by attilav; 08-18-2011 at 05:13 AM.
#2 |
Senior Member
Location: Boston area Join Date: Nov 2007
Posts: 747
You can find tons of datasets in the NCBI Short Read Archive and the European Nucleotide Archive.
You can also find already-aligned short-read datasets (in BAM format) all over the place. The 1000 Genomes Project is sequencing many individuals at low coverage, so it might not be good for you, but Watson's genome is available, as are many others (Venter's is available in long reads). Complete Genomics has made a large number of human genomes available on their website. The Personal Genome Project should have files up as well. (Aside: perhaps there should be a wiki section on repositories of human and other genome alignments.)
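Whichever archive the reads come from, it may be worth a quick sanity check that the download really is raw short reads before handing it to bowtie. The sketch below assumes the data is in plain four-line FASTQ (bowtie's default input format); the function names `read_fastq` and `summarize` and the two example reads are made up for illustration and do not come from any of the tools mentioned in this thread.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record at end of input")
        header, seq, plus, qual = (line.rstrip("\n") for line in record)
        if not header.startswith("@") or not plus.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(seq) != len(qual):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield header[1:], seq, qual


def summarize(handle):
    """Return the number of reads and the sorted distinct read lengths."""
    count = 0
    lengths = set()
    for _, seq, _ in read_fastq(handle):
        count += 1
        lengths.add(len(seq))
    return count, sorted(lengths)


# Hypothetical two-read example standing in for a downloaded file.
example = io.StringIO(
    "@read1\nACGTACGTAC\n+\nIIIIIIIIII\n"
    "@read2\nTTGACCA\n+\nIIIIIII\n"
)
n_reads, read_lengths = summarize(example)
print(n_reads, read_lengths)
```

For a real file, pass an open file handle instead of the `io.StringIO` example; read lengths of a few tens of bases up to roughly a hundred would be consistent with short-read data.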
#3 |
Junior Member
Location: San Francisco, CA Join Date: Jun 2011
Posts: 9
Unfortunately Bowtie is not optimized for Complete Genomics data. Specifically, Complete Genomics reads have sub-read gaps that Bowtie will interpret as mismatches. The high mismatch frequency will prevent Bowtie from successfully aligning many reads to the reference.
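To make the gap problem concrete, here is a toy Python sketch. It is not Bowtie's actual algorithm and not the real Complete Genomics read layout; it only assumes a simplified read made of two sub-reads separated by a 3-base gap, and compares it to the reference position by position, which is what an aligner that allows mismatches but no gaps effectively does. The reference string and the helper `count_mismatches` are invented for the example.

```python
def count_mismatches(read, reference, start):
    """Count position-by-position differences for an ungapped placement."""
    window = reference[start:start + len(read)]
    if len(window) < len(read):
        raise ValueError("read runs off the end of the reference")
    return sum(1 for a, b in zip(read, window) if a != b)


# Made-up 30-base reference, for illustration only.
reference = "ACGTTGCAAGCTTAGGCTAACGTCGATCCA"

# A contiguous 20-base read copied straight from the reference.
contiguous_read = reference[0:20]

# A simplified "gapped" read: 10 bases, skip 3 reference bases, 10 more bases.
gapped_read = reference[0:10] + reference[13:23]

contiguous_mm = count_mismatches(contiguous_read, reference, 0)
gapped_mm = count_mismatches(gapped_read, reference, 0)

# Placing the second sub-read at its true position removes the mismatches.
second_half_mm = count_mismatches(gapped_read[10:], reference, 13)

print(contiguous_mm, gapped_mm, second_half_mm)
```

Every base of the gapped read is correct, yet everything after the gap is shifted relative to the reference, so an ungapped comparison sees a run of apparent mismatches. A gap-aware tool that places each sub-read separately does not have this problem.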
If you are interested in working with SNP genotypes and Complete Genomics data, we strongly recommend using the Complete Genomics-developed snpdiff command in our open source CGA Tools package (http://cgatools.sourceforge.net/). This tool is specifically designed to extract SNP genotypes from Complete Genomics data, and to compare Complete Genomics genotype calls with SNP genotypes generated on other platforms.
__________________
Shaun Cordes, PhD | Customer Support Scientist | Complete Genomics, Inc.
Toll-free: (855) 267-5358 | Direct: (650) 943-2651
scordes@completegenomics.com