SEQanswers

08-07-2015, 02:49 AM   #1
Heena Farooq
Junior Member
 
Location: J&K, INDIA

Join Date: Aug 2015
Posts: 7
shotgun sequencing dataset

Please send me the shotgun sequence dataset of 27 million sequencing reads in FASTA format so that I can find the genes in that sequence. Thanks in advance! If possible, can anyone send me a link for downloading the Celera-generated shotgun sequence?

Last edited by Heena Farooq; 08-07-2015 at 03:11 AM.
08-07-2015, 04:04 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

Unless you provide some additional information you are not going to get meaningful help.

Why 27 million and why fasta format? You don't care about what organism the data is from?
08-07-2015, 08:07 AM   #3
Heena Farooq
Junior Member
 
Location: J&K, INDIA

Join Date: Aug 2015
Posts: 7

I have read the paper "Whole genome shotgun assembly and comparison of human genome assemblies", and I need to use its dataset: the shotgun dataset generated by Celera in 2001, which has 27 million sequencing reads. I need to review this paper by calculating the accuracy of its gene prediction, but I am not able to find the DNA sequence for this dataset so that I can find genes in it. Please help me out.
08-07-2015, 08:15 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

I assume you are referring to this paper: http://www.pnas.org/content/101/7/1916.full

Quote:
Abbreviations: WGSA, whole-genome shotgun assembly; CSA, compartmental shotgun assembly; WGA, whole-genome assembly.

Data deposition: The sequences of the assemblies herein referred to as WGSA, CSA, and WGA have been deposited in the GenBank database (whole-genome assembly project accession nos. AADD00000000, AADC00000000, and AADB00000000).

If so the projects in that paper can be found here (these links seem to now point to the current versions of the human genome though):

http://www.ncbi.nlm.nih.gov/genome/?term=AADD00000000
http://www.ncbi.nlm.nih.gov/genome/?term=AADC00000000

Third one does not seem to be in GenBank (AADB00000000).

Last edited by GenoMax; 08-07-2015 at 08:18 AM.
08-07-2015, 10:12 AM   #5
Heena Farooq
Junior Member
 
Location: J&K, INDIA

Join Date: Aug 2015
Posts: 7

Yes, exactly, that is the paper. I am new to this field, so I hope you don't mind. For example, AADC00000000 contains a number of sequences; it has 169156 contigs. Can you please tell me which sequence I need to use and how I can use it? Thanks for your concern, and please help.
08-07-2015, 10:32 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,800

It is not clear as to what you are trying to do.

A set of genes was predicted in the early 2000s based on the first assembly of the human genome, but that number has been shrinking over the years (see: http://www.the-scientist.com/?articl...Shrinks-Again/).

The current human genome contains fairly well organized chromosome sequences, which you can download from this link (in FASTA format): ftp://ftp.ncbi.nlm.nih.gov/genomes/a...genomic.fna.gz

I am not sure where you see 169156 contigs. The earliest dataset I can see on UCSC is from May 2000: http://hgdownload.soe.ucsc.edu/golde...y2000/bigZips/
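
Once a FASTA file has been downloaded, a quick way to see what it contains (how many sequences, and how long they are) is to walk through it record by record. Below is a minimal Python sketch, assuming a local plain or gzipped FASTA file. The function names `read_fasta` and `summarize_fasta` and the file name `genome.fna.gz` are made up for illustration and are not part of any NCBI or UCSC tool.

```python
import gzip
from typing import Dict, Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


def summarize_fasta(path: str) -> Dict[str, int]:
    """Count the sequences and bases in a plain or gzipped FASTA file."""
    # Assumption: gzipped files are recognised by a ".gz" suffix.
    opener = gzip.open if path.endswith(".gz") else open
    count = 0
    total = 0
    longest = 0
    with opener(path, "rt") as handle:
        for _header, seq in read_fasta(handle):
            count += 1
            total += len(seq)
            longest = max(longest, len(seq))
    return {"sequences": count, "total_bases": total, "longest": longest}


if __name__ == "__main__":
    # Hypothetical local file name; replace with the file you downloaded.
    print(summarize_fasta("genome.fna.gz"))
```

The file is streamed one record at a time rather than read in one call, which matters for genome-sized downloads. A contig-level project would be expected to report many more records than a chromosome-level assembly, so the `sequences` value is a simple first check of which kind of file you have.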
08-07-2015, 11:12 AM   #7
Heena Farooq
Junior Member
 
Location: J&K, INDIA

Join Date: Aug 2015
Posts: 7

OK, let me explain. I have to find mutated genes, and for that I need a dataset; that is why I am reading the previously mentioned paper (1916.full). I am trying to get one dataset that contains disease-causing genes and one that contains the whole human genome. After that I will find the mutated genes from these two. Now I have the dataset you mentioned above, but I am confused about which sequence I should use. You can also see 169156 contigs listed as the locus length of dataset AADC00000000. Above all, if you don't mind, can you give me some direction on how to carry out this process? I hope this is clearer now. Thanks.
08-14-2015, 06:41 AM   #8
Heena Farooq
Junior Member
 
Location: J&K, INDIA

Join Date: Aug 2015
Posts: 7

Please send me the algorithm for the GeneZilla gene finder. I will be thankful to you.