SEQanswers

Old 05-07-2013, 11:59 AM   #1
mruizm
Member
 
Location: Santiago

Join Date: Apr 2013
Posts: 22
Minimum amount of reads for de novo plant transcriptome assembly

I have Illumina short reads (2x150 bp), around 12,3509 Gb of data right now.
I am curious whether there is any parameter or formula for calculating the minimum amount of short-read data a transcriptome assembler needs in order to produce comprehensive transcripts,
e.g. that you must have at least 1 Mb of Illumina short reads in order to assemble.

Do we also need to consider the coverage and depth of the data when calculating the minimum amount of short reads required for transcriptome assembly?

Thank you!
Old 05-07-2013, 12:19 PM   #2
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381

Didn't you already post on this topic?

There are too many variables involved, and information is missing from your question.
Are you saying you have 12,3509 Gb data right now?
Why don't you just assemble it?

" calculate the minimum short read required to assemble a transcript sequence by transcriptome assembler program in order to obtain comprehensive transcript?"

This obviously depends on the species. Are you sequencing total RNA or poly A/T targeted?
This makes a big difference!

How many genes are in your species?
What is the ploidy?
Do you expect many paralogues?
What depth do you want to sequence?
Do you have a reference?
How many individuals do you want to sequence?

If I were de novo sequencing a species with no prior information, I would use longer reads from 454, and then fill in the gaps and depth with Illumina.

I don't understand how you expect to calculate the number of sequences you need to get a "good representation" of the transcriptome when you have no expectations.


There are ways to estimate genome size; this may give you some idea of how many genes to expect.

Why do you want to examine the transcriptome? What are your specific questions?
Old 05-08-2013, 12:56 PM   #3
mruizm
Member
 
Location: Santiago

Join Date: Apr 2013
Posts: 22

OK. First, what I'm studying is the de novo transcriptome of Aristotelia chilensis; there is no reference. For that I have five Illumina MiSeq sequencing runs, performed as follows:
Run 1: half-ripened and mature tissues of Aristotelia chilensis
Run 2: half-ripened and mature
Run 3: green, albino and leaf
Run 4: green, half-ripened, mature and albino
Run 5: green, half-ripened, mature and albino

So the total amount of data from all these sequencing runs is 12,3509 Gb.
What I'm trying to find out is the total number of reads required to generate a good assembly of a plant transcriptome!
Old 09-09-2013, 01:54 AM   #4
Blahah404
Member
 
Location: Cambridge, UK

Join Date: Dec 2011
Posts: 48

It's best not to post multiple threads asking the same question.

There is no specific number of reads that is enough. It depends on the structure and repetitiveness of the genome of your species, and many other factors you can't necessarily measure. The best thing is just to try assembling it.

For what it's worth, I've had excellent assemblies from 4GB of paired-end 100bp Illumina reads, and I've had terrible assemblies from 400GB of similar reads from a different species with a higher ploidy genome. The number of reads tells you nothing about how good the assembly will be.
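If you still want a back-of-the-envelope figure, the arithmetic for turning a data volume into read pairs and a nominal mean depth can be sketched as below. This is only a rough sketch under loud assumptions: "GB" is treated as gigabases of sequence (not file size), the function names are made up for illustration, and the 50 Mb transcriptome size is a hypothetical placeholder, not a value for any real species. Expression levels are very uneven, so a mean depth says little about lowly expressed transcripts and does not predict assembly quality.

```python
def read_pairs(total_bases: float, read_length: int, paired: bool = True) -> float:
    """Number of sequenced fragments (pairs, if paired-end) in a given yield."""
    # Each paired-end fragment contributes two reads of read_length bases.
    bases_per_fragment = read_length * (2 if paired else 1)
    return total_bases / bases_per_fragment


def nominal_depth(total_bases: float, transcriptome_size: float) -> float:
    """Mean depth if reads were spread evenly over the transcriptome.

    Real RNA-seq depth is highly skewed by expression level, so treat
    this as a crude average only.
    """
    return total_bases / transcriptome_size


if __name__ == "__main__":
    total = 4e9                  # assumption: 4 gigabases of sequence
    assumed_size = 50e6          # hypothetical transcriptome size in bases
    print(read_pairs(total, 100))              # read pairs for 2x100 bp data
    print(nominal_depth(total, assumed_size))  # crude mean depth
```

Swapping in your own yield, read length and a size estimate for your species gives a figure you can compare between runs, but the only real test is still to assemble the data and look at the result.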