How many contigs can one get after metagenome assembly?
-
Originally posted by BIOin: I want to assemble 25 million reads. I am getting varying results with different assemblers.
For a metagenome, the complexity can vary depending on your sample. If you had a very complex sample, 25M reads (platform? paired end? read length?) is probably barely scratching the surface -- 25M 2x100 Illumina reads is only 5Gb, which isn't gigantic if you have a diverse sample.
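As a rough back-of-the-envelope check on that figure, here is a minimal sketch. It assumes "25M" means 25 million read pairs at 2x100 bp; the variable names are only for illustration.

```python
# Rough sequencing-yield estimate.
# Assumption: "25M" means 25 million read pairs (2 reads per pair).
read_pairs = 25_000_000
reads_per_pair = 2
read_length_bp = 100

total_bases = read_pairs * reads_per_pair * read_length_bp
total_gb = total_bases / 1e9

print(f"{total_gb:.1f} Gb")  # 5.0 Gb
```

If the 25M figure counts individual reads rather than pairs, the yield is half of that.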
-
Thanks for the reply.
Yes, my data is complex (animal rumen). My data set is 25M Illumina HiSeq 2000 2x100 reads.
I just started using MetaVelvet to assemble high-quality metagenome data. I ran MetaVelvet with a k-mer of 45. The output file "meta-velvetg.contigs.fa" contains 1,128,469 contigs, with a maximum contig length of 31,758 bp and an N50 of 190.
Should I accept this assembly, or do I need to try more k-mer values?
Please suggest which assemblers to use.
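For comparing runs at different k-mer values, it can help to compute the same summary numbers (contig count, maximum length, N50) directly from each contigs file. The following is a minimal sketch in plain Python, not part of MetaVelvet; the function names `read_contig_lengths` and `n50` are hypothetical helpers, and N50 is taken here as the length of the contig at which the running total, counted from the longest contig down, first reaches half of the assembled bases.

```python
def read_contig_lengths(fasta_path):
    """Return the length of every sequence in a FASTA file."""
    lengths = []
    current = 0
    seen_header = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new record starts: store the previous one, if any.
                if seen_header:
                    lengths.append(current)
                seen_header = True
                current = 0
            elif seen_header:
                current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length of the contig at which the cumulative sum, taken from the
    longest contig down, first reaches half of the total bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0  # empty input


# Example usage (file name taken from the post above):
# lengths = read_contig_lengths("meta-velvetg.contigs.fa")
# print(len(lengths), max(lengths), n50(lengths))
```

Running this on the contigs file from each k-mer setting gives numbers that can be compared side by side.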
-
I also tried to assemble a metagenome (plant endophyte; the plant genome is not available yet) using Illumina HiSeq 2000 2x100 reads. My data has 69M paired-end reads, 9.9 billion bases. I assembled these reads using CLC Genomics Workbench and got 770 thousand contigs. I am working on these contigs now. How do you deal with so many contigs? Could we share our ideas?
Originally posted by BIOin: Thanks for the reply.
Yes, my data is complex (animal rumen). My data set is 25M Illumina HiSeq 2000 2x100 reads.
I just started using MetaVelvet to assemble high-quality metagenome data. I ran MetaVelvet with a k-mer of 45. The output file "meta-velvetg.contigs.fa" contains 1,128,469 contigs, with a maximum contig length of 31,758 bp and an N50 of 190.
Should I accept this assembly, or do I need to try more k-mer values?
Please suggest which assemblers to use.
-
For assembling paired-end-only Illumina data, I like to use the ABySS assembler. But if the metagenome is too complex, I agree with the previous post that both 25M and 69M reads just scratch the surface. Using different assemblers won't make a significant difference in the number of contigs or the N50.
If the purpose is just to recover genes from the metagenome, paired-end-only Illumina data is useful for uncovering genes, except for those affected by strain variation. But to improve the integrity of the assembly dramatically (increase the N50), mate-pair data with long inserts can significantly improve scaffolding. Programs that close gaps within scaffolds can improve the assembly further.
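If the aim is gene recovery, one simple way to see how much of a fragmented assembly is usable is to count the contigs above a minimum length. This is only a sketch: the function name `summarize_long_contigs` and the 1000 bp default cutoff are assumptions chosen for illustration, not values recommended anywhere in this thread.

```python
def summarize_long_contigs(lengths, min_length=1000):
    """Summarize the contigs that are at least min_length bases long.

    lengths: list of contig lengths in bp.
    min_length: illustrative cutoff (an assumption, adjust to your needs).
    """
    kept = [n for n in lengths if n >= min_length]
    total = sum(lengths)
    bases_kept = sum(kept)
    return {
        "contigs_kept": len(kept),
        "bases_kept": bases_kept,
        # Share of all assembled bases that sit in the kept contigs.
        "fraction_of_assembly": bases_kept / total if total else 0.0,
    }


# Example with made-up contig lengths:
# summarize_long_contigs([190, 500, 1000, 31758])
```

A low fraction here is consistent with the point above that more data or long-insert libraries, rather than a different assembler, is what changes the picture.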