
**SDPA_Pet** · 03-08-2017, 07:27 AM

Originally posted by Markiyan:

Clustering means bridge amplification for pre-ExAmp (non-patterned) flowcells: in situ PCR on the lawn of oligos on the flow cell surface. It follows similar rules to regular PCR, except that the product stays in situ, forming a forest of DNA strands.

For ExAmp chemistry (patterned flowcells), clustering means cluster formation by isothermal amplification.
(In theory only in the occupied nanowell; in practice, especially at low loading concentrations, a few neighbours may join in too...)

Have a read about ExAmp and the HiSeq 4000:
http://core-genomics.blogspot.co.uk/...d-to-know.html

Thank you. I did my sequencing on the older HiSeq 2500 platform.

**fanli** · 03-08-2017, 08:33 AM

Out of curiosity, why are you joining the read pairs? A lot of the metagenomics software out there now supports paired end reads as input. The metaSPAdes assembler @GenoMax mentioned requires paired end data IIRC.

**SDPA_Pet** · 03-08-2017, 08:37 AM

Originally posted by fanli:

Out of curiosity, why are you joining the read pairs? A lot of the metagenomics software out there now supports paired end reads as input. The metaSPAdes assembler @GenoMax mentioned requires paired end data IIRC.

Hey, I did try metaSPAdes, but less than 1% of the total reads assembled. A lot of people use an alternative approach: they join the paired ends to get longer reads, without assembling them, and then use the long merged reads for BLAST or other annotation.
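For readers unfamiliar with pair joining, here is a minimal sketch of the idea, not the algorithm of any particular merging tool. It assumes the two reads come from opposite ends of the same fragment, reverse-complements the second read, and looks for the longest suffix/prefix overlap within a mismatch tolerance. The function names and the `min_overlap` and `max_mismatch_frac` defaults are illustrative assumptions; real mergers also use base qualities.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T, N only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(comp[base] for base in reversed(seq))


def merge_pair(fwd: str, rev: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1):
    """Join a read pair into one sequence, or return None if no overlap is found.

    Hypothetical helper: tries the longest candidate overlap first and accepts
    the first one whose mismatch count is within the allowed fraction.
    """
    rev_rc = reverse_complement(rev)
    for olen in range(min(len(fwd), len(rev_rc)), min_overlap - 1, -1):
        tail = fwd[-olen:]      # end of the forward read
        head = rev_rc[:olen]    # start of the reverse-complemented mate
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep the forward read and append the non-overlapping remainder.
            return fwd + rev_rc[olen:]
    return None  # pair cannot be joined; a merging step would drop it


# Toy 30 bp fragment sequenced with 20 bp reads, so the mates overlap by 10 bp.
fragment = "ACGTTGCAAGGCTTAACCGGATCGATTACG"
read1 = fragment[:20]
read2 = reverse_complement(fragment[-20:])
merged = merge_pair(read1, read2)
```

Pairs for which `merge_pair` returns `None` are the ones a join-only workflow loses.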

**fanli** · 03-08-2017, 08:39 AM

Would something like kraken or CLARK not be helpful? Are you trying to assemble and annotate de novo genomes? Or trying to figure out the microbial composition and functional content? I guess my point is you would discard ~40% of your data in the joining process, which may not be necessary depending on your task of interest.
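Why joining discards data can be sketched with simple arithmetic: two reads of length R from a fragment of length F overlap by 2R - F bases, so fragments longer than about 2R minus the minimum overlap cannot be joined. The fragment lengths below are made up purely for illustration, and the 10 bp minimum overlap is an assumption rather than any tool's default.

```python
def overlap_length(read_len: int, fragment_len: int) -> int:
    """Bases shared by the two reads of a pair; negative means a gap between them.

    For fragments shorter than a read, the overlap is capped at the fragment length.
    """
    return min(fragment_len, 2 * read_len - fragment_len)


def fraction_mergeable(fragment_lens, read_len: int = 150, min_overlap: int = 10) -> float:
    """Fraction of pairs whose reads overlap by at least min_overlap bases."""
    mergeable = sum(
        1 for f in fragment_lens if overlap_length(read_len, f) >= min_overlap
    )
    return mergeable / len(fragment_lens)


# Hypothetical fragment lengths (bp), not measured from any real library.
fragment_lens = [220, 250, 280, 310, 350]
kept = fraction_mergeable(fragment_lens)  # 3 of the 5 pairs can be joined
print(f"mergeable: {kept:.0%}, discarded: {1 - kept:.0%}")
```

With 150 bp reads, the two longest hypothetical fragments leave a gap between the mates, so those pairs would be lost in a join-only workflow.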

**SDPA_Pet** · 03-08-2017, 08:43 AM

Originally posted by fanli:

Would something like kraken or CLARK not be helpful? Are you trying to assemble and annotate de novo genomes? Or trying to figure out the microbial composition and functional content? I guess my point is you would discard ~40% of your data in the joining process, which may not be necessary depending on your task of interest.

I am not interested in a specific genome in the soil community. Basically, I am just interested in the microbial composition and functional content. There is another common approach: people don't assemble or merge pairs and just BLAST the raw reads. However, I think BLAST with ~150 bp reads performs worse. Some publications show that intermediate-length sequences (merged pairs) work better than longer assembled contigs or unassembled short reads for the question I am asking.

**fanli** · 03-08-2017, 08:49 AM

You might find this benchmark to be helpful:


http://www.nature.com/articles/srep19233

My understanding is that you are better off using newer k-mer based approaches as opposed to BLAST. I've had reasonable success with kraken, although the memory requirements are somewhat onerous. You also have to be extremely careful about removing contaminant (aka human) sequences as these tend to get misclassified.

Another option is kallisto (https://github.com/pachterlab/metakallisto) but I have yet to be able to even build a database due to memory constraints.
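To illustrate what "k-mer based" means here, the following is a toy sketch of exact k-mer matching against a reference index. It is not the actual implementation of kraken, CLARK or kallisto: the reference sequences, taxon labels and `k = 5` are made up, and real tools use much longer k-mers and a taxonomy-aware lookup.

```python
from collections import Counter


def kmers(seq: str, k: int) -> set:
    """All distinct substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def build_index(references: dict, k: int) -> dict:
    """Map each k-mer to the set of reference labels that contain it."""
    index = {}
    for label, seq in references.items():
        for km in kmers(seq, k):
            index.setdefault(km, set()).add(label)
    return index


def classify(read: str, index: dict, k: int):
    """Assign a read to the label with the most label-specific k-mer hits."""
    votes = Counter()
    for km in kmers(read, k):
        labels = index.get(km, ())
        if len(labels) == 1:  # ignore k-mers shared between references
            votes[next(iter(labels))] += 1
    if not votes:
        return None  # unclassified
    return votes.most_common(1)[0][0]


# Hypothetical references; real databases hold whole genomes.
references = {
    "taxon_A": "ACGTACGTTAGCCGATAGGCT",
    "taxon_B": "TTGACCAGTCAAGGTCCATGA",
}
index = build_index(references, k=5)
label = classify("TAGCCGATAG", index, k=5)
```

Because the lookup is an exact dictionary hit per k-mer, no alignment is computed; the cost moves into holding the index, which in this sketch sits entirely in memory.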

**SDPA_Pet** · 03-08-2017, 08:51 AM

Originally posted by fanli:

You might find this benchmark to be helpful:


http://www.nature.com/articles/srep19233

My understanding is that you are better off using newer k-mer based approaches as opposed to BLAST. I've had reasonable success with kraken, although the memory requirements are somewhat onerous. You also have to be extremely careful about removing contaminant (aka human) sequences as these tend to get misclassified.

Another option is kallisto (https://github.com/pachterlab/metakallisto) but I have yet to be able to even build a database due to memory constraints.

I will try. Can you gimme the Kraken link? Thanks.
