**greigite** · 09-09-2010, 09:22 AM

If you decide to assemble your reads take a look at this recent paper presenting an open source pipeline:

Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, et al. (2010) Unlocking Short Read Sequencing for Metagenomics. PLoS ONE 5: e11840. Available: http://dx.plos.org/10.1371/journal.pone.0011840.

Abstract
Background: Different high-throughput nucleic acid sequencing platforms are currently available but a trade-off currently exists between the cost and number of reads that can be generated versus the read length that can be achieved.
Methodology/Principal Findings: We describe an experimental and computational pipeline yielding millions of reads that can exceed 200 bp with quality scores approaching that of traditional Sanger sequencing. The method combines an automatable gel-less library construction step with paired-end sequencing on a short-read instrument. With appropriately sized library inserts, mate-pair sequences can overlap, and we describe the SHERA software package that joins them to form a longer composite read.
Conclusions/Significance: This strategy is broadly applicable to sequencing applications that benefit from low-cost high-throughput sequencing, but require longer read lengths. We demonstrate that our approach enables metagenomic analyses using the Illumina Genome Analyzer, with low error rates, and at a fraction of the cost of pyrosequencing.
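
To make the overlap-joining idea in the abstract concrete, here is a minimal Python sketch. It is not SHERA's algorithm (the thread does not describe it); the function names, the `min_overlap` and `max_mismatch_frac` parameters, and their default values are all assumptions for illustration. It ignores quality scores and does not handle inserts shorter than the read length.

```python
# Illustrative only: a naive overlap joiner, not the method used by SHERA,
# stitch, or SeqPrep. All names and thresholds here are assumptions.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10, max_mismatch_frac=0.1):
    """Join read1 with the reverse complement of read2 if their ends overlap.

    Tries the longest candidate overlap first and accepts the first one whose
    mismatch count is within max_mismatch_frac of the overlap length.
    Returns the composite read, or None when no acceptable overlap exists.
    """
    mate = reverse_complement(read2)
    longest = min(len(read1), len(mate))
    for overlap in range(longest, min_overlap - 1, -1):
        tail = read1[-overlap:]   # 3' end of read 1
        head = mate[:overlap]     # start of the reoriented read 2
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * overlap:
            # Keep read 1's bases in the overlap; a real tool would
            # normally pick the base with the better quality score.
            return read1 + mate[overlap:]
    return None
```

For a 40 bp fragment sequenced as two 25 bp reads, the two reads share 10 bp and `merge_pair` would be expected to return the full 40 bp fragment; pairs with no acceptable overlap return `None`.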

**FredOnSeq** · 09-09-2010, 10:54 AM

Originally posted by greigite:

If you decide to assemble your reads take a look at this recent paper presenting an open source pipeline:

Rodrigue S, Materna AC, Timberlake SC, Blackburn MC, Malmstrom RR, et al. (2010) Unlocking Short Read Sequencing for Metagenomics. PLoS ONE 5: e11840. Available: http://dx.plos.org/10.1371/journal.pone.0011840.

Thank you very much for your answer.
Today I tried the program stitch to merge my paired-end reads, and it worked fine. Nevertheless, I will test SHERA on my data tomorrow and compare the performance.
Fred

**jmartin127** · 01-05-2011, 02:37 PM

FredOnSeq, would you mind pointing me to where you found the "stitch" software? I can't seem to find it by Googling. Thanks in advance.

**jstjohn** · 03-12-2011, 10:41 PM

Stitch is available here:

GitHub - audy/stitch: Overlap assembler of paired-end DNA sequences generated by Illumina

http://github.com/audy/stitch

On the other hand, I have a working draft of a C program that does adapter stripping and paired-end merging, similar to stitch and SHERA, for large, potentially gzipped datasets. In my testing it processes somewhere around 20M 2x100 bp pairs per hour. It's available here if anyone is interested:

GitHub - jstjohn/SeqPrep: Tool for stripping adaptors and/or merging paired reads with overlap into single reads.

https://github.com/jstjohn/SeqPrep

I don't have correctness statistics available, but the program can copy a subset of the merged reads into a human-readable aligned format so you can sanity check the settings. The defaults seem to work well with my data.

**greigite** · 03-23-2011, 01:30 PM

I have had some trouble installing Stitch. If anyone has run it successfully, could you point me in the right direction? The error is as follows:
    > python setup.py install
    Traceback (most recent call last):
      File "setup.py", line 9, in <module>
        setup(
    NameError: name 'setup' is not defined
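
A note on the traceback above: a `NameError` on `setup` means that name was never bound by the time line 9 ran, which often points to a missing or failed import near the top of `setup.py`. The contents of Stitch's `setup.py` are not shown in this thread, so the following is only a generic diagnostic sketch; `find_setup_provider` is a hypothetical helper and not part of Stitch.

```python
import importlib.util


def find_setup_provider():
    """Return the first importable module that provides setup(), or None.

    Hypothetical helper: it only reports which packaging module this
    Python installation can import. It does not inspect Stitch itself.
    """
    for module_name in ("setuptools", "distutils.core"):
        try:
            if importlib.util.find_spec(module_name) is not None:
                return module_name
        except ModuleNotFoundError:
            # Raised when the parent package (e.g. distutils) is absent.
            continue
    return None


print(find_setup_provider())
```

If this reports a module, one thing to try is making sure `setup.py` imports `setup` from it (for example `from setuptools import setup`) before calling it. Whether that is the actual cause here is an assumption.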

I'm also having some trouble with SeqPrep. Is there any reason why the merged file produced with option 's' should look like a binary file (it dumps a bunch of garbage onto the screen when I open it on the command line)?

**jstjohn** · 04-18-2011, 05:19 PM

For now SeqPrep outputs gzipped files regardless of the name you give the output. If it's just the Phred scores that look weird, that could be because your qualities are ASCII Phred+64 and you didn't supply -6 as a command line argument.
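
Both points can be checked with a short Python sketch. The helper names are hypothetical and the quality-offset thresholds are a common heuristic rather than anything taken from SeqPrep: gzip data starts with the two bytes `1f 8b` whatever the file is called, and quality characters below ASCII 59 cannot occur in Phred+64 data.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def is_gzipped(path):
    """Detect gzip by content rather than by file extension."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if is_gzipped(path):
        return gzip.open(path, "rt")
    return open(path, "r")


def guess_phred_offset(quality_strings):
    """Heuristically guess the ASCII offset (33 or 64) of FASTQ qualities.

    Returns None when the observed characters fit either encoding.
    The cutoffs 59 and 74 are assumptions based on typical Illumina output.
    """
    codes = [ord(ch) for qual in quality_strings for ch in qual]
    if not codes:
        return None
    if min(codes) < 59:
        return 33  # too low to be Phred+64
    if max(codes) > 74:
        return 64  # higher than typical Phred+33 data reaches
    return None
```

On the command line, `gzip -dc < merged_output | head` (with `merged_output` standing in for whatever name was passed to SeqPrep) shows the first records without relying on a `.gz` suffix.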

Questions about overlapping paired-end reads...