There are a number of hybrid de novo assembly threads, but I'm not sure I'm finding answers to these questions, which are specifically about paired-end reads from two platforms (Illumina and 454).
454 'paired-end' reads are really mate pairs from the two ends of a 3-20 kb fragment. The ends are swapped right to left within the read, both remain in 5'-3' orientation (thanks to circularization), and they are separated by a linker sequence.
5' right>-----[linker]left>------- 3'
During assembly, Newbler splits these reads and also does quality trimming. It 'knows' that X bp should occur between the two half-reads.
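To make the read layout above concrete, here is a minimal Python sketch of the splitting step. It is not how Newbler does it internally; it only does an exact string match, whereas real tools tolerate sequencing errors in the linker. The `LINKER` value is a made-up placeholder and `split_454_paired_read` is a hypothetical helper name; substitute the linker for your own library chemistry.

```python
# Hypothetical linker, for illustration only. Replace with the linker
# sequence that matches your 454 paired-end library chemistry.
LINKER = "ACGTTGCAACGT"


def split_454_paired_read(read, linker=LINKER):
    """Split a 454 paired-end read around the first exact linker match.

    Returns (right_half, left_half), in the order they appear in the read,
    or None if the linker is not found (the read can then be treated as
    single-end).
    """
    idx = read.find(linker)
    if idx == -1:
        return None
    right_half = read[:idx]                 # sequence before the linker
    left_half = read[idx + len(linker):]    # sequence after the linker
    return right_half, left_half


if __name__ == "__main__":
    example = "GGGGCCCC" + LINKER + "AAAATTTT"
    print(split_454_paired_read(example))   # ('GGGGCCCC', 'AAAATTTT')
```

Reads where the linker sits at the very start or end would give an empty half here; a real pipeline would presumably discard halves below some minimum length.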
Illumina paired ends are separate reads from each end of a ~350-600 bp fragment:
read 1
5' -- ~100 nt of left -- 3'
read 2
5' -- ~100 nt of reverse complement of right -- 3'
These reads may require some adapter and quality trimming as well.
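As a sanity check on the Illumina orientation described above, this small Python sketch derives the two reads from a fragment: read 1 is the left end as-is, read 2 is the reverse complement of the right end. The function names and the toy fragment are illustrative assumptions, not part of any assembler.

```python
# Translation table for complementing DNA (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def illumina_pair(fragment, read_len=100):
    """Simulate an idealized, error-free Illumina pair from one fragment.

    read_len must be between 1 and len(fragment).
    """
    if not 1 <= read_len <= len(fragment):
        raise ValueError("read_len must be between 1 and the fragment length")
    read1 = fragment[:read_len]                        # left end, forward strand
    read2 = reverse_complement(fragment[-read_len:])   # right end, reverse strand
    return read1, read2


if __name__ == "__main__":
    r1, r2 = illumina_pair("AACCGGTTAC", read_len=4)
    print(r1, r2)   # AACC GTAA
```

The two reads point toward each other, which is the "innie" layout that paired-end-aware assemblers expect when given an insert size of X bp.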
So, can Newbler (I have v3.0) 'understand' Illumina paired-end reads (i.e. know that they represent a span of X bp)? Can it trim adapters and low-quality bases?
Alternatively, are there assemblers that 'understand' Illumina paired ends but can also understand 454 PE reads? That is, do they know that the read must be split, the linker discarded, and a spacing of X bp assumed between the halves? Can any of them also quality-trim SFF files?
I have raw FASTQ and SFF reads, as well as trimmed versions of the same FASTQ and SFF reads. I also have (raw or trimmed) single-end SFF and FASTQ sets to use in the assembly (most of the data are single-end). I want to know which assemblers need which input in order to fully exploit the hybrid paired-end information (e.g. for making better scaffolds), with the least preprocessing on my part.