**Brian Bushnell** · 11-20-2014, 10:09 AM

Oh, sorry, the command I gave you was for interleaved reads, so those are all false positive merges. For pairs in separate files, the command would be:

bbmerge.sh in1=R1.fq in2=R2.fastq ihist=ihist.txt reads=100000

As for whether to merge the reads before assembly, that depends on the assembler. ALLPATHS-LG does its own merging; Ray seems to do well with merged reads; SOAPdenovo creates worse assemblies with merged reads. There's no single rule. I've never used Geneious, so I don't know if merging would help there. As with many options, such as trimming and subsampling, sometimes the only way to get the best assembly is to try both ways.

If you do merge the reads, I suggest using the default settings:

bbmerge.sh in1=r1.fq in2=r2.fq out=merged.fq outu1=unmerged1.fq outu2=unmerged2.fq

...then feed the assembler both the merged and unmerged reads. Many or most assemblers will accept paired and unpaired reads together. Merging should not be done for an assembler that can't take both at once: low-complexity genomic regions do not merge as well, so they would be underrepresented if you supplied only the merged reads.
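As a sanity check after merging, the fraction joined can be recomputed from the output files. This is only a minimal sketch, not part of BBMerge: the helper names `count_fastq_records` and `fraction_joined` are hypothetical, and it assumes uncompressed FASTQ with exactly four lines per record.

```python
def count_fastq_records(lines):
    """Count records in an iterable of FASTQ lines.

    Assumption: plain 4-line records (no wrapped sequences); blank lines are ignored.
    """
    n_lines = sum(1 for line in lines if line.strip())
    if n_lines % 4 != 0:
        raise ValueError(f"{n_lines} non-blank lines is not a multiple of 4")
    return n_lines // 4


def fraction_joined(n_merged, n_unmerged_pairs):
    """Fraction of input pairs that were merged.

    Each merged read and each unmerged pair corresponds to one input pair.
    """
    total_pairs = n_merged + n_unmerged_pairs
    if total_pairs == 0:
        raise ValueError("no read pairs found")
    return n_merged / total_pairs


# Usage with the file names from the command above (left commented out so the
# snippet does not require the files to exist):
# with open("merged.fq") as m, open("unmerged1.fq") as u1:
#     print(fraction_joined(count_fastq_records(m), count_fastq_records(u1)))
```

Counting only `unmerged1.fq` is enough for the pair count, since `unmerged2.fq` should hold the mates of the same pairs.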

**Marisa_Miller** · 11-20-2014, 12:09 PM

Originally posted by Brian Bushnell

The fraction joined and the position of the peak in the graph will make it clear what the real distribution is like. If the graph is still rising then abruptly drops to zero just before 2x(read length) then the insert sizes are generally too long for merging.

Hi Brian,
I re-ran BBMerge with the correct command, and the distribution seems to show that the insert sizes are too big for merging. However, the fraction joined is around 60% for most libraries, and I'm not sure whether that is good or bad.

Attached Files

ihist_1.txt (4.1 KB, 6 views)

**Brian Bushnell** · 11-20-2014, 12:24 PM

For a 2x300bp library, getting 60% merging and a median of 400bp is close to optimal for merging, actually. Again, whether merging is a good idea depends on the assembler, but this is a good library. You never see 90%+ merging unless the insert sizes came out way too short.
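To make that reading of the histogram concrete, here is a rough sketch that summarizes an insert-size histogram such as `ihist.txt`. It assumes the file has `#`-prefixed header lines followed by whitespace-separated `insert_size count` rows. The function names and the 10% "near the limit" threshold are illustrative choices of mine, not BBMerge behavior. Keep in mind the histogram only covers pairs that merged, so it understates the true insert sizes of the library.

```python
def read_ihist(lines):
    """Parse (insert_size, count) rows, skipping blank and '#' header lines."""
    hist = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        size, count = line.split()[:2]
        hist.append((int(size), int(count)))
    return sorted(hist)


def summarize(hist, read_length):
    """Median and mode of merged insert sizes, plus overlap at the median."""
    total = sum(count for _, count in hist)
    if total == 0:
        raise ValueError("histogram is empty")
    running = 0
    median = None
    for size, count in hist:
        running += count
        if running * 2 >= total:  # first size where the cumulative count reaches half
            median = size
            break
    mode = max(hist, key=lambda row: row[1])[0]
    max_insert = 2 * read_length  # reads cannot overlap beyond this insert size
    return {
        "median": median,
        "mode": mode,
        # Overlap is 2L - insert for inserts of at least L, else the insert itself.
        "overlap_at_median": min(median, max_insert - median),
        # Hypothetical heuristic: a peak within 10% of 2L suggests the
        # distribution was still rising when it hit the merge limit.
        "peak_near_limit": mode >= 0.9 * max_insert,
    }
```

For example, a 2x300bp library with a median merged insert of 400bp gives an overlap of 2 x 300 - 400 = 200bp at the median, which leaves plenty of room for reliable merging.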

Attached Files

ihist.png (11.3 KB, 159 views)

**Marisa_Miller** · 11-20-2014, 12:27 PM

Originally posted by Brian Bushnell

For a 2x300bp library, getting 60% merging and a median of 400bp is close to optimal for merging, actually. Again, whether merging is a good idea depends on the assembler, but this is a good library. You never see 90%+ merging unless the insert sizes came out way too short.

I think I misunderstood earlier what to look for in a library to see if it can be merged. I will go ahead and merge them and give the assembly a shot with both the merged and unmerged reads. Thanks again!
