05-12-2021, 06:10 AM   #682

Originally Posted by pck0 View Post
Hey, I did a little test run mapping reads against a reference FASTA that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The number of reads mapping to each was as follows:

1st run:

>one: 47,699
>two: 330

2nd run:

>three: 47,688
>four: 338

BBMap options were all default, except minidentity = 90 and T=12.

Why does BBMap (1) apparently miss some reads that map to the first sequence and instead map them to the second, identical sequence, and (2) give different results between runs?

Just curious, my apologies if this was addressed elsewhere!

Most NGS aligners are non-deterministic, i.e., they will not necessarily produce exactly identical results when run multiple times.

Fortunately, BBMap does have an option to run in deterministic mode. It is off by default (the "f" in the help text below), so you enable it with deterministic=t:
deterministic=f         Run in deterministic mode.  In this case it is good
                        to set averagepairdist.  BBMap is deterministic
                        without this flag if using single-ended reads,
                        or run singlethreaded.
You could also run the analysis using just a single thread.
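
To make the two options concrete, here is a minimal sketch that only assembles the two command lines (it does not run BBMap). The file names reads.fq, ref.fa and the output names are placeholders, and the helper bbmap_command is hypothetical. I am also assuming the usual key=value spellings in=, ref=, out=, minid= and threads=, and that minid is given as a fraction, so please check them against the bbmap.sh help text for your version. For paired reads, the help text above suggests setting averagepairdist as well, which is left out here.

```python
import shlex


def bbmap_command(reads, ref, out, deterministic=True, threads=12, minid=0.90):
    """Build a bbmap.sh argument list (hypothetical helper; file names are placeholders)."""
    cmd = [
        "bbmap.sh",
        f"in={reads}",
        f"ref={ref}",
        f"out={out}",
        f"minid={minid}",       # assumed to be a fraction, not a percentage
        f"threads={threads}",
    ]
    if deterministic:
        # Deterministic mode is off by default, so it has to be switched on.
        cmd.append("deterministic=t")
    return cmd


# Option 1: keep 12 threads, but ask for deterministic mode.
det_cmd = bbmap_command("reads.fq", "ref.fa", "mapped_det.sam")

# Option 2: no deterministic flag, but only a single thread.
single_cmd = bbmap_command(
    "reads.fq", "ref.fa", "mapped_t1.sam", deterministic=False, threads=1
)

print(shlex.join(det_cmd))
print(shlex.join(single_cmd))
```

Running either command twice and comparing the per-sequence read counts should show whether the run-to-run difference goes away.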
GenoMax