05-05-2015, 10:58 AM   #197
Brian Bushnell

It sounds like you don't have any "unsequenced part" in this library. If the insert size is shorter than the read length, the library was fragmented into pieces that were too small, so the sequencer reads off the end of the genomic sequence and into the adapter sequence. Before using those reads, you should do adapter trimming with (for example) BBDuk.
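
To make the read-through concrete, here is a toy Python sketch. It is not what BBDuk does internally; the function names, the 20 bp insert, and the 25 bp read length are made up for illustration, and the adapter string is just a short stand-in. It only shows that when the insert is shorter than the read, the tail of the read is adapter rather than genome.

```python
def simulate_read(insert, adapter, read_len):
    """Toy model: the sequencer reads the insert, then continues into the adapter."""
    template = insert + adapter
    return template[:read_len]


def adapter_bases(insert_len, read_len):
    """Number of bases at the end of a read that come from the adapter."""
    return max(0, read_len - insert_len)


# Hypothetical example values (assumptions, not from any real library).
insert = "ACGTTGCAAGGCTTAACCGG"   # 20 bp genomic fragment
adapter = "AGATCGGAAGAGC"         # short stand-in adapter sequence
read = simulate_read(insert, adapter, read_len=25)

print(read)                              # genomic bases followed by adapter bases
print(adapter_bases(len(insert), 25))    # bases that a trimmer should remove
```

Those trailing adapter bases do not match the reference, which is why they should be trimmed before mapping.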

As for the minimum size being 1bp, that is probably a mis-mapped read pair. You can get a second opinion on the insert size distribution with BBMerge, which calculates it by looking for overlaps between the two reads instead of by mapping, but I expect the result to be the same in this case. BBMap is more accurate for insert size calculations with long inserts (where the reads don't overlap), while BBMerge is more accurate for insert sizes shorter than read length, because the adapter sequence interferes with mapping.
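
As a rough illustration of the overlap idea (a minimal sketch, not BBMerge's actual algorithm, which tolerates mismatches and uses quality scores), the Python below slides the reverse complement of read 2 along read 1 and looks for an exact overlap. The function names, the `min_overlap` parameter, and the test sequences are all assumptions for the example.

```python
def revcomp(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def insert_size_by_overlap(read1, read2, min_overlap=8):
    """Estimate insert size from an exact overlap between read1 and revcomp(read2).

    Returns None if no overlap of at least min_overlap bases is found.
    A negative offset means the insert is shorter than the read length,
    i.e. both reads ran into adapter.
    """
    r2 = revcomp(read2)
    n1, n2 = len(read1), len(r2)
    best = None  # (overlap_length, insert_size)
    for offset in range(-(n2 - min_overlap), n1 - min_overlap + 1):
        start1 = max(0, offset)    # first overlapping base in read1
        start2 = max(0, -offset)   # first overlapping base in r2
        length = min(n1 - start1, n2 - start2)
        if length < min_overlap:
            continue
        if read1[start1:start1 + length] == r2[start2:start2 + length]:
            if best is None or length > best[0]:
                best = (length, offset + n2)
    return None if best is None else best[1]


# Hypothetical 40 bp fragment sequenced with 25 bp reads (reads overlap by 10 bp).
fragment = "ACGTTGCAAGGCTTAACCGGATCGTACCTGAGTCAATGCC"
long_r1 = fragment[:25]
long_r2 = revcomp(fragment)[:25]
print(insert_size_by_overlap(long_r1, long_r2))

# Hypothetical 20 bp fragment: both 25 bp reads run 5 bases into the adapter.
short_frag = "ACGTTGCAAGGCTTAACCGG"
adapter = "AGATCGGAAGAGC"  # same stand-in adapter on both ends, for simplicity
short_r1 = (short_frag + adapter)[:25]
short_r2 = (revcomp(short_frag) + adapter)[:25]
print(insert_size_by_overlap(short_r1, short_r2))
```

The second case is the one relevant here: the overlap still pins down the insert size even though the adapter tails would hurt mapping.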