Hedi86 01-30-2020 06:47 AM

length of uniquely mapped seq in Bismark
Dear all

Bismark reports the number of alignments with a unique best hit. Is there a way to find out the length of those uniquely aligned sequences? For instance, I have an RRBS library whose reads are 20 - 140 bp long after trimming, and it gave me about 40% unique mapping in Bismark. I would like to know the length of the fragments that mapped uniquely.

Best regards

fkrueger 01-31-2020 01:54 AM

Hi Hedi,

Every alignment that Bismark reports is a unique hit (ambiguously aligning reads are discarded, and non-aligning reads, well, don't align at all). So you can simply look at the aligned read length distribution using your program of choice. It would only be two clicks in SeqMonk, for example. As a general rule though, longer reads tend to align better.
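If you would rather script it than use a GUI, here is a minimal sketch of the same idea in Python. It assumes the Bismark BAM has been converted to SAM text (for example with `samtools view`) and simply counts the length of the SEQ column (column 10 in the SAM format). The function name and the example records are made up for illustration; they are not part of Bismark.

```python
from collections import Counter


def read_length_histogram(sam_lines):
    """Count how many aligned reads there are of each length.

    sam_lines: any iterable of SAM-formatted text lines.
    """
    lengths = Counter()
    for line in sam_lines:
        if line.startswith("@"):        # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:            # not a complete alignment record
            continue
        seq = fields[9]                 # column 10 = SEQ (the read sequence)
        if seq == "*":                  # sequence not stored in this record
            continue
        lengths[len(seq)] += 1
    return lengths


# Hypothetical records, only here to show the expected input format.
example_sam = [
    "@HD\tVN:1.0\tSO:unsorted",
    "read1\t0\tchr1\t100\t42\t5M\t*\t0\t0\tACGTA\tIIIII",
    "read2\t16\tchr1\t300\t42\t3M\t*\t0\t0\tACG\tIII",
    "read3\t0\tchr2\t50\t42\t5M\t*\t0\t0\tTTTTT\tIIIII",
]

histogram = read_length_histogram(example_sam)
for length, count in sorted(histogram.items()):
    print(f"{length}\t{count}")
```

On real data you could pass `sys.stdin` to the function and pipe the output of `samtools view` into the script. Since Bismark only reports unique alignments, every record counted this way is a uniquely mapped read.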

A mapping efficiency of 40% doesn't sound great, to be honest. Which organism are you working with, and did you perform appropriate adapter/quality trimming etc.?

Hedi86 01-31-2020 04:57 AM

Thank you, Felix.

So I just imported the BAM files generated by Bismark into SeqMonk, and without any quantification I can open the read length distribution histogram for each data set. But the trimmed files should only contain reads of 20 - 140 bp, so why do I still see sequences of, for example, 400 bp in the BAM files?

We are working with bull data, and yes, we did pay attention to everything when it comes to trimming and quality assessment. Unfortunately, this is still what we get.

I appreciate your help.

fkrueger 01-31-2020 05:00 AM

That sounds like you are working with paired-end data. In that case Read 1 and Read 2 are the two ends of the fragment you are sequencing, and they may well be further apart than the length of a single read. For paired-end data it doesn't really matter how long each individual read is; it is the combined mappability of R1 and R2 that counts.
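To make the difference between read length and fragment length concrete, here is a rough Python sketch. It assumes SAM text input (e.g. from `samtools view`) and that the TLEN column (column 9, the observed template length) is filled in for paired-end records. The function name and the example pair are invented for illustration.

```python
def read_and_fragment_lengths(sam_lines):
    """Yield (read_name, read_length, fragment_length) per alignment record.

    fragment_length is None for single-end records or when TLEN is 0.
    """
    for line in sam_lines:
        if line.startswith("@"):            # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:                # not a complete alignment record
            continue
        flag = int(fields[1])               # column 2 = bitwise FLAG
        tlen = int(fields[8])               # column 9 = TLEN (signed)
        seq = fields[9]                     # column 10 = SEQ
        is_paired = bool(flag & 0x1)        # bit 0x1: read is part of a pair
        fragment = abs(tlen) if is_paired and tlen != 0 else None
        yield fields[0], len(seq), fragment


# Hypothetical pair: two 100 bp reads at the ends of a 400 bp fragment.
example_pair = [
    "frag1\t99\tchr1\t1000\t42\t100M\t=\t1300\t400\t" + "A" * 100 + "\t" + "I" * 100,
    "frag1\t147\tchr1\t1300\t42\t100M\t=\t1000\t-400\t" + "T" * 100 + "\t" + "I" * 100,
]

for name, read_len, frag_len in read_and_fragment_lengths(example_pair):
    print(name, read_len, frag_len)
```

Here each read is only 100 bp long, but the pair spans 400 bp on the genome. A histogram built from the span of the whole pair rather than from the individual reads would therefore show values well above the trimmed read lengths, which is probably what the 400 bp entries are.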

Hedi86 01-31-2020 05:26 AM

Thank you.

We actually used two different kits for RRBS library preparation. I found that the first method produces a lot of short reads under 70 bp, whereas with the other method the read length peaks at about 130 bp, which is what we would like to have since, as you mentioned, longer reads tend to align better. Surprisingly, though, the mapping efficiency was almost the same, which is why I was curious to look at the length of the aligned reads in both libraries.


