SEQanswers


Mammoth 06-12-2019 04:59 AM

Bowtie1: Reads don't get randomly distributed on direct repeat
 
Hi all,

I am mapping small RNA sequencing reads with bowtie 1.2.2 to a locus that encodes a direct repeat, with exactly(!) the same sequence repeated 19 times. I don't allow mismatches, and I ask to get back only one alignment per read:

bowtie index clippedreads.fq --best --strata -M 1 -v 0 -S | \
samtools view -Sb -F 4 - |\
samtools sort - -o outfile.bam

(or alternatively --best -k 1 -v 0; however, it does not really make a difference in my hands)

A couple of thousand reads map to that locus (all of these reads have the same sequence, of course), and based on the bowtie1 manual I would assume they are randomly (and roughly equally) distributed across the 19 repeat units. However, what I get is that one or two of the repeat units are highly covered, while the rest are distributed as expected (see picture):
Attachment 5336

Weirdly, the peak shifts between libraries, but is otherwise nicely reproducible.
What is going on here? Why are the repeat units not equally covered, and how can this be handled?

Thanks a ton! Any suggestion is highly appreciated.

pmiguel 06-12-2019 12:43 PM

Looks more-or-less random to me. How many standard deviations from the mean number of hits/repeat are the high and low-hit repeat counts?

--
Phillip

Mammoth 06-13-2019 10:55 PM

Dear Phillip,

Thank you for your reply. The biggest peak is around 4 standard deviations above the mean, the lowest one about 2 SD below it. That seems like quite a lot for a random distribution, doesn't it?
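
One way to put numbers on this question is to compare the observed per-repeat counts with what uniform random placement would give. The sketch below is only an illustration, not anything bowtie does internally: it assumes each read lands independently and uniformly on one of the 19 units (so each count is binomial), and the read total of 3000 is a made-up stand-in for "a couple of thousand". The function names are hypothetical. Note that it uses the binomial SD, which may differ from an SD computed empirically across the 19 observed counts.

```python
import math
import random


def expected_mean_sd(n_reads, n_units):
    """Mean and SD of the read count on one unit under uniform random placement."""
    p = 1.0 / n_units
    return n_reads * p, math.sqrt(n_reads * p * (1.0 - p))


def z_scores(counts):
    """Deviation of each unit's count from the expected mean, in binomial SDs."""
    mean, sd = expected_mean_sd(sum(counts), len(counts))
    return [(c - mean) / sd for c in counts]


def simulate_max_abs_z(n_reads, n_units, n_trials, seed=0):
    """For each simulated library, return the largest |z| seen on any unit."""
    rng = random.Random(seed)
    maxima = []
    for _ in range(n_trials):
        counts = [0] * n_units
        for _ in range(n_reads):
            counts[rng.randrange(n_units)] += 1
        maxima.append(max(abs(z) for z in z_scores(counts)))
    return maxima


if __name__ == "__main__":
    # Assumed values: 3000 reads (hypothetical), 19 repeat units (from the post).
    maxima = simulate_max_abs_z(n_reads=3000, n_units=19, n_trials=200)
    frac = sum(m >= 4.0 for m in maxima) / len(maxima)
    print("fraction of simulated libraries with a unit at |z| >= 4:", frac)
```

Feeding the real per-repeat counts into z_scores() and comparing the largest value with the simulated maxima would show whether a 4 SD peak is plausible under purely random assignment.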

