06-12-2019, 04:59 AM   #1
Junior Member
Location: Netherlands

Join Date: Jun 2019
Posts: 2
Bowtie1: Reads don't get randomly distributed on direct repeat

Hi all,

I am mapping small RNA sequencing reads with bowtie 1.2.2 to a locus that encodes a direct repeat, with exactly(!) the same sequence repeated 19 times. I don't allow mismatches, and I ask to get back only one alignment per read:

bowtie index clippedreads.fq --best --strata -M 1 -v 0 -S | \
samtools view -Sb -F 4 - |\
samtools sort - -o outfile.bam

(or alternatively --best -k 1 -v 0; however, it does not really make a difference in my hands)

A couple of thousand reads map to that locus (all of these reads have the same sequence, of course), and based on the bowtie1 manual I would assume they are randomly (and roughly equally) distributed across the 19 repeat units. However, what I get is that one or two of the repeat units are highly covered, while the rest are covered as expected (see picture):

Weirdly, the peak shifts with different libraries, but is otherwise nicely reproducible.
I was wondering what is going on, why the repeat units are not equally covered, or how this can be handled?

Thanks a ton! Any suggestion is highly appreciated.
Mammoth
06-12-2019, 12:43 PM   #2
Senior Member
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,318

Looks more-or-less random to me. How many standard deviations from the mean number of hits/repeat are the high and low-hit repeat counts?

pmiguel
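
For anyone wanting to put a number on the question above, here is a minimal Python sketch. It assumes that each read lands on any of the k repeat units with probability 1/k, so the count for one unit is binomial with mean N/k and standard deviation sqrt(N * (1/k) * (1 - 1/k)). The function name and the counts are made up for illustration; they are not data from this thread. The sample SD across units is printed as well, since "standard deviations from the mean" could refer to either quantity.

```python
import math
import statistics


def repeat_unit_z_scores(counts):
    """Return (expected_mean, expected_sd, z_scores) for per-unit read counts.

    Assumption: every read is assigned to one of the k units uniformly at
    random, so each unit's count is Binomial(N, 1/k).
    """
    k = len(counts)
    n = sum(counts)
    p = 1.0 / k
    expected_mean = n * p
    expected_sd = math.sqrt(n * p * (1.0 - p))
    z_scores = [(c - expected_mean) / expected_sd for c in counts]
    return expected_mean, expected_sd, z_scores


# Made-up counts for 19 repeat units (hypothetical, NOT from the thread).
example_counts = [105, 98, 110, 102, 95, 160, 101, 99, 104, 97,
                  108, 100, 93, 106, 103, 96, 109, 94, 120]

mean, sd, z = repeat_unit_z_scores(example_counts)
print(f"expected mean per unit: {mean:.1f}")
print(f"expected SD (binomial): {sd:.1f}")
print(f"sample SD across units: {statistics.stdev(example_counts):.1f}")
print(f"highest z-score: {max(z):.2f}, lowest z-score: {min(z):.2f}")
```

If the sample SD across units is much larger than the binomial SD, that by itself hints at more spread than uniform random assignment would give.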
06-13-2019, 10:55 PM   #3
Junior Member
Location: Netherlands

Join Date: Jun 2019
Posts: 2

Dear Phillip,

Thank you for your reply. The biggest peak is around 4 standard deviations above the mean, and the lowest one is around 2 SD below it. That seems like quite a lot for a random distribution, doesn't it?
Mammoth
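
Whether 4 SD is "quite a bit" depends on how many units are being compared, because the largest of 19 counts will naturally sit above the mean. A rough Monte Carlo sketch of that, in Python, is below. The assumptions are mine: about 2000 reads ("a couple of thousand"), 19 units, uniform random assignment, and the SD taken as the binomial expectation rather than the sample SD across units. The function name is hypothetical.

```python
import math
import random
from collections import Counter


def prob_max_z_at_least(n_reads, n_units, z_threshold, n_trials=1000, seed=0):
    """Estimate P(most-covered unit is >= z_threshold expected SDs above the mean)
    when reads are assigned to units uniformly at random (assumption)."""
    rng = random.Random(seed)
    p = 1.0 / n_units
    mean = n_reads * p
    sd = math.sqrt(n_reads * p * (1.0 - p))
    hits = 0
    for _ in range(n_trials):
        # Assign every read to a random unit and count reads per unit.
        counts = Counter(rng.choices(range(n_units), k=n_reads))
        if (max(counts.values()) - mean) / sd >= z_threshold:
            hits += 1
    return hits / n_trials


# Assumed numbers: 2000 reads over 19 repeat units, threshold of 4 SD.
estimate = prob_max_z_at_least(n_reads=2000, n_units=19, z_threshold=4.0)
print(f"estimated P(max unit >= 4 SD above mean): {estimate:.4f}")
```

A small estimate would suggest that a 4 SD peak is unlikely under purely uniform assignment; plugging in the real read count, and rerunning with more trials, gives a steadier number.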

bowtie 1.2, repeat

All times are GMT -8.