SEQanswers

06-12-2019, 03:59 AM   #1
Mammoth
Junior Member
 
Location: Netherlands

Join Date: Jun 2019
Posts: 2
Bowtie1: Reads don't get randomly distributed on direct repeat

Hi all,

I am mapping small RNA sequencing reads with Bowtie 1.2.2 to a locus that encodes a direct repeat, with exactly(!) the same sequence repeated 19 times. I don't allow mismatches, and I ask for only one alignment per read:

bowtie index clippedreads.fq --best --strata -M 1 -v 0 -S | \
samtools view -Sb -F 4 - |\
samtools sort - -o outfile.bam

(or alternatively --best -k 1 -v 0; however, it does not really make a difference in my hands)

A couple of thousand reads map to that locus (all of these reads have the same sequence, of course), and based on the Bowtie1 manual I would assume that they are distributed randomly (and roughly equally) across the 19 repeat units. However, what I get is that one or two of the repeat units are highly covered, while the rest are covered as expected (see picture):
repealocus.png

Weirdly, the peak shifts with different libraries, but is otherwise nicely reproducible.
I was wondering what is going on: why are the repeat units not equally covered, and how can this be handled?

Thanks a ton! Any suggestion is highly appreciated.
06-12-2019, 11:43 AM   #2
pmiguel
Senior Member
 
Location: Purdue University, West Lafayette, Indiana

Join Date: Aug 2008
Posts: 2,315

Looks more-or-less random to me. How many standard deviations from the mean number of hits/repeat are the high and low-hit repeat counts?
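
One way to put numbers on that question is sketched below. This is only an illustration and not something posted in the thread: the per-repeat counts are made up, and the function names `z_scores` and `binomial_sd` are hypothetical. It reports each repeat unit's distance from the mean in units of the sample SD across the 19 units, and, for comparison, the SD one would expect if every read were assigned to a unit uniformly at random.

```python
import math
from statistics import mean, stdev


def z_scores(counts):
    """Distance of each unit's count from the mean, in sample-SD units."""
    m = mean(counts)
    s = stdev(counts)  # sample SD (n - 1 in the denominator)
    return [(c - m) / s for c in counts]


def binomial_sd(total_reads, n_units):
    """SD of one unit's count if reads are assigned uniformly at random."""
    p = 1.0 / n_units
    return math.sqrt(total_reads * p * (1.0 - p))


# Hypothetical read counts for the 19 repeat units (NOT data from the thread).
counts = [96, 104, 110, 99, 93, 101, 180, 97, 108, 95,
          102, 88, 100, 106, 94, 103, 98, 91, 105]

zs = z_scores(counts)
print("highest z-score:", round(max(zs), 2))
print("lowest z-score: ", round(min(zs), 2))
print("observed SD:    ", round(stdev(counts), 2))
print("SD expected under uniform random assignment:",
      round(binomial_sd(sum(counts), len(counts)), 2))
```

If the observed SD is much larger than the binomial expectation, the spread is probably wider than uniform random placement alone would produce.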

--
Phillip
06-13-2019, 09:55 PM   #3
Mammoth
Junior Member
 
Location: Netherlands

Join Date: Jun 2019
Posts: 2

Dear Phillip,

Thank you for your reply. The largest peak is about 4 standard deviations above the mean, and the lowest one is about 2 SD from it. That seems like quite a lot for a random distribution, doesn't it?
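
Whether 4 SD is "a lot" can be checked with a small simulation. The sketch below is an illustration under stated assumptions, not the poster's analysis: it assumes the SD was computed across the 19 per-unit counts, uses a made-up total of 2000 reads, and the function names `simulate_max_z` and `fraction_exceeding` are hypothetical. It assigns every read to one of the units uniformly at random, many times over, and reports how often the most-covered unit lands at or above a given z-score.

```python
import random
from collections import Counter
from statistics import mean, stdev


def simulate_max_z(total_reads, n_units, rng):
    """Assign reads uniformly at random; return the largest per-unit z-score."""
    hits = Counter(rng.randrange(n_units) for _ in range(total_reads))
    per_unit = [hits.get(u, 0) for u in range(n_units)]
    m = mean(per_unit)
    s = stdev(per_unit)  # sample SD across units
    if s == 0:
        return 0.0
    return max((c - m) / s for c in per_unit)


def fraction_exceeding(threshold, total_reads, n_units, trials, seed=0):
    """Fraction of simulated libraries whose top unit reaches the threshold."""
    rng = random.Random(seed)
    hits = sum(
        simulate_max_z(total_reads, n_units, rng) >= threshold
        for _ in range(trials)
    )
    return hits / trials


# Assumed numbers: 2000 reads (hypothetical), 19 repeat units (from the post).
for threshold in (2.0, 3.0, 4.0):
    frac = fraction_exceeding(threshold, total_reads=2000, n_units=19, trials=500)
    print(f"max z >= {threshold}: {frac:.3f} of simulated libraries")
```

One thing to keep in mind when reading the result: with n values and the sample SD, no single z-score can exceed (n - 1) / sqrt(n), which is about 4.13 for n = 19. A peak near 4 SD is therefore close to the maximum possible, which happens when one unit stands far above all the others.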

Tags
bowtie 1.2, repeat
