SEQanswers

10-03-2012, 12:16 PM   #1
north_zeb

bowtie paired-end versus single-end

Hi guys,

I'm totally new to NGS and have 2 fastq files corresponding to a paired-end Illumina ChIP-seq experiment. I am using bowtie to align them: first I aligned only one fastq file, then both of them as a pair. The percentage of reads with at least 1 reported alignment is very different in the two cases: 83% when I used only 1 fastq file compared to ONLY about 50% when I used both of them (as I should have). Can anyone explain why?

Here are the results:


./bowtie -t -p 4 --sam --chunkmbs 1000 hg19/hg19 reads/V400can_chipseq_1.fastq > results/v400.sam

# reads processed: 15749589
# reads with at least one reported alignment: 13044920 (82.83%)
# reads that failed to align: 2704669 (17.17%)
Reported 13044920 alignments to 1 output stream(s)
Time searching: 00:32:23

AS OPPOSED TO:

./bowtie -t -p 4 --sam --chunkmbs 1000 -m 1 hg19/hg19 -1 reads/V400can_chipseq_1.fastq -2 reads/V400can_chipseq_2.fastq > results/v400_paired.sam

# reads processed: 15749589
# reads with at least one reported alignment: 7813517 (49.61%)
# reads that failed to align: 7480588 (47.50%)
# reads with alignments suppressed due to -m: 455484 (2.89%)
Reported 7813517 paired-end alignments to 1 output stream(s)
Time searching: 01:44:03


Thanks a mil in advance,

NZ
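
The percentages in the two summaries can be rechecked from the raw counts. Here is a minimal Python sketch; the dictionary keys are illustrative labels of my own, and the numbers are copied from the bowtie output above.

```python
# Counts copied from the two bowtie summaries quoted above.
single_end = {"processed": 15749589, "aligned": 13044920, "failed": 2704669}
paired_end = {
    "processed": 15749589,
    "aligned": 7813517,
    "failed": 7480588,
    "suppressed_by_m": 455484,
}


def percentages(counts):
    """Return each outcome category as a percentage of the reads processed."""
    total = counts["processed"]
    return {
        name: round(100.0 * n / total, 2)
        for name, n in counts.items()
        if name != "processed"
    }


def categories_sum_to_total(counts):
    """Check that the outcome categories account for every processed read."""
    parts = sum(n for name, n in counts.items() if name != "processed")
    return parts == counts["processed"]


for label, counts in (("single-end", single_end), ("paired-end", paired_end)):
    print(label, percentages(counts), categories_sum_to_total(counts))
```

Note that the two runs are not strictly like for like: the paired-end command adds -m 1, which the single-end command does not use.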

10-03-2012, 01:05 PM   #2
swbarnes2

I don't use Bowtie much, but there's a setting for expected insert size, and I think Bowtie behaves very badly with pairs that are too far from that insert size. Crank up the maximum insert size, and try again.

10-03-2012, 01:11 PM   #3
biznatch

Quote:
Originally Posted by swbarnes2
I don't use Bowtie much, but there's a setting for expected insert size, and I think Bowtie behaves very badly with pairs that are too far from that insert size. Crank up the maximum insert size, and try again.
The maximum insert size is set with -X, and the default is only 250 (i.e. the total size, including both reads plus the gap between them, has to be 250 or less). I set mine to 1000 so nothing should be excluded.
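
To make the -X rule concrete, here is a small Python sketch of the check described above. The function names are hypothetical helpers, not part of bowtie, and the calculation assumes both mates align end to end with no indels.

```python
BOWTIE_DEFAULT_MAX_INSERT = 250  # default for -X, as stated in the post above


def fragment_length(pos_a, pos_b, read_len):
    """Span from the leftmost base of one mate to the rightmost base of the other.

    Assumes both mates align end to end without indels (e.g. CIGAR 76M).
    """
    start = min(pos_a, pos_b)
    end = max(pos_a, pos_b) + read_len  # exclusive end coordinate
    return end - start


def within_max_insert(frag_len, max_insert=BOWTIE_DEFAULT_MAX_INSERT):
    """True if a pair with this fragment length would satisfy the -X limit."""
    return frag_len <= max_insert


# Coordinates of an aligned pair posted later in this thread (76 bp reads).
frag = fragment_length(42794368, 42794395, 76)
print(frag, within_max_insert(frag))

# A hypothetical 400 bp fragment fails at -X 250 but passes at -X 1000.
print(within_max_insert(400), within_max_insert(400, max_insert=1000))
```

If most real fragments in the library are longer than 250 bp, raising -X should recover pairs that were previously rejected; if they are short, as in the pair used here, -X is probably not the main cause.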

10-03-2012, 02:31 PM   #4
swbarnes2

Did you try inspecting reads that mapped the first time, and not the second?

Again, I don't use bowtie, but the first time you used 1 fastq file and the second time you used 2. Is it normal for the total number of reads to be the same?

Also, rather than

Quote:
> results/v400_paired.sam
consider

Quote:
| samtools view -bSh - > v400_paired.bam
You can convert a subset of that file to .sam later to eyeball things.
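
One way to inspect reads that mapped the first time and not the second is to compare read names between the two SAM files. The following is a rough Python sketch; the function name and the tiny in-memory records are made up for illustration, and it assumes read names are identical in both runs (no /1 or /2 suffixes).

```python
def mapped_read_names(sam_lines, required_bits=0):
    """Names of reads with an alignment (flag bit 0x4 clear).

    required_bits restricts the result to records whose flag has all of
    those bits set, e.g. 0x40 to keep only the first mate of each pair.
    """
    names = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        if flag & 0x4:
            continue  # this read is unmapped
        if (flag & required_bits) != required_bits:
            continue
        names.add(name)
    return names


# Made-up records, truncated to the first four SAM columns.
single_run = ["readA\t0\tchr1\t100", "readB\t16\tchr2\t500", "readC\t4\t*\t0"]
paired_run = [
    "readA\t99\tchr1\t100", "readA\t147\tchr1\t180",
    "readB\t77\t*\t0", "readB\t141\t*\t0",
    "readC\t77\t*\t0", "readC\t141\t*\t0",
]

# First mates that aligned in the single-end run but not in the paired run.
lost = mapped_read_names(single_run) - mapped_read_names(paired_run, required_bits=0x40)
print(sorted(lost))
```

With real files, the lists would be replaced by open file handles, and the resulting names could be used to pull those reads out for a closer look.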

10-04-2012, 07:14 AM   #5
north_zeb

Thanks for all the replies!
I will have a look at the insert size story. True, the total number of reads is reported as the same in both cases, and this is bowtie's own output. Could it be that bowtie counts each pair as one read when it reports on paired-end reads?

10-09-2012, 12:55 PM   #6
JackieBadger

What are you aligning to, a full genome or genomic scaffolds?
If you map PE data to scaffolds (which are not one continuous sequence), it makes sense that a lot of pairs will fail to map whenever the insert size causes the second mate to fall off the end of the scaffold that the first mate maps to.

If you do not care about your insert size (i.e. you are not trying to re-sequence large regions of the genome) and have genomic scaffolds, I would concatenate the PE files and map in single-end mode.
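
The scaffold-edge effect can be illustrated with a few lines of Python. This is only a sketch with hypothetical numbers; it assumes the first mate is on the forward strand with its partner downstream, and the function name is my own.

```python
def mate_fits_on_scaffold(first_mate_start, fragment_len, scaffold_len):
    """True if a fragment starting at first_mate_start (1-based) ends on the scaffold."""
    fragment_end = first_mate_start + fragment_len - 1
    return fragment_end <= scaffold_len


# Hypothetical 5000 bp scaffold and 300 bp fragments.
print(mate_fits_on_scaffold(4600, 300, 5000))  # fragment ends at 4899
print(mate_fits_on_scaffold(4800, 300, 5000))  # fragment would end at 5099
```

The shorter the scaffolds are relative to the fragment length, the larger the share of pairs that run off an end this way.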

10-09-2012, 02:20 PM   #7
jbrwn

Honestly, that seems about right. Bowtie 2 made improvements to paired-end alignment, so you may want to check that out. Paired-end specific options: http://bowtie-bio.sourceforge.net/bo...ed-end-options

10-10-2012, 10:38 AM   #8
north_zeb

I align against the indexed hg19 downloaded from the bowtie website. I'll read that link. Here is the beginning of the SAM file bowtie gives me. Does anybody know what the 0 in the insert size position means?

SRR424618.6 HWIUSI-EAS523_0001:5:1:999:17802 77 * 0 0 * * 0 0 NGGCTTTAGTCAAAGTACAGAAGACATTAGAAGAAAATTGCAGAAACAGGCTGGGTTTGCANGCATGAATNCGNCA #''''52)+.88633AAAAAAAAAAAA7AA7AAA7A72A8AAAAAA7AA########################### XM:i:1
SRR424618.6 HWIUSI-EAS523_0001:5:1:999:17802 141 * 0 0 * * 0 0 NCAAACACCTGGTTGGCTATCTCCAATAACTGTGACGTATTCATGCCTGCAAACCCAGCNNNNNNNNNCANNNNNC #***('**+'::4:20*523AAA7AAAAAA############################################## XM:i:1
SRR424618.10 99 chr20 42794368 255 76M = 42794395 103 NATGGAACCACCTCAGGGCCTTGGTATTGCTGTTCCCTCTACCTGTAATGCCCTTCCTCCAGATACCTACNTGGCT #'**'0.0..AAAAA8AA77::85:AAAAA############################################## XA:i:1 MD:Z:0C69A5 NM:i:2
SRR424618.10 147 chr20 42794395 255 76M = 42794368 -103 TNNNNNTCNNNNNNNNNGTAATGCCCTTCCTCCAGATACCTACATGGCTCACCCTCTTGCCGTCTTCAAGCCTTTN ############################################################################ XA:i:1 MD:Z:1G0C0T0G0T2C0C0T0C0T0A0C0C0T58A0 NM:i:15
SRR424618.9 163 chr13 99753904 255 76M = 99753933 105 NAGACCAGCCGGAGCAACAAAAAATTAGCTAGGCATGGTGGTGCATGCCAGTGGTCCCANNNNNNNNNGANNNNNG #''**00222AAAAAAAAAAA27*7626667AAAA######################################### XA:i:1 MD:Z:0G58G0C0T0A0C0T0T0T0G2G0G0G0T0G0A0 NM:i:16
SRR424618.9 83 chr13 99753933 255 76M = 99753904 -105 TAGGCNTGGTGGTGCATGCCAGTGGTCCCAGCTACTTTGGAGGGTGAGATGTGAAGATCCCCTGAGCCCAGGAGTN ##################AAAA7AAA896:820*+*7AAAAAAAAAAAAAAAAAA8AAAAAAAAAA20.),*'*'# XA:i:1 MD:Z:5A69T0 NM:i:2
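
For reference, the insert size sits in column 9 of a SAM record (TLEN), and the SAM format sets it to 0 when the information is unavailable, as for an unmapped pair. A minimal Python sketch that pulls out the relevant columns follows; the records are abbreviated versions of the lines above (sequence and quality replaced by `*`, tabs restored), and the function name is hypothetical.

```python
def parse_core_fields(sam_line):
    """Extract a few of the mandatory SAM columns from one record."""
    f = sam_line.rstrip("\n").split("\t")
    return {
        "qname": f[0],
        "flag": int(f[1]),
        "rname": f[2],   # "*" when the read has no reference sequence
        "pos": int(f[3]),  # 0 when the read has no position
        "tlen": int(f[8]),  # signed template length, 0 if unavailable
    }


# Abbreviated records: SEQ and QUAL replaced by "*", optional tags dropped.
records = [
    "SRR424618.6\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "SRR424618.6\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "SRR424618.10\t99\tchr20\t42794368\t255\t76M\t=\t42794395\t103\t*\t*",
    "SRR424618.10\t147\tchr20\t42794395\t255\t76M\t=\t42794368\t-103\t*\t*",
]

for rec in records:
    print(parse_core_fields(rec))
```

The first two records have no reference name, no position and a TLEN of 0, while the aligned pair carries +103 and -103.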

10-10-2012, 12:05 PM   #9
swbarnes2

Did you look at the binary flags? 77 means that neither read of the pair mapped.

141 means the same thing. Notice how neither has a mapping position either? The quality turns to junk at the end, which might be part of the problem.
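
The flags can be decoded bit by bit. Here is a short Python sketch using the bit values defined in the SAM format; the label strings and the function name are my own shorthand.

```python
# Bit values from the SAM format; the labels are informal shorthand.
SAM_FLAG_BITS = [
    (0x1, "paired"),
    (0x2, "proper_pair"),
    (0x4, "unmapped"),
    (0x8, "mate_unmapped"),
    (0x10, "reverse_strand"),
    (0x20, "mate_reverse_strand"),
    (0x40, "first_in_pair"),
    (0x80, "second_in_pair"),
    (0x100, "secondary"),
    (0x200, "qc_fail"),
    (0x400, "duplicate"),
    (0x800, "supplementary"),
]


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag."""
    return [name for bit, name in SAM_FLAG_BITS if flag & bit]


# The flags that appear in the SAM excerpt posted above.
for flag in (77, 141, 99, 147, 163, 83):
    print(flag, decode_flag(flag))
```

Both 77 and 141 decode to a paired read that is unmapped with an unmapped mate; they differ only in being the first and second read of the pair.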

10-10-2012, 12:52 PM   #10
north_zeb

Oh, thanks for that. I have started to figure out some of the flag numbers, but these are new to me. If I align only the first of the fastq files, with the -m 1 option, it gives: reads with at least 1 alignment: 70.66%
The second fastq file gives 59.07% reads with at least 1 alignment.
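
As a back-of-envelope check, the two single-file rates put rough limits on how many pairs could align. The Python sketch below assumes that a pair can only align if both mates align on their own, which is not strictly true for bowtie's paired mode (orientation, -X and -m are applied to the pair as a whole), so treat the numbers as a plausibility check rather than a prediction.

```python
def pair_rate_bounds(rate_mate1, rate_mate2):
    """Rough limits on the fraction of pairs in which both mates align.

    lower: both mates must overlap at least this much (inclusion-exclusion)
    independent: expected value if the two mates align independently
    upper: cannot exceed the worse of the two mates
    """
    lower = max(0.0, rate_mate1 + rate_mate2 - 1.0)
    independent = rate_mate1 * rate_mate2
    upper = min(rate_mate1, rate_mate2)
    return lower, independent, upper


# Single-file rates from this post; pair counts from the first post.
lower, independent, upper = pair_rate_bounds(0.7066, 0.5907)
observed = 7813517 / 15749589

print(f"lower={lower:.4f} independent={independent:.4f} upper={upper:.4f}")
print(f"observed paired-end rate={observed:.4f}")
```

Under these assumptions the observed paired-end rate falls between the independent estimate and the rate of the weaker second file, which fits the earlier comment that the result seems about right.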

All times are GMT -8.