SEQanswers

10-09-2013, 08:30 AM   #1
vectorborne5
Junior Member
 
Location: Richland, Washington, USA

Join Date: Oct 2013
Posts: 2
Merging paired end reads (R1 and R2 files)

To anyone who has dealt with Illumina MiSeq paired-end reads: what are the best programs/scripts for merging the R1 and R2 (forward and reverse) read pairs into extended consensuses (is there a plural for 'consensus'?)? I've tried four different tools (pRESTO, FLASH, fastq-join, seqimp), and they all miss many read pairs that obviously overlap, sometimes with better overlaps than the pairs that did get merged. I sequenced adapter-ligated microRNAs with barcodes. The MiSeq readily identified the barcodes and removed most of the adapter sequence, and the read qualities are generally quite good, but I'm disturbed that my merges aren't coming out as they should. I don't want to just take one of the files and mistakenly call a microRNA from a read that actually came from a tRNA or some other sequence type, which I could have excluded had a proper merge shown it to be part of a longer sequence.

Any advice would be most appreciated!

My apologies if this question has been asked previously (tried searching, but nothing came up using various keyword combinations).
10-09-2013, 09:40 AM   #2
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381

FLASH generally works very well. However, it can miss overlaps, or report unexpected/false ones, if your sequences are not properly trimmed.
The MiSeq software's adapter trimming is not trustworthy. You should requeue the run to regenerate your FASTQ files with adapter trimming turned off. Then run FastQC on your paired-end reads to make sure the base composition is exactly what you expect.

I remember that in my last FLASH run there were some unexpected bases at the 5' end of R1 (~9 bases of an adapter or something, I forget). This broke the automated overlapping, which had worked very well in previous runs.
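A quick way to spot that kind of leftover prefix is to tabulate the base composition at each of the first few cycles, which is roughly what the per-base sequence content plot in FastQC shows. The following is only a minimal sketch, assuming plain 4-line FASTQ records; the function names, the 12-cycle window and the 90% threshold are all hypothetical choices for illustration.

```python
from collections import Counter


def read_fastq_sequences(lines):
    """Yield the sequence line of each record (assumes 4 lines per record)."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def base_composition(seqs, n_positions=12):
    """Count the bases seen at each of the first n_positions cycles."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in seqs:
        for pos, base in enumerate(seq[:n_positions]):
            counts[pos][base] += 1
    return counts


def dominant_bases(counts, threshold=0.9):
    """Return (position, base, fraction) wherever one base reaches threshold."""
    flagged = []
    for pos, counter in enumerate(counts):
        total = sum(counter.values())
        if total == 0:
            continue
        base, n = counter.most_common(1)[0]
        if n / total >= threshold:
            flagged.append((pos, base, n / total))
    return flagged
```

To use it on a real file, pass an open file handle (for example `open("R1.fastq")`, where the file name is a placeholder) to `read_fastq_sequences`. A run of consecutive flagged positions at the start of R1 suggests a constant prefix, such as adapter remnants, that should be trimmed before merging.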

So my advice would be to sanity-check your sequences against the expected sequence composition and the expected length of the overlapping region. You can also play around with the FLASH parameters. I think you will be able to resolve it.
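For that sanity check, it can help to test a handful of pairs by hand. The sketch below is a deliberately naive overlap search, not the algorithm used by FLASH or any of the other tools mentioned in this thread: it reverse-complements R2 and looks for the longest suffix of R1 that matches the start of it within a mismatch budget. The function names and the default `min_overlap` and `max_mismatch_frac` values are assumptions for illustration, and base qualities are ignored.

```python
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_overlap(r1, r2, min_overlap=10, max_mismatch_frac=0.1):
    """Find the longest acceptable overlap between the 3' end of r1 and
    the start of reverse_complement(r2).

    Returns (overlap_length, mismatches), or None if nothing qualifies.
    """
    r2_rc = reverse_complement(r2)
    for length in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        tail = r1[-length:]
        head = r2_rc[:length]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * length:
            return length, mismatches
    return None


def merge_pair(r1, r2, **kwargs):
    """Merge a read pair into one sequence, or return None if no overlap."""
    hit = find_overlap(r1, r2, **kwargs)
    if hit is None:
        return None
    length, _ = hit
    # Keep r1 as-is and append the part of R2 that extends past the overlap.
    return r1 + reverse_complement(r2)[length:]
```

If pairs that this simple check merges cleanly are still rejected by a merging tool, the tool's overlap and mismatch settings are the first thing to look at. Note that the sketch only handles the case where the fragment is longer than each read; for very short inserts, where the reads run through the insert into adapter, the adapter has to be trimmed off first.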
10-09-2013, 11:13 AM   #3
kcchan
Senior Member
 
Location: USA

Join Date: Jul 2012
Posts: 182

If you're running MiSeq Reporter 2.3, there's a new setting called "StitchReads" that can do this for you. Simply add StitchReads 1 to the [Settings] section of your sample sheet and you're good to go.
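As a rough illustration, the relevant part of the sample sheet might look like the fragment below. This assumes the usual comma-separated sample sheet layout, in which a setting and its value are separated by a comma; check the exact syntax against the release notes for your software version.

```
[Settings]
StitchReads,1
```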

For more details, refer to the release notes for MiSeq Reporter 2.3.
Tags
adapter, illumina, merge, miseq, pair end reads

All times are GMT -8.