SEQanswers

Old 08-10-2012, 10:09 AM   #1
Cirno
Junior Member
 
Location: British Columbia

Join Date: Jun 2011
Posts: 6
Help with De-Multiplexing MiSeq Data

Hello,

I cannot find a decent program or script to de-multiplex data where I have three FASTQ files: XXX_R1.fastq, XXX_R2.fastq, and XXX_I.fastq. The index file has the same structure as a FASTQ file and shares all the read headers, but contains only the barcode for each read; I want to split the R1 and R2 files based on this barcode.

Any suggestions?

Thanks.
Old 08-10-2012, 10:21 AM   #2
Bukowski
Senior Member
 
Location: Aberdeen, Scotland

Join Date: Jan 2010
Posts: 388

I think you will find there is a reference to the barcode/sample at the end of the read name for each read. That might help.
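
A minimal sketch of pulling that field out of a read name, assuming the CASAVA 1.8-style header seen later in this thread (name, a space, then `read:filtered:control:last-field`). Whether the last field holds a sample number or the index sequence depends on the software that wrote the FASTQ, so treat the `sample` key as an assumption. `parse_header` is a made-up helper name, not part of any tool.

```python
def parse_header(header):
    """Split an Illumina-style FASTQ header line into its parts.

    Assumes the layout '@<name> <read>:<filtered>:<control>:<sample or index>'.
    """
    if header.startswith("@"):
        header = header[1:]
    name, _, comment = header.partition(" ")
    read, is_filtered, control, sample = comment.split(":")
    return {
        "name": name,                    # instrument:run:flowcell:lane:tile:x:y
        "read": int(read),               # 1 or 2 for the members of a pair
        "filtered": is_filtered == "Y",  # 'Y' means the read failed the filter
        "control": int(control),
        "sample": sample,                # sample number or index sequence (kept as text)
    }


info = parse_header("@M00511:27:000000000-A1F08:1:1:17545:1321 1:N:0:0")
print(info["name"], info["sample"])
```

If that last field is the same for every read, it cannot be used to split the file and the separate index FASTQ is needed instead.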

Last edited by Bukowski; 08-10-2012 at 10:24 AM.
Old 08-10-2012, 10:31 AM   #3
celzinga
Junior Member
 
Location: MA

Join Date: Nov 2011
Posts: 2

This thread may help:
http://seqanswers.com/forums/showthread.php?t=17620
Old 08-10-2012, 10:42 AM   #4
celzinga
Junior Member
 
Location: MA

Join Date: Nov 2011
Posts: 2

Also, it looks like Picard can do this:
http://picard.sourceforge.net/comman...luminaBarcodes
Old 08-10-2012, 03:40 PM   #5
Cirno
Junior Member
 
Location: British Columbia

Join Date: Jun 2011
Posts: 6

Quote:
Originally Posted by celzinga
Also, it looks like Picard can do this:
http://picard.sourceforge.net/comman...luminaBarcodes

Um. I don't see how that tool has anything to do with this problem. I don't need to extract the barcodes at all. I have three FASTQ files. The first FASTQ already contains the barcodes, e.g.:

Code:
@M00511:27:000000000-A1F08:1:1:17545:1321 1:N:0:0
AACCGAGA
+
?AAAAAAB
@M00511:27:000000000-A1F08:1:1:16720:1322 1:N:0:0
AACCGAGA
+
???A?@@B
@M00511:27:000000000-A1F08:1:1:17118:1322 1:N:0:0
AACCGAGA
+
A?AAAAAA
@M00511:27:000000000-A1F08:1:1:17183:1322 1:N:0:0
AAACATCA
+
AAAAABBB
Then there are the two files for the paired ends, e.g.:

Code:
@M00511:27:000000000-A1F08:1:1:17545:1321 1:N:0:0
NCGGGCACGACCATCACCATCATCATACGACGAACCAACGGGCATTATTCTGGTCGTTCGTCCTGATTGCGACGTTCATGGTCGTCGAAGTCATCGGCGGATTATGGACGAACAGTTTTGCGCTCTTGTCGGACGCCGGGCATATGCTTAG
+
#5<???AADDEEEDDDGGGGGGIIIIIIIIHHHHHHIIHHHHHHIIIIIIIIHIIHHHIHHHHHIIIIHHHHHHHHFHHHHHHGGFGGGGGGGGGGEGGG'.8:C*CCCD4A''*1CE*0:8'4C.:*:?)''.'.'.''2'**0*1:?:1
@M00511:27:000000000-A1F08:1:1:16720:1322 1:N:0:0
NCATACGTACCACCGATGACACCACCGACAAGCGGAACCATCTTCCCAAGATTAACGACCCCCGTATTCCCGAACTTCGTCAATAAGCGGAATCCGACTTTCTGATTGATTTTTTTGATGGTCGATCCAGGAATCTTCTTAATCATATTGA
+
#5<???BBDDDDDEDDFEFFFFIIIHHHHHHHIHHEHHIHIIIIIIIIIIIIIIIIHHHHHHHHDCFHHFHHHEHFDFH?DF;DFFDFEE=EFFA?A@BAEEFFEEEF=ABA?:8>DACAECEDD8A8*?*0:CCA0*::C*:ACA*:E:*
@M00511:27:000000000-A1F08:1:1:17118:1322 1:N:0:0
NTCCGCGTGACGGCGATGCCAGAGCGACGGGCCGCCTCGACGTTCGAGCCGACGTAATAAAACTCACGTCCTGTCTTCGAATACGTCAAAAACAGATGCGCCCCGGCGAAGAACAGAAGCATCAAGATGGCGACGAACGGGACAGGTCCGT
+
#5<???@@DDDDDDDDEEEFFFHHIHHHHHHHHHHHHHHHHHHHHEFHHHHEFFEFFEFFEEFFFFFFFFEEFFFFFFFEFFEFFFEE8A:CEEFEFEFDEADD?DDD'8>8?C:?E:*?:CAE0?::**:2'8;>2>').?8A))1*0'*
@M00511:27:000000000-A1F08:1:1:17183:1322 1:N:0:0
NATCGGAAGAGCACACGTCTGAACTCCAGTCACAAACATCATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAAAAAAGACAGAACGAGACAAAAGAAGCACAAATCCGTAATCGATGAGACTTAATGCGAGATCATGACACCATTGTAA
+
#5<???AAEDEDDDDDGGGGGGIIIIIIIIIIIIIIIIIIIIIIIIHHIIIIHHHIIIIIIIIHHIIIIHHHHHD4)42**,,,,,,***3*,4,,,*4,,,3,0****)0*))*)0.************)).'0*1******)*******
and the corresponding mates for all of those.

I do not want the three files as they are. I know which barcodes go with which read headers.

RUN1_I1.fastq
RUN1_R1.fastq
RUN1_R2.fastq

Need to be converted into...

RUN1_R1_AACCGAGA.fastq
RUN1_R2_AACCGAGA.fastq
RUN1_R1_AAACATCA.fastq
RUN1_R2_AAACATCA.fastq

etc etc.

Personally, I am beyond flabbergasted that the output of this damnable thing is not the same as the HiSeq's. I just want the FASTQs sorted by barcode; it does nothing for me, the user, to have the barcode/header pairs in a separate file.
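
For reference, the split described above can be sketched in a few lines of Python. This is only an illustration, not a tested tool: it assumes the three files list the reads in the same order, that the file names follow the RUN1_I1/R1/R2 pattern shown above, and that the expected barcodes are known in advance. `demultiplex` and `read_fastq` are made-up names; reads with any other barcode (for example one with a sequencing error) go to an "unmatched" pair of files.

```python
import os


def read_fastq(handle):
    """Yield (header, sequence, separator, quality) from an open FASTQ file."""
    while True:
        record = tuple(handle.readline().rstrip("\n") for _ in range(4))
        if not record[0]:
            return
        yield record


def demultiplex(prefix, barcodes, out_dir="."):
    """Split prefix_R1.fastq and prefix_R2.fastq by the barcodes in prefix_I1.fastq.

    Reads whose barcode is not in `barcodes` are written to *_unmatched.fastq.
    Returns a dict mapping each barcode (and "unmatched") to its read-pair count.
    """
    base = os.path.basename(prefix)
    labels = list(barcodes) + ["unmatched"]
    # One (R1, R2) pair of output handles per expected barcode.
    out = {
        label: (
            open(os.path.join(out_dir, f"{base}_R1_{label}.fastq"), "w"),
            open(os.path.join(out_dir, f"{base}_R2_{label}.fastq"), "w"),
        )
        for label in labels
    }
    counts = dict.fromkeys(labels, 0)
    try:
        with open(prefix + "_I1.fastq") as idx, \
             open(prefix + "_R1.fastq") as r1, \
             open(prefix + "_R2.fastq") as r2:
            # zip stops at the shortest file, so a truncated input is not detected.
            for i_rec, r1_rec, r2_rec in zip(read_fastq(idx), read_fastq(r1), read_fastq(r2)):
                # The read name is the header text before the first space.
                names = {rec[0].split(" ")[0] for rec in (i_rec, r1_rec, r2_rec)}
                if len(names) != 1:
                    raise ValueError(f"files out of sync at {sorted(names)}")
                label = i_rec[1] if i_rec[1] in counts else "unmatched"
                for handle, rec in zip(out[label], (r1_rec, r2_rec)):
                    handle.write("\n".join(rec) + "\n")
                counts[label] += 1
    finally:
        for pair in out.values():
            for handle in pair:
                handle.close()
    return counts
```

Calling `demultiplex("RUN1", ["AACCGAGA", "AAACATCA"])` would write RUN1_R1_AACCGAGA.fastq, RUN1_R2_AACCGAGA.fastq and so on, plus the unmatched pair. Restricting the output to a known barcode list keeps the number of open files bounded; opening a new file for every distinct sequence in the index file could hit the open-file limit once sequencing errors are counted.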
Old 08-13-2012, 03:53 AM   #6
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,978

Did you get this run at a core facility? I am not sure why that facility did not do the de-multiplexing for you. It should be trivial for them to do this, since they would have access to the raw data folder and the CASAVA pipeline.
Old 08-13-2012, 06:44 AM   #7
geertvandeweyer
Member
 
Location: Antwerp, Belgium

Join Date: Jan 2011
Posts: 14

Hi,

I've attached my approach to demultiplexing the MiSeq files. Note that it uses the MiSeq-assigned sample index to name the output files, NOT the barcode. This means you get all reads for the sample, including those with a mismatch in the barcode. It outputs three files per sample: forward reads, reverse reads, and interlaced reads. We use the interlaced reads in Galaxy to start batch workflows.

For files:
RUN1_I1.fastq
RUN1_R1.fastq
RUN1_R2.fastq

Run as:
perl demultiplex_MiSeq.pl RUN1

Output will be in the 'output/' folder. The script also creates a file listing all barcodes seen per sample, and prints the read count per sample.
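
On the interlaced output: as the term is generally used, an interlaced (interleaved) FASTQ is a single file in which each forward read is immediately followed by its mate. The sketch below shows that layout in Python; it is not the attached Perl script, and the function name `interleave` and the file names are placeholders.

```python
def interleave(r1_path, r2_path, out_path):
    """Write R1 and R2 records alternately and return the number of pairs written."""

    def next_record(handle):
        # A FASTQ record is four lines; make sure each one ends with a newline.
        lines = [handle.readline() for _ in range(4)]
        return [line if line.endswith("\n") else line + "\n" for line in lines if line]

    pairs = 0
    with open(r1_path) as r1, open(r2_path) as r2, open(out_path, "w") as out:
        while True:
            rec1, rec2 = next_record(r1), next_record(r2)
            if len(rec1) < 4 or len(rec2) < 4:
                break  # end of either file (or a truncated final record)
            out.writelines(rec1)  # forward read of the pair
            out.writelines(rec2)  # its mate, directly after
            pairs += 1
    return pairs
```

For example, `interleave("sample_R1.fastq", "sample_R2.fastq", "sample_interlaced.fastq")` would produce one file whose records alternate read 1, read 2, read 1, read 2, which is the form many paired-end tools accept as a single input.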
Attached Files
File Type: pl demultiplex_MiSeq.pl (2.8 KB, 143 views)
Old 08-13-2012, 05:04 PM   #8
JackieBadger
Senior Member
 
Location: Halifax, Nova Scotia

Join Date: Mar 2009
Posts: 381

http://code.google.com/p/ea-utils/

or

https://main.g2.bx.psu.edu/root
Look under NGS Toolbox Beta, NGS: QC and manipulation

Barcode splitter and other FASTQ manipulations
Old 08-16-2012, 01:51 PM   #9
swNGS
Member
 
Location: SW UK

Join Date: Nov 2011
Posts: 83

What is an interlaced read?