SEQanswers

07-24-2014, 01:01 AM   #1
inesdesantiago
Member
Location: LONDON, UNITED KINGDOM
Join Date: Jan 2009
Posts: 44

Demultiplexing 454

Dear all,
I am trying to demultiplex a 454 FASTQ file.

It is a public data file. In the paper (PMID:21731642) the authors say that:
"The HB4a and C5.2 Y-shaped adapters were formed by primers A and B and primers C and D, respectively (Primer A: 5′-GATCTCCCGAGTGGTCACCTGCTC-3′; Primer B: 5′-CTAGCAGCTACCACTCGGGA-3′; Primer C: 5′-GATCCCCTGAGTGGTCACCTGCTC-3′, and Primer D: 5′-CTAGCAGCTACCACTCAGGG-3′)"

So I did two runs of demultiplexing:

Run1:
Code:
$ cat barcode1.txt
HB4a_1	GATCTCCCGAGTGGTCACCTGCTC
C52_1	GATCCCCTGAGTGGTCACCTGCTC

$ cat ../HER2_Dataset_Download/SRR342054_1.fastq | fastx_barcode_splitter.pl --bcfile barcode1.txt --bol --mismatches 2 --prefix SRR342054_1_ --suffix ".fastq" 

Barcode	Count	Location
C52_1	0	SRR342054_1_C52_1.fastq
HB4a_1	0	SRR342054_1_HB4a_1.fastq
unmatched	802214	SRR342054_1_unmatched.fastq
Run2:
Code:
$ cat barcode2.txt
HB4a_2	CTAGCAGCTACCACTCGGGA
C52_2	CTAGCAGCTACCACTCAGGG

$ cat SRR342054_1_unmatched.fastq | fastx_barcode_splitter.pl --bcfile barcode2.txt --bol --mismatches 2 --prefix SRR342054_2_ --suffix ".fastq"

Barcode	Count	Location
C52_2	214558	SRR342054_2_C52_2.fastq
HB4a_2	165481	SRR342054_2_HB4a_2.fastq
unmatched	422175	SRR342054_2_unmatched.fastq
total	802214
In the first run, no reads match either of those two sequences.
In the second run, many reads remain unmatched (more than 50% of the reads do not match any barcode).
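One way to check what the reads actually start with, before guessing at barcodes, is to tabulate the most frequent read prefixes. The sketch below assumes a plain 4-line-per-record FASTQ (no wrapped sequence lines); `top_prefixes` is a hypothetical helper name, not part of the FASTX-Toolkit.

```shell
# top_prefixes FILE [LEN] [N]
# Print the N most common LEN-base prefixes of the reads in FILE.
# Assumes 4-line FASTQ records, so the sequence is line 2 of every 4.
top_prefixes() {
    fastq=$1
    len=${2:-24}
    n=${3:-10}
    awk -v len="$len" 'NR % 4 == 2 { print substr($0, 1, len) }' "$fastq" \
        | sort | uniq -c | sort -rn | head -n "$n"
}

# Example (path taken from the commands above):
# top_prefixes ../HER2_Dataset_Download/SRR342054_1.fastq 24 10
```

If the dominant prefixes look like a primer in the opposite orientation, that would suggest trying the reverse complement as the barcode.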

Am I doing something wrong?
Does anyone have experience with this type of problem?

Thanks!
07-24-2014, 04:44 AM   #2
inesdesantiago
Member
Location: LONDON, UNITED KINGDOM
Join Date: Jan 2009
Posts: 44

Actually, using the reverse complement of the first barcode worked.
I also decided to allow only 1 mismatch now.
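For reference, the reverse complements can be generated on the command line with the standard `rev` and `tr` utilities. This is only a sketch: `revcomp` is a hypothetical helper name, and the two sequences are Primer A and Primer C as quoted from the paper above.

```shell
# Reverse-complement a DNA sequence:
# reverse the string, then swap A<->T and C<->G (both cases).
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}

# Primers A and C from the paper (PMID:21731642)
revcomp GATCTCCCGAGTGGTCACCTGCTC   # HB4a, Primer A
revcomp GATCCCCTGAGTGGTCACCTGCTC   # C5.2, Primer C
```

The two printed sequences can then be used as the barcode column of a file like barcode1.txt.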

Barcode 1 output
Barcode Count Location
C52_1 166119 SRR342054_1rev_C52_1.fastq
HB4a_1 134556 SRR342054_1rev_HB4a_1.fastq
unmatched 501539 SRR342054_1rev_unmatched.fastq
total 802214

Barcode 2 output
Barcode Count Location
C52_2 213282 SRR342054_2rev_C52_2.fastq
HB4a_2 135893 SRR342054_2rev_HB4a_2.fastq
unmatched 152364 SRR342054_2rev_unmatched.fastq
total 501539
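To put the counts in perspective, the unmatched fraction can be computed from the numbers reported in this thread. A small sketch with `awk`; `pct` is a hypothetical helper name and the inputs are copied from the tables above.

```shell
# pct PART TOTAL: print PART as a percentage of TOTAL, one decimal place.
pct() {
    awk -v part="$1" -v total="$2" 'BEGIN { printf "%.1f\n", 100 * part / total }'
}

pct 422175 802214   # unmatched after Run2 in post #1
pct 152364 802214   # unmatched after both passes in post #2
```

With these inputs the unmatched share drops from roughly 53% to roughly 19% of the 802214 reads.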

Last edited by inesdesantiago; 07-24-2014 at 04:55 AM.
Tags
454, barcode design, demultiplex, demultiplexing