SEQanswers

Old 11-23-2016, 06:17 AM   #1
uloeber
Member
 
Location: Germany

Join Date: Mar 2013
Posts: 40
Demultiplexing PacBio sequences

Hi all,
I have the following problem:
I have a dataset containing circularized and linear DNA. It was amplified with two pairs of primers using inverse PCR. I'd like to split the reads according to the following scheme:
Primer1a----------Primer1b
Primer2a----------Primer2b
Primer1a---------- or ----------Primer1a
Primer1b---------- or ----------Primer1b
Primer2a---------- or ----------Primer2a
Primer2b---------- or ----------Primer2b

I tried to use cutadapt for this, but for the paired primers it seems that I only get the reads with, e.g., 1a at the 5' end and 1b at the 3' end, and I have to specify the reverse orientation separately.
cutadapt -g ^P1 -a P2\$ --trimmed-only -e 0.05 --no-trim >out.fasta in.fasta
I need an approach that lets me specify an error rate and restrict the search for my 20 bp primer to, for example, the first and last 40 bp of each read.
Do you have any suggestions?
Thanks in advance!
Old 11-23-2016, 12:54 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

You can use BBDuk to look for this kind of thing:

bbduk.sh in=reads.fq out=unmatched.fq outm=matched.fq ref=primer1a.fa k=20 restrictleft=40 edist=3 -da

That will allow an edit distance of 3 and only look for matching kmers in the first 40 bp of the reads. You can subsequently trim the reads in another pass:

bbduk.sh in=matched.fq out=trimmed.fq ref=primer1a.fa k=20 restrictleft=40 edist=3 ktrim=l -da

For the adapters on the right end, you'd use "ktrim=r" and "restrictright=40".
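To make "edit distance of 3 within the first 40 bp" concrete, here is a minimal Python sketch of a windowed approximate search. It is not BBDuk's algorithm (BBDuk matches k-mers rather than aligning); it is a plain semi-global edit-distance check, and the function names, the 40 bp window, and the threshold of 3 are assumptions chosen only to mirror the parameters above.

```python
def min_edit_distance_in_window(primer: str, window: str) -> int:
    """Smallest edit distance between primer and any substring of window."""
    # Row 0 is all zeros: the match may start anywhere in the window for free.
    prev = [0] * (len(window) + 1)
    for i, p in enumerate(primer, start=1):
        # Column 0: primer[:i] against an empty substring costs i deletions.
        curr = [i] + [0] * len(window)
        for j, w in enumerate(window, start=1):
            cost = 0 if p == w else 1
            curr[j] = min(prev[j - 1] + cost,  # match or substitution
                          prev[j] + 1,         # primer base missing from read
                          curr[j - 1] + 1)     # extra base in read
        prev = curr
    # The match may also end anywhere in the window for free.
    return min(prev)


def primer_in_left(read: str, primer: str, window: int = 40, max_edits: int = 3) -> bool:
    """Check only the first `window` bases of the read (hypothetical helper)."""
    return min_edit_distance_in_window(primer, read[:window]) <= max_edits


def primer_in_right(read: str, primer: str, window: int = 40, max_edits: int = 3) -> bool:
    """Check only the last `window` bases of the read (hypothetical helper)."""
    return min_edit_distance_in_window(primer, read[-window:]) <= max_edits
```

The read itself is never modified, so this corresponds to filtering without trimming. A primer on the right end will usually appear as its reverse complement, which this sketch does not handle.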

Last edited by Brian Bushnell; 11-23-2016 at 01:03 PM.
Old 11-23-2016, 01:03 PM   #3
uloeber
Member
 
Location: Germany

Join Date: Mar 2013
Posts: 40

Hi Brian,
Thanks for your reply! I forgot to mention that I want to keep the reads untrimmed. Also, would your software handle the "both primers present" case? Otherwise I would again have to check every combination separately for every sample, which is what I am trying to avoid. But again, thank you very much! I appreciate your help. BBNorm is one of my favorites.
Best wishes,
Ulrike
Old 11-23-2016, 01:09 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Hi Ulrike,

The first command will not do any trimming, only filtering. And no - unfortunately, BBDuk will not handle both ends at once, so you would need 2 passes. However, if you have a lot of different primers, you can put them all in a file and demultiplex with Seal like this:

seal.sh in=reads.fq pattern=out_%.fq ref=primers.fa restrictleft=40 k=20 edist=3 -da

That would produce one output file per primer sequence. Might save some time.
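For the "both primers present" case, the two end checks can be combined per read so that every combination is labeled in a single pass. The following is a rough Python sketch of that idea, not something Seal or BBDuk does; the primer sequences are made up, all function names are hypothetical, and matching is simplified to substitutions only (no indels). With a 20 bp primer, one mismatch roughly corresponds to the 5% error rate from the original cutadapt call.

```python
from typing import Dict, Optional, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    return seq.translate(_COMPLEMENT)[::-1]


def has_hit(window: str, primer: str, max_mismatches: int) -> bool:
    """True if primer or its reverse complement occurs in window
    with at most max_mismatches substitutions (indels are ignored)."""
    for query in (primer, reverse_complement(primer)):
        for start in range(len(window) - len(query) + 1):
            segment = window[start:start + len(query)]
            if sum(a != b for a, b in zip(query, segment)) <= max_mismatches:
                return True
    return False


def first_hit(window: str, primers: Dict[str, str], max_mismatches: int) -> Optional[str]:
    """Name of the first primer found in window, or None."""
    for name, seq in primers.items():
        if has_hit(window, seq, max_mismatches):
            return name
    return None


def classify(read: str, primers: Dict[str, str],
             window: int = 40, max_mismatches: int = 1) -> Tuple[Optional[str], Optional[str]]:
    """Return (primer near the 5' end, primer near the 3' end); the read is left untrimmed."""
    left = first_hit(read[:window], primers, max_mismatches)
    right = first_hit(read[-window:], primers, max_mismatches)
    return left, right


# Made-up primer sequences, for illustration only.
example_primers = {
    "P1a": "ACGTTGCAAGGCTTAACCGT",
    "P1b": "TTGACCGGATATCCGGAACT",
}
```

The returned pair can serve as a bin label, for example `("P1a", "P1b")` for reads with both primers of pair 1 and `("P1a", None)` for reads carrying only Primer1a at the 5' end, so each read is written untrimmed to the bin for its combination.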
Tags
cutadapt, demultiplexing, pacbio, pcr, primer

All times are GMT -8.