SEQanswers

Go Back   SEQanswers > Bioinformatics > Bioinformatics



Similar Threads
Thread Thread Starter Forum Replies Last Post
Error during bowtie2-align run , (ERR): bowtie2-align exited with value 139 harijay Bioinformatics 2 02-08-2017 01:24 AM
bowtie2 align out of memory cmccabe Bioinformatics 2 08-14-2015 06:53 AM
bowtie2-align dided with the signal 11 (SEGV) (core dumped) with bowtie2 harijay Bioinformatics 0 07-28-2015 01:49 AM
Expected bowtie2 to be in same directory with bowtie2-align nimrod337 Bioinformatics 4 04-11-2015 01:54 PM
Using Bowtie2 for to align paired end reads sewellh Bioinformatics 5 06-05-2014 10:50 AM

Reply
 
Thread Tools
Old 09-10-2015, 07:13 AM   #1
danova
Member
 
Location: France

Join Date: Sep 2010
Posts: 27
Default Bowtie2 - align two PCR primers help

Hi,
Not sure its the correct strategy by im trying to achieve the following:

1- I have two primers (see below) that map to 16s amplicons and they generate an insert of 1200 to 1400 nt long.

The two primers are described below.
>forward_primer
CCTACGGGNGGCWGCAG
>reverse_primer
GGATTAGATACCCBDGTAGTC

When i try to use bowtie2 to align the two primers to a reference database of different 16S sequences they are supposed to match i only get 600 hits (expected 400000 hits), meaning this primer set will align perfectly to 400000 different sequences of the database.

bowtie2 -f -t -a --no-dovetail --no-contain --no-overlap --no-mixed --fr --minins 800 --maxins 1500 -x 16S_database -1 forward_primer -2 reverse_primer


I have checked the orientation of the primers , -fr is the correct orientation.

It looks too me that bowtie2 handles degenerated primers differently then what i expected. Does anybody know how to improve the alignement of the degenerated primers (-N 1 option does not really improve). ANy idea ???
danova is offline   Reply With Quote
Old 09-10-2015, 09:34 AM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

This sounds like a good place to use BBDuk, depending on what exactly you're trying to do.

bbduk.sh in=16S.fasta out=matching.fasta literal=CCTACGGGNGGCWGCAG k=17 mm=f copyundefined=t

That will give you all sequences that contain the literal, after first making copies of the literal to represent every fully-specified (ACGT) version of the degenerate bases. You can then run it a second time with the other primer and a different value of k (which should equal the primer length).
Brian Bushnell is offline   Reply With Quote
Old 09-10-2015, 11:50 AM   #3
danova
Member
 
Location: France

Join Date: Sep 2010
Posts: 27
Default

Hi Brian,
At the end I want to see if my degenrated primers (or the sum of the oligos as you suggest) match the same 16s. That will tell me the coverage of my primers in my 16s database.

Indeed that looks like a valid option. As you suggest i can align both separately and then combine primers that match the same 16s , check the distance between the primers to satisfy the 800 to 1500 nt in between. Then i guess extract the aligned sequences (start -end positions of the alignement).

Ill give it a try,
thanks,
david

Last edited by danova; 09-10-2015 at 11:54 AM.
danova is offline   Reply With Quote
Old 09-10-2015, 12:21 PM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

If the distances are important, I have another program that's slightly different, but may be useful. Usage:

Code:
msa.sh in=16S.fa out=mapped1.sam literal=CCTACGGGNGGCWGCAG

msa.sh in=16S.fa out=mapped2.sam literal=GGATTAGATACCCBDGTAGTC

cutprimers.sh in=16S.fa out=cut.fa sam1=mapped1.sam sam2=mapped2.sam
The sam files indicate the best alignment of the adapter sequence to every input sequence. This may not be optimal in your case since it allows indels (I designed it for PacBio reads) but in practice it really won't matter. The sam files may themselves be useful to you; and the output of cutprimers is the subsequences between the primers. msa.sh does not understand ambiguity codes (I think it converts them to N) so you may get better results if you manually list all permutations of the sequence as a comma-delimited list. I'll probably build in automatic permutation sometime in the future.

Also, it's probably best to only run this on the sequences after first filtering with BBDuk, because msa.sh is very aggressive and will output the most closely-matching site for each primer even if it's not a very good match (containing indels and substitutions, for example). Again, that's because I designed it to operate on PacBio 16S reads (to pull out the V4 region).
Brian Bushnell is offline   Reply With Quote
Old 09-10-2015, 12:30 PM   #5
danova
Member
 
Location: France

Join Date: Sep 2010
Posts: 27
Default

HI Brian,
Im also working with Pacbio so definetly it looks pretty much close to what im trying to achieve. Im currently also evaluating pacbio primers and having the sam output is really helpfull. I need to look in detail to this and will post back.
david
danova is offline   Reply With Quote
Old 09-11-2015, 11:41 AM   #6
danova
Member
 
Location: France

Join Date: Sep 2010
Posts: 27
Default

Brian,
That was really great and worked as expected, however i need more control over the alignements. How can you count the number of mismatches (limit to 3 mismatches, with no indels)

Have tried also bbduk but get only small number of aligned sequences (419). I expect at least 60% of my database to match this primer.
Executing jgi.BBDukF [in=ALL16s_GOLD_HOMD.fa, out=matching.fasta, literal=CCTACGGGNGGCWGCAG, k=17, mm=f, copyundefined=t]

BBDuk version 35.43
Initial:
Memory: max=29020m, free=28565m, used=455m

Added 8 kmers; time: 0.018 seconds.
Memory: max=29020m, free=28263m, used=757m

Input is being processed as unpaired
Started output streams: 0.012 seconds.
Processing time: 0.239 seconds.

Input: 12551 reads 19118812 bases.
Contaminants: 12132 reads (96.66%) 18494334 bases (96.73%)
Result: 419 reads (3.34%) 624478 bases (3.27%)

Time: 0.275 seconds.
Reads Processed: 12551 45.59k reads/sec
Bases Processed: 19118k 69.45m bases/sec

Last edited by danova; 09-11-2015 at 12:43 PM.
danova is offline   Reply With Quote
Reply

Tags
bowtie2

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is Off




All times are GMT -8. The time now is 03:07 PM.


Powered by vBulletin® Version 3.8.9
Copyright ©2000 - 2019, vBulletin Solutions, Inc.
Single Sign On provided by vBSSO