SEQanswers

Old 06-30-2015, 10:46 AM   #1
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182
Trouble: getting reads with Reverse primer

Hi SeqA team,

I've hit another roadblock.
Data: 16S V3-V4 region, demultiplexed. Illumina, paired-end, 5% PhiX.

I have the forward and reverse primer sequences:
F: CCTACGGGDGGCWGCA
R: GGACTACHVGGGTMTCTAATC

When I grep R1 for the forward primer, covering every combination of the degenerate bases, I get between 1k and 12k reads.

However, grepping R2 for the reverse primer, again with every degenerate combination, returns fewer than 10 sequences.
I tried anchoring the pattern to the start (^) and to the end ($) of the read; neither gives me reads with the reverse primer.
I also tried the reverse complement of the reverse primer, with no results.
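
For anyone repeating the grep step: a degenerate primer can be expanded into one regular expression rather than grepping each combination separately. Below is a minimal Python sketch, assuming plain uppercase read sequences. The helper names `primer_to_regex` and `count_primer_hits` and the `max_offset` tolerance are illustrative assumptions, not part of any tool mentioned in this thread.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def primer_to_regex(primer):
    """Translate a degenerate primer into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def count_primer_hits(sequences, primer, max_offset=3):
    """Count sequences whose first primer match starts within max_offset bases."""
    pattern = primer_to_regex(primer)
    hits = 0
    for seq in sequences:
        match = pattern.search(seq)
        if match and match.start() <= max_offset:
            hits += 1
    return hits


# The forward primer from this post expands to a single pattern.
print(primer_to_regex("CCTACGGGDGGCWGCA").pattern)
# CCTACGGG[AGT]GGC[AT]GCA
```

The printed pattern can also be passed to `grep -E`. Note that a search for the full primer misses any read that starts one or more bases into the primer.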

1)
Shouldn't the R2 sequences contain the reverse primer?

2)
I went ahead and assembled the reads in Mothur, supplying both the forward and reverse primers. I didn't get any output.
When I removed the reverse primer, I got an output file of decent size.


The 16S sequencing was done as the protocol
Old 06-30-2015, 10:50 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

Have you tried searching for the reverse complement of R (i.e. GATTAGAKACCCBDGTAGTCC) in either read?
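
The reverse complement of a degenerate primer has to complement the ambiguity codes as well (H becomes D, V becomes B, M becomes K). A minimal Python sketch of that step follows; the function name `reverse_complement` is just an illustrative choice.

```python
# Complements of the IUPAC nucleotide codes, including degenerate symbols:
# R<->Y, K<->M, B<->V, D<->H; S, W and N are their own complements.
COMPLEMENT = str.maketrans(
    "ACGTRYSWKMBDHVN",
    "TGCAYRSWMKVHDBN",
)


def reverse_complement(seq):
    """Return the reverse complement of a sequence written in IUPAC codes."""
    return seq.upper().translate(COMPLEMENT)[::-1]


print(reverse_complement("GGACTACHVGGGTMTCTAATC"))
# GATTAGAKACCCBDGTAGTCC
```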
Old 06-30-2015, 10:55 AM   #3
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Hi genomax,
Thanks for your reply.

I tried the degenerate combinations of the reverse complement (GGACTACHVGGGTMTCTAATC).
No output: 0 hits.

I double-checked with the one you shared. No output: 0 hits.

I also double-checked with the sequencing center; they said they used R as the reverse primer, and that no processing has been done apart from index removal (during demultiplexing).

Weird.
Old 06-30-2015, 10:58 AM   #4
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

By the way, BBDuk processes degenerate bases (with the copyundefined flag), and looks for both the forward and reverse-complement... you might find that to be a more robust alternative to grep.

bbduk.sh in=reads.fq outm=matching.fq literal=CCTACGGGDGGCWGCA k=16 mm=f copyundefined=t

Getting reads that contain both the F and R sequences would require two sequential passes.
Old 06-30-2015, 11:05 AM   #5
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

@Brian @bio_informatics could also use "kmercountexact.sh" to see what oligo is overrepresented? In case the sequence of R is incorrect?
Old 06-30-2015, 11:06 AM   #6
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Hi Brian,
Thanks for adding information about bbduk. I shall try with that, too.

But the question still haunts me: why are there no reads with the reverse primer either way, forward or reverse complement?
Old 06-30-2015, 11:13 AM   #7
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Hi genomax,

Thanks for the idea.
I ran FastQC and looked at the overrepresented sequences for the reverse read.

Among many sequences, I have:
Overrepresented:
GACTACTGGGGTATCTAATCCTGTTTGATCCCCACGCTTTCGCACATCAG

Rev primer:
GGACTACHVGGGTMTCTAATC

Clearly this does not match over the length of the reverse primer (even ignoring the degenerate bases).

Quote:
Originally Posted by GenoMax View Post
@Brian @bio_informatics could also use "kmercountexact.sh" to see what oligo is overrepresented? In case the sequence of R is incorrect?

Last edited by bio_informatics; 06-30-2015 at 11:18 AM.
Old 06-30-2015, 11:16 AM   #8
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
Originally Posted by GenoMax View Post
@Brian @bio_informatics could also use "kmercountexact.sh" to see what oligo is overrepresented? In case the sequence of R is incorrect?
That's true, but since many parts of the V3-V4 region are so highly conserved... and, well, with amplicon sequencing in general, it might be hard to get any useful signal unless you restrict the search to only the end of the read, by trimming everything other than the first 22bp before the analysis, then reverse-complement the reads and repeat to get the other end.

You should actually be able to just do this visually - the primer sequence in question should be the first or last 22bp. What do you see there?
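
If eyeballing is not practical, the same check can be scripted by tallying the most common read prefixes. This is a rough Python sketch, assuming an uncompressed four-line-per-record FASTQ file. The function names, the default prefix length and the file name in the usage note are illustrative assumptions.

```python
from collections import Counter
from itertools import islice


def read_sequences(fastq_lines):
    """Yield the sequence line (2nd line of every 4-line record) from FASTQ text."""
    for line in islice(fastq_lines, 1, None, 4):
        yield line.strip()


def top_prefixes(sequences, length=22, n=5):
    """Return the n most common read prefixes of the given length with their counts."""
    counts = Counter(seq[:length] for seq in sequences if len(seq) >= length)
    return counts.most_common(n)
```

Usage would look like `with open("R2.fastq") as fh: print(top_prefixes(read_sequences(fh)))`, where `R2.fastq` is a hypothetical file name. Reverse-complementing the reads first and repeating gives the other end.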
Old 06-30-2015, 11:22 AM   #9
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Quote:
Originally Posted by Brian Bushnell View Post
You should actually be able to just do this visually - the primer sequence in question should be the first or last 21bp. What do you see there?
Thanks for your reply.
I pulled out a few sequences and trimmed them to 21 bp. Below are three of them after trimming:

GACTACTCGGGTCTCTAATCC
GACTACTTGGGTATCTAATCC
GACTACAAGGGTCTCTAATCC

Quote:
GGACTACHVGGGTMTCTAATC
My reverse primer matches these terribly.

Last edited by bio_informatics; 06-30-2015 at 12:04 PM.
Old 06-30-2015, 11:33 AM   #10
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

Hopefully the right region has been amplified in that dataset.
Old 06-30-2015, 11:36 AM   #11
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Quote:
Originally Posted by GenoMax View Post
Hopefully the right region has been amplified in that dataset.
Amen! I hope so too.
The primer is clearly different from the sequences I'm able to see.
Old 06-30-2015, 11:44 AM   #12
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,077

Someone must have used the wrong primer. There's not much you can do but report it to the experimental folks.
Old 06-30-2015, 11:46 AM   #13
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Actually, that's a perfect match:

Code:
 GACTACAAGGGTCTCTAATCC
GGACTACHVGGGTMTCTAATC
They only differ at the degenerate symbols, and the degenerate symbols are:

H: A or C or T
V: A or C or G
M: A or C

...which include the bases in question. For some reason it's offset by one base though. But that's probably not an issue with the primer, just where reading starts.
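
The offset comparison can be checked mechanically. Below is a minimal Python sketch, assuming the read starts at the second base of the primer (offset 1). The function name `mismatches` and its `primer_offset` parameter are illustrative. The three reads are the 21 bp prefixes posted in #9.

```python
# IUPAC nucleotide codes mapped to the set of bases each one allows.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def mismatches(read, primer, primer_offset):
    """Return 0-based primer positions where the read disagrees with the primer.

    The read is assumed to begin at primer[primer_offset], i.e. the first
    primer_offset bases of the primer are absent from the read.
    """
    tail = primer[primer_offset:]
    return [
        i + primer_offset
        for i, (base, code) in enumerate(zip(read, tail))
        if base not in IUPAC[code]
    ]


primer = "GGACTACHVGGGTMTCTAATC"
reads = [
    "GACTACTCGGGTCTCTAATCC",
    "GACTACTTGGGTATCTAATCC",
    "GACTACAAGGGTCTCTAATCC",
]
for read in reads:
    print(read, mismatches(read, primer, primer_offset=1))
# GACTACTCGGGTCTCTAATCC []
# GACTACTTGGGTATCTAATCC [8]
# GACTACAAGGGTCTCTAATCC []
```

Under strict IUPAC matching, the first and third reads agree with the primer at every position. The second read has a T at the V position (primer index 8), which V (A, C or G) does not cover, so it carries a single mismatch there.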
Old 11-10-2015, 10:51 AM   #14
bio_informatics
Senior Member
 
Location: USA

Join Date: Nov 2013
Posts: 182

Quote:
Originally Posted by Brian Bushnell View Post
Actually, that's a perfect match:

Code:
 GACTACAAGGGTCTCTAATCC
GGACTACHVGGGTMTCTAATC
They only differ at the degenerate symbols, and the degenerate symbols are:

H: A or C or T
V: A or C or G
M: A or C

...which include the bases in question. For some reason it's offset by one base though. But that's probably not an issue with the primer, just where reading starts.
Thanks. This is what I have used for all the data I received. All files have their primers trimmed.

Tags
16s illumina analysis, mothur
