SEQanswers

Old 06-13-2016, 02:42 PM   #1
KBlackwell
Junior Member
 
Location: USA

Join Date: Jun 2016
Posts: 1
Default Hypervariable region(s) of choice for Illumina MiSeq

I am having trouble determining why certain hypervariable regions are targeted. The papers I have reviewed all seem to say it depends on your study, such as whether you're interested in the rare biosphere or specific types of bacteria or archaea.

However, I am not sure what should be used for environmental samples from the deep ocean, which is what I am looking at in my project. I will also be using Illumina MiSeq. Some papers I have looked at use V1-V2 and others use V6 or V6-V8, but none of them gives a reason.

Any advice would be greatly appreciated!
Old 06-13-2016, 05:50 PM   #2
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

I think the region targeting has more to do with available read lengths than anything else. Personally... I think it's best to use the region that has the most comprehensive databases, to enhance your ability to classify whatever you are sequencing while also maximizing your contributions. Another factor to consider is the conservation of primer sites; the more highly-conserved, the more likely your data will have some kind of relationship with the actual abundance. If you are interested in archaea, be sure to design your primers appropriately...
Old 06-14-2016, 06:37 AM   #3
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 234
Default

Agreed, length is the biggest consideration. This is why v6 was chosen first: it's the shortest. Then JGI chose v6-9 as their standard for 454 (though they sequenced from the 3' end, so you were rarely able to keep v6). I liked v1-3 for 454 soil communities (it maximized the diversity recovered), but that's too long for current Illumina, so now I use v4 like everyone else. Unless you are targeting a specific group of organisms, I'd suggest sticking with v4.
Old 06-14-2016, 03:15 PM   #4
nucacidhunter
Jafar Jabbari
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,189
Default

I agree with the above comments. The following would affect the taxa or species identified:
1- sampling, storage, handling and DNA extraction
2- targeted hypervariable region
3- library prep method
4- database used for identification

It is important to note that in each case one gets only one view of the community composition and other views are possible as well.

I have not seen any publication trying all variable regions and then choosing one for the bulk of their material.
Old 06-15-2016, 09:46 AM   #5
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 234
Default

Quote:
Originally Posted by nucacidhunter View Post
I have not seen any publication trying all variable regions and then choosing one for the bulk of their material.
I saw posters of JGI's efforts on that front back in the day, probably 2009ish. Not sure if they ever published it.
Old 12-05-2016, 04:09 AM   #6
strawbaubz
Junior Member
 
Location: South Africa

Join Date: Apr 2015
Posts: 5
Default

Hi guys, sorry for the random post, but I'm not sure where else I could ask this question, which seems a little stupid, but here goes.

I sequenced the V1V3 and V3V4 regions. I looked at the quality profile of my sequences and found that my V1V3 reads had much poorer per-base quality than V3V4.

I used the Illumina MiSeq 2x300 bp kit to sequence my reads. I am trying to come up with a reason for this but can't seem to grasp the concept.

Is it possible that the sequences within the V3V4 can trigger more errors and thus poorer quality profiles? HELP??

I apologize if this is a stupid question.
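For anyone wanting to reproduce the kind of per-base quality comparison described above, here is a minimal sketch. It assumes the quality strings have already been read out of the FASTQ files (every fourth line) and that they use the Phred+33 encoding; the function name `mean_quality_per_position` is made up for illustration and is not from any particular tool.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Mean Phred score at each read position (cycle).

    quality_strings: iterable of FASTQ quality lines, e.g. "IIIIHHG..."
    offset: ASCII offset of the encoding (33 is an assumption here).
    """
    quality_strings = list(quality_strings)
    if not quality_strings:
        return []
    max_len = max(len(q) for q in quality_strings)
    sums = [0] * max_len
    counts = [0] * max_len
    for q in quality_strings:
        for i, ch in enumerate(q):
            sums[i] += ord(ch) - offset  # decode one character to a Q score
            counts[i] += 1               # reads can differ in length after trimming
    return [s / c for s, c in zip(sums, counts)]


if __name__ == "__main__":
    # Two toy reads: 'I' decodes to Q40 and '5' to Q20 under Phred+33.
    print(mean_quality_per_position(["II5", "I5"]))
```

Running this separately on the forward and reverse reads of each amplicon gives curves that can be compared position by position, which keeps the comparison about the raw reads rather than the merged sequences.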
Old 12-05-2016, 07:48 AM   #7
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 234
Default

Those products are way too long; you should be aiming for near-complete overlap. Additionally, the error profiles for the v3 reagent kits (2x300) are worse than those for the v2 kits (2x250).
__________________
Microbial ecologist, running a sequencing core. I have lots of strong opinions on how to survey communities, pretty sure some are even correct.
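The overlap point above comes down to simple arithmetic: the two mates of a pair together cover 2 x read length, so whatever exceeds the amplicon length is sequenced twice. A rough sketch follows; the amplicon lengths are illustrative assumptions (actual lengths depend on the primer pair and the organism), and `expected_overlap` is a hypothetical helper name.

```python
def expected_overlap(amplicon_len, read_len):
    """Bases covered by both mates of a read pair.

    A negative value means there is a gap between the mates, so the pair
    cannot be merged. The overlap can never exceed the amplicon itself.
    """
    return min(amplicon_len, 2 * read_len - amplicon_len)


if __name__ == "__main__":
    # Assumed, approximate amplicon lengths; not measured from this thread's data.
    amplicons = {"V4": 290, "V3-V4": 460, "V1-V3": 520}
    for name, length in amplicons.items():
        for read_len in (250, 300):
            overlap = expected_overlap(length, read_len)
            print(f"{name} ({length} bp), 2x{read_len}: overlap {overlap} bp")
```

Under these assumed lengths a 460 bp product on 2x300 overlaps by only 140 bp and a 520 bp product by 80 bp, and the overlapping bases are the low-quality 3' ends of both reads, whereas a product shorter than one read is covered twice along its whole length.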
Old 12-05-2016, 12:37 PM   #8
strawbaubz
Junior Member
 
Location: South Africa

Join Date: Apr 2015
Posts: 5
Default

Thanks for the reply.

I meant to say my V1V3 is worse than V3V4. You mention that the error profiles are worse for V3 kits than V2. I see the opposite?

I read a paper that said some sequence patterns can trigger more errors than others. Could this be happening to my data? I am writing up results and am not sure whether I have to try to explain why my V1V3 results did worse.

I know that the overlap was small, but I am talking about the reads, not the merged sequences.
Old 12-05-2016, 12:45 PM   #9
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

If you have very high homogeneity for a while at the beginning of the read, it will reduce the quality there. The effects can carry over to some extent for the remainder of the read, even when you get into variable regions. This effect can be reduced with staggered adapters. It's also possible that this is sample-specific, if you are running different organisms, or amplification-specific, if different organisms are amplifying well.
Old 12-05-2016, 12:53 PM   #10
nucacidhunter
Jafar Jabbari
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,189
Default

Quote:
Originally Posted by strawbaubz View Post
Hi guys, sorry for the random post, but I'm not sure where else I could ask this question, which seems a little stupid, but here goes.

I sequenced the V1V3 and V3V4 regions. I looked at the quality profile of my sequences and found that my V1V3 reads had much poorer per-base quality than V3V4.

I used the Illumina MiSeq 2x300 bp kit to sequence my reads. I am trying to come up with a reason for this but can't seem to grasp the concept.

Is it possible that the sequences within the V3V4 can trigger more errors and thus poorer quality profiles? HELP??

I apologize if this is a stupid question.
Possible reasons:

1- batch to batch variation of sequencing kits
2- higher cluster density of V1-V3 run
3- differences in PhiX spike in
4- differences in sequence diversity of regions in your samples; lower sequence variation along reads will result in lower quality

I do not think sequence error will reduce Q score. A base with high Q score can be erroneous.
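To make the last point concrete: a Phred score is only an estimated error probability, Q = -10 * log10(p), so even a high-Q base is wrong some fraction of the time. A small sketch of the conversion (the function names are illustrative, not from any library):

```python
def phred_to_error_prob(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


def expected_errors(quals):
    """Sum of per-base error probabilities, i.e. expected miscalls in one read."""
    return sum(phred_to_error_prob(q) for q in quals)


if __name__ == "__main__":
    print(phred_to_error_prob(30))        # about 1 error in 1,000 calls
    print(expected_errors([20] * 300))    # about 3 expected errors in a 300-base read at Q20
```

The Q score is assigned by the instrument from signal properties, so it predicts errors rather than being lowered by them after the fact.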

Last edited by nucacidhunter; 12-05-2016 at 12:59 PM.
Old 12-06-2016, 06:35 AM   #11
thermophile
Senior Member
 
Location: CT

Join Date: Apr 2015
Posts: 234
Default

When you look at the individual reads, are the Q scores worse for v1v3 vs v3v4, or is it the merged reads that are worse? v1v3 is longer, so you have less overlap; therefore I'd expect poorer quality in the merged reads.

Base homogeneity across bacteria is greater in the beginning of v1 than v3, so I don't think that's your answer.

Last edited by thermophile; 12-06-2016 at 07:09 AM. Reason: clarifying
Old 12-06-2016, 12:50 PM   #12
nucacidhunter
Jafar Jabbari
 
Location: Melbourne

Join Date: Jan 2013
Posts: 1,189
Default

Quote:
Originally Posted by thermophile View Post

Base homogeneity across bacteria is greater in the beginning of v1 than v3 so i don't think that's your answer.
Low sequence diversity at the start of the read will affect the number of reads passing filter, with little or no impact on the quality of bases past the low-diversity region. However, a low-diversity region in, for instance, the middle of the read will reduce the Q scores for those bases.

Sequence heterogeneity of the target region can be very low for some samples if they are composed mainly of bacteria from one particular phylum, even though that region might be very heterogeneous when considered globally among all bacteria.
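One way to check where low-diversity stretches fall in a particular sample is to measure base composition cycle by cycle. The sketch below uses Shannon entropy per position as the diversity measure, which is a choice made here for illustration and not something prescribed in this thread; it assumes the reads are already loaded as plain strings, and `per_cycle_entropy` is a hypothetical name.

```python
from collections import Counter
import math


def per_cycle_entropy(reads):
    """Shannon entropy (bits) of the base composition at each read position.

    0.0 means every read has the same base at that cycle (no diversity);
    2.0 means A, C, G and T are equally frequent. Only positions present
    in every read are scored.
    """
    reads = list(reads)
    if not reads:
        return []
    n_cycles = min(len(r) for r in reads)
    entropies = []
    for i in range(n_cycles):
        counts = Counter(r[i] for r in reads)
        total = sum(counts.values())
        # sum of p * log2(1/p) over the bases seen at this position
        h = sum((c / total) * math.log2(total / c) for c in counts.values())
        entropies.append(h)
    return entropies


if __name__ == "__main__":
    toy_reads = ["ACGT", "AGGT", "ATGT", "AAGT"]
    print(per_cycle_entropy(toy_reads))
```

Plotting this next to the mean Q score per cycle for the V1V3 and V3V4 runs would show whether dips in quality line up with low-diversity positions in the actual samples, rather than relying on how variable the region is across all bacteria.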
Tags
hypervariable regions, sequencing

All times are GMT -8.