SEQanswers

SEQanswers (http://seqanswers.com/forums/index.php)
-   Illumina/Solexa (http://seqanswers.com/forums/forumdisplay.php?f=6)
-   -   Sequencing a Low diversity library on the HiSeq (http://seqanswers.com/forums/showthread.php?t=15606)

Simcom 11-17-2011 11:23 AM

Sequencing a Low diversity library on the HiSeq
 
I am preparing a custom multiplexed library that will fall into the "low diversity" category. By low diversity I mean that the first 5 nucleotides of read 1 will be identical among all clusters. There is a well-known and well-documented problem with cluster identification for low diversity libraries (outlined here: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3030592/ ).

The above paper and many of the comments on these forums refer specifically to the GAII, and suggest that spiking the library with 40-50% phiX control resolves the cluster calling issue.

Now that the GAIIx has been all but phased out, I need to run my low diversity library on the HiSeq. The problem is that I don't know of anyone who has successfully run a low diversity library on a HiSeq, and my core informed me today that they have tried several times to run low diversity libraries but got awful results on the HiSeq, even after spiking with 50% phiX.

My question is, has anyone had success running a low diversity library on the HiSeq? If so, how did you manage to get it to work? Because my study does not require a massive number of reads, I am considering spiking my sample with up to 90-95% gDNA, hopefully drastically increasing the diversity and resolving cluster identification problems. Does anyone have experience running low diversity libraries on the HiSeq that could give me some advice?

Thanks so much!

HESmith 11-17-2011 11:48 AM

Is there a reason not to use a custom sequencing primer?

Simcom 11-17-2011 12:19 PM

I am sequencing viral DNA/gDNA integration junctions, so I am using those 5 nucleotides of viral DNA as a type of 'verification' that a read does indeed contain the junction between viral DNA and gDNA, essentially showing that the sequencing primer is not mis-hybridizing to a similar (non-viral) sequence elsewhere in the genome. We are using a custom sequencing primer, but we prefer that it not hybridize up to the very edge of the viral DNA for the reason described above.
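As a rough illustration of the verification step described above, here is a minimal Python sketch that keeps only reads beginning with the expected viral sequence. The tag value `ACTGA`, the function name, and the `max_mismatches` option are all placeholders made up for this example, not the actual viral sequence or anyone's pipeline.

```python
# Hypothetical 5-nt viral end expected at the start of read 1 (placeholder value).
VIRAL_TAG = "ACTGA"


def split_by_tag(reads, tag=VIRAL_TAG, max_mismatches=0):
    """Separate reads that start with the viral tag from those that do not.

    Verified reads are returned with the tag trimmed off, leaving the
    flanking sequence that would be mapped to the genome.
    """
    verified, rejected = [], []
    for read in reads:
        prefix = read[:len(tag)].upper()
        # Count positions where the read prefix differs from the tag.
        mismatches = sum(a != b for a, b in zip(prefix, tag))
        if len(prefix) == len(tag) and mismatches <= max_mismatches:
            verified.append(read[len(tag):])
        else:
            rejected.append(read)
    return verified, rejected
```

Reads shorter than the tag are always rejected. Allowing one mismatch would tolerate a sequencing error in the tag, at the cost of a weaker verification.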

GenoMax 11-17-2011 12:29 PM

Can your core not run your samples by specifying a different lane (which is expected to have "normal" DNA) as the "control" lane for that run?

HESmith 11-17-2011 12:43 PM

Simcom: Judicious primer design coupled with the appropriate annealing temperature will virtually assure that the primer does not hybridize inappropriately.

GenoMax: the designation of a normal complexity sample as the control lane does not solve the problem (sadly). While it allows the signal thresholds to be set appropriately, it doesn't address the problem of cluster resolution in the low complexity lanes.

Simcom 11-17-2011 12:50 PM

Quote:

Originally Posted by GenoMax (Post 57169)
Can your core not run your samples by specifying a different lane (which is expected to have "normal" DNA) as the "control" lane for that run?

I think you are misunderstanding the problem. The issue is, because the first 5nt of read #1 are going to be all identical among clusters, and the machine uses these 5nt to call clusters, the machine has a hard time identifying/differentiating between different clusters (especially close overlapping clusters). So the reason for the gDNA is to add diversity to the sequence, allowing clusters to be called. Hence the reason it needs to be included in the same lane as the sample.
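To put a number on the lack of diversity in the early cycles, one could tabulate base composition per cycle for a set of reads. The Python sketch below is only an illustration: the function names are made up here, reads are assumed to be plain uppercase strings, and it says nothing about how the instrument software itself identifies clusters.

```python
from collections import Counter


def per_cycle_composition(reads, n_cycles=5):
    """Return, for each of the first n_cycles, the fraction of A, C, G and T."""
    composition = []
    for cycle in range(n_cycles):
        # Only reads long enough to have a base at this cycle are counted.
        counts = Counter(read[cycle] for read in reads if len(read) > cycle)
        total = sum(counts.values())
        if total:
            composition.append({base: counts[base] / total for base in "ACGT"})
        else:
            composition.append({})
    return composition


def dominant_fraction(composition):
    """Largest single-base fraction at each cycle (1.0 means no diversity)."""
    return [max(cycle.values()) if cycle else 0.0 for cycle in composition]
```

A library in which every read starts with the same 5 nt gives a dominant fraction of 1.0 at each of those cycles. Mixing in a diverse library pulls the values down toward 0.25.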

HESmith 11-17-2011 12:53 PM

There's an alternative approach, assuming that you have not yet constructed the libraries. Design them so the junction is at the opposite end of the insert, and perform paired-end sequencing. Cluster calling is based only on the first five cycles of read one, so you'll avoid the low-complexity issue.

Simcom 11-17-2011 12:59 PM

Quote:

Originally Posted by HESmith (Post 57171)
Simcom: Judicious primer design coupled with the appropriate annealing temperature will virtually assure that the primer does not hybridize inappropriately.

I agree with you that hybridization is unlikely, but if it does happen it will be indistinguishable from an actual integration. Among other things, we are interested in mapping low-abundance integrations, so if we aren't able to get the verification sequence on every read, we will likely need to go in and verify a subset of integrations manually, which may not be possible if an integration is present in only one cell for example.

Simcom 11-17-2011 01:04 PM

Quote:

Originally Posted by HESmith (Post 57176)
There's an alternative approach, assuming that you have not yet constructed the libraries. Design them so the junction is at the opposite end of the insert, and perform paired-end sequencing. Cluster calling is based only on the first five cycles of read one, so you'll avoid the low-complexity issue.

Yep, exactly. Sadly my boss insisted on having a library that we can do single read OR paired end on (a money saving move potentially), so I had to design the junction on the first read side. And the samples are just about finished being prepped :/

HESmith 11-17-2011 01:05 PM

If, as you say, you can tolerate discarding 90-95% of the reads, then spiking in a gDNA library at that level will definitely solve your problem. After all, adapter dimers are often present at 5-10% in many libraries (the same % as your desired samples), and sequencing them is not a problem!
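The trade-off is simple arithmetic, sketched below in Python. The lane yield used in the usage note is a purely hypothetical round number chosen for illustration, not an instrument specification, and the function name is made up for this example.

```python
def sample_reads(total_pf_reads, spike_percent):
    """Reads left for the sample of interest after a spike-in.

    total_pf_reads: pass-filter reads from the lane (hypothetical input).
    spike_percent:  whole-number percentage of the lane given to the
                    diverse spike-in library (gDNA or phiX).
    """
    if not 0 <= spike_percent <= 100:
        raise ValueError("spike_percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding surprises.
    return total_pf_reads * (100 - spike_percent) // 100
```

For example, with a hypothetical lane giving 100 million pass-filter reads, `sample_reads(100_000_000, 95)` leaves 5 million reads for the sample and `sample_reads(100_000_000, 90)` leaves 10 million. This assumes the sample and the spike-in cluster in proportion to the loaded amounts, which may not hold exactly in practice.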

Simcom 11-17-2011 01:11 PM

Quote:

Originally Posted by HESmith (Post 57181)
If, as you say, you can tolerate discarding 90-95% of the reads, then spiking in a gDNA library at that level will definitely solve your problem. After all, adapter dimers are often present at 5-10% in many libraries (the same % as your desired samples), and sequencing them is not a problem!

Thanks, this gives me confidence. Just to be sure though: do the adapter-adapter ligation reads come back in the data, or does the machine throw them out and not include them in sequencing results? If you actually get the adapter-adapter reads from the machine, I should be golden.

pmiguel 11-17-2011 01:30 PM

We got it to work on our HiScanSQ -- which uses the same chemistry as the HiSeq, but only scans the top of the flowcell. Not an identical situation, but we had some SMART cDNAs that we sheared and ligated TruSeq adapters on. So about 1/2 of them had the same 50 nt of SMART primer at the beginning. We mixed them 1:1 with a genomic DNA library. Cluster registration went fine.

--
Phillip

HESmith 11-17-2011 01:32 PM

Yes, adapter reads are present.

Just remember that you'll also need to include the standard sequencing primer for the gDNA library (or construct the gDNA library with custom adapters to match your custom primer).

Simcom 11-17-2011 01:52 PM

Quote:

Originally Posted by HESmith (Post 57186)
Yes, adapter reads are present.

Just remember that you'll also need to include the standard sequencing primer for the gDNA library (or construct the gDNA library with custom adapters to match your custom primer).

Yep, I planned on including both primers. Thank you so much for your help, I really appreciate it. :D

Simcom 11-17-2011 02:00 PM

Quote:

Originally Posted by pmiguel (Post 57185)
We got it to work on our HiScanSQ -- which uses the same chemistry as the HiSeq, but only scans the top of the flowcell. Not an identical situation, but we had some SMART cDNAs that we sheared and ligated TruSeq adapters on. So about 1/2 of them had the same 50 nt of SMART primer at the beginning. We mixed them 1:1 with a genomic DNA library. Cluster registration went fine.

--
Phillip

OK, that is good to hear. I'm not sure why my core was having trouble spiking 1:1 gDNA.

BIG_SNP 11-17-2011 04:04 PM

We have had great success using the NuGen library prep. Their adapters have inline barcodes, which add to the diversity for the first cycles and allow sequences to pass filter. After passing filter, the HiSeq can sequence the no or low diversity samples without any problems.

fkrueger 11-18-2011 02:32 AM

Just as another thought, if you could afford to spike in 90-95% gDNA, couldn't you also find an external sequencing facility that still runs GAIIx instruments and use the methods which work well on these?

pmiguel 11-18-2011 04:59 AM

I have seen bad batches of phiX that had fairly high (a few percent, I think) adapter dimer levels in them. Maybe you should make your own genomic DNA library to make sure your "diluent" is of high quality.

Also you could obtain your "diluent" by sub-contracting a sequencing job. Send an email out to a prospective department (maybe one with a high level of plant or fungal sciences being done) and offer a one time only discount genome sequence. Our diluent was a sorghum genomic DNA library.

--
Phillip

HMorrison 11-18-2011 06:55 AM

Quote:

Originally Posted by HESmith (Post 57176)
There's an alternative approach, assuming that you have not yet constructed the libraries. Design them so the junction is at the opposite end of the insert, and perform paired-end sequencing. Cluster calling is based only on the first five cycles of read one, so you'll avoid the low-complexity issue.


I have a sample of 96-plex low diversity amplicon libraries running now and clusters were found just fine, but the low diversity is causing a tremendous discrepancy between the blue and the green box-and-whiskers plots (raw clusters and clusters passing filter). I hope those data are recoverable at the end. Nothing in my primer design, barcoding, or indexing scheme can change the fact that it's "low complexity". The first four bases were completely random, followed by eight different in-line bar codes.

This is PE sequencing.

Yet I know labs are making this work.

HESmith 11-18-2011 07:24 AM

If the first four bases are random, then subsequent low complexity should not adversely affect cluster calling or data quality. Excessive cluster density is a possible culprit: what are your raw and PF values?

