
**hoytpr** · 04-20-2017, 07:49 AM

Answering my own question: The long "G"s are likely adapter/primer dimers, where the read one primer bound, but not the read two primer. Not yet sure if this is due to the use of NEB+BIOO+Illumina kits. The 35 "N"s are from adapter/primer dimers being N-masked.
-p

**GenoMax** · 04-20-2017, 08:06 AM

Illumina's phiX is not indexed (if I recall right). The GGGGs are basically the NextSeq saying "I see no signal", which it reports as G basecalls (2-color chemistry).

**hoytpr** · 04-20-2017, 09:13 AM

I'm sure you are correct, thanks. The weirdness (among other things) was the numerical disparity between read 1 and read 2. Maybe it was because the read2 index in the plate file wasn't entered (which causes read2 to be automatically 'masked'), and also because the libraries had very low complexity (e.g., long runs of Gs). In the end, the run worked fine but may need some cleanup.

After a couple of days with Illumina, they said:

I pulled the 47,000 sequences that gave Ns in R1 from R2 and it looks like pretty much all of them have 100% G calls. A G call in NextSeq is usually associated with dark cycles (or no image). This tells me that the R2 primer may not have bound correctly to these obviously adapter dimer sequences (since R1 is 35bp of Ns) and that is why we don't see the adapters in the R2 region and hence no subsequent conversion to Ns.

This is still a little murky to me, so I plan to draw out the whole NEB+12bp-BIOO+Illumina indexing on paper.
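For anyone wanting to repeat the check described above, here is a minimal Python sketch (not the script Illumina used) that labels each read as all-N, poly-G, or other, and tallies the R1/R2 combinations. The function names and the 0.9 G-fraction threshold are illustrative assumptions, not anything from the run or from Illumina.

```python
import gzip
from collections import Counter


def fastq_sequences(path):
    """Yield the sequence line of each 4-line FASTQ record (plain or .gz)."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        for i, line in enumerate(handle):
            if i % 4 == 1:  # second line of every record is the sequence
                yield line.strip()


def classify_read(seq, min_g_fraction=0.9):
    """Label a read 'all_N', 'poly_G', or 'other'.

    min_g_fraction is an assumed, illustrative threshold.
    """
    if not seq:
        return "other"
    seq = seq.upper()
    if seq.count("N") == len(seq):
        return "all_N"
    if seq.count("G") / len(seq) >= min_g_fraction:
        return "poly_G"
    return "other"


def tally_pairs(pairs):
    """Count (R1 label, R2 label) combinations over (r1_seq, r2_seq) pairs."""
    return Counter((classify_read(r1), classify_read(r2)) for r1, r2 in pairs)
```

With paired files (hypothetical names), `tally_pairs(zip(fastq_sequences("R1.fastq.gz"), fastq_sequences("R2.fastq.gz")))` would show how many all-N R1 reads are paired with poly-G R2 reads, assuming both files list reads in the same order.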

**jdk787** · 04-20-2017, 11:45 AM

Originally posted by hoytpr

I'm sure you are correct, thanks. The weirdness (among other things) was the numerical disparity between read 1 and read 2.

I don't have any hands-on experience with the NextSeq, but there seems to be a tendency for NextSeq data to have more polyG sequences in R2 than in R1, possibly due to some inefficiency in the Read 2 re-synthesis. Maybe this is worse for adapter dimers?

A first look at Illumina’s new NextSeq 500 - SEQanswers

http://seqanswers.com/forums/showthread.php?t=40741&page=5

poly-G in NextSeq - SEQanswers

http://seqanswers.com/forums/showthread.php?t=45130

By default, bcl2fastq will also mask with Ns any read that is shorter than 22 bases after adapter trimming, which includes adapter dimers.

https://support.illumina.com/content/dam/illumina-support/documents/downloads/software/bcl2fastq/bcl2fastq2-v2-18-software-guide-15051736-01.pdf
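To make the masking rule concrete, here is a toy Python model of it. This is a sketch of the behavior as I understand it from the linked guide, not bcl2fastq's actual code. The 22-base threshold is the one mentioned above; the minimum trimmed length of 35 is an assumption based on the 35 Ns seen in R1 (I believe these correspond to the `--mask-short-adapter-reads` and `--minimum-trimmed-read-length` options, but check the guide). The function name and the `adapter_start` parameter are hypothetical, and the model counts all bases left of the adapter rather than only A/C/G/T calls.

```python
def mask_trimmed_read(read, adapter_start, min_trimmed_len=35, mask_short_len=22):
    """Toy model of adapter trimming followed by N-masking.

    read:          full read sequence
    adapter_start: index where the adapter begins, or None if no adapter found
    The defaults (35 and 22) are assumptions; see the lead-in.
    """
    if adapter_start is None:
        return read  # nothing trimmed, nothing masked

    kept = read[:adapter_start]  # bases left after trimming the adapter
    if len(kept) >= min_trimmed_len:
        return kept  # long enough, keep as trimmed

    if len(kept) < mask_short_len:
        # Too little insert left (e.g. an adapter dimer): mask everything.
        return "N" * min_trimmed_len

    # Otherwise pad with Ns up to the minimum trimmed length.
    return kept + "N" * (min_trimmed_len - len(kept))
```

Under these assumptions, an adapter dimer (adapter starting at position 0) comes out as 35 Ns, which matches what was seen in R1.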

Maybe because the read2 index in the plate file wasn't entered (which causes the read2 to be automatically 'masked')

If you don't enter an I5 index, then the sequencer will just read I7 and not I5. So for a paired-end, single-index run there will be reads for R1, I7, and R2. The I5 read will be skipped and the I7 will be used to demux.

http://www.bea.ki.se/documents/Illumina_Indexed_Sequencing_Reference_Guide_Aug2015.pdf

**hoytpr** · 04-20-2017, 11:57 AM

Thanks very much for the links and clarification.

-p


New NextSeq: First run. A few issues.
