**clostridium40** · 09-23-2011, 11:33 AM

Phillip

This is great information, very helpful for planning out your RNA-seq experiments (which I'm learning is critical). I was wondering if you have observed a number of indexes (bar codes) that serves as the cut-off point between low diversity and high diversity with the Illumina sequencers. Is it only a problem with 2 or 3 indexes, since those are the combinations you provided, or did you still see issues with 4, 5 or even 6 samples? Thanks again for the very useful information.

Kerry

**pmiguel** · 09-23-2011, 12:24 PM

I think the critical factor is having a fair number of clusters "lit up" during any given scanning pass. Since there are 2 passes, one for "G" and "T" and another for "A" and "C" -- getting at least 5% or so of your clusters to "glow" in each pass should do the trick. Or most of it.
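
To put a rough number on that, here is a minimal sketch that estimates, cycle by cycle, what fraction of clusters would light up in each pass for a given index pool. It assumes the two-pass A/C and G/T model described in this post and equal cluster counts per library unless weights are supplied, and it treats the 5% figure purely as the rule of thumb quoted above. The function name `pass_fractions` is made up for illustration; the two index sequences (TruSeq 18 and 19) are the ones listed in a table further down this thread.

```python
def pass_fractions(pool, weights=None):
    """Per cycle, return (fraction lit in the A/C pass, fraction lit in the G/T pass).

    pool maps index number -> index sequence; weights are optional relative
    cluster counts per library (assumed equal when omitted).
    """
    seqs = list(pool.values())
    if weights is None:
        weights = [1.0] * len(seqs)
    total = sum(weights)
    fractions = []
    for cycle in range(len(seqs[0])):
        ac = sum(w for s, w in zip(seqs, weights) if s[cycle] in "AC") / total
        fractions.append((ac, 1.0 - ac))
    return fractions


pool = {18: "GTCCGC", 19: "GTGAAA"}
for cycle, (ac, gt) in enumerate(pass_fractions(pool), start=1):
    flag = "  <-- under 5% in one pass" if min(ac, gt) < 0.05 else ""
    print(f"cycle {cycle}: A/C {ac:.0%}  G/T {gt:.0%}{flag}")
```

With this pair, cycles 1 and 2 are all G/T, so the A/C pass would be blank there under this model.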

I think the scanner upon seeing a blank flow cell presumes something must be wrong and may attempt to change its focal depth. Not good!

I guess I could ask Dave to run the analysis to look for bad index pools of 4 or more indexes. But, as long as you had an MK compatible pair or triple in the pools, I would think you would be okay.

I should add I have not looked back to see where we got into problems exactly. But my sense was that after you got above 4 indexes, things seemed okay. But that may just be because I did not look carefully enough...
--
Phillip

**kmcarr** · 09-23-2011, 12:49 PM

Originally posted by pmiguel View Post

One final caveat. I did not mention to Dave that Illumina specifies reading 7 cycles for the index read -- so it can do CAFIE corrections on the first 6. So that throws a bit of a wrench in the mix...

--
Phillip

Careful Phillip. CAFIE is the other guys. Illumina does phasing correction.

**kmcarr** · 09-23-2011, 01:06 PM

Originally posted by pmiguel View Post

I find it maddening when our Illumina/CASAVA chokes on a TruSeq index pool and throws all the reads into the "unknown" directory. How does one avoid this?

Disclaimer: we have not tested these out yet. Your mileage may vary, etc...

Illumina sequencers do not handle low diversity sequence well. Using a low number of indexes (bar codes) in a single lane is likely to give you low diversity. In the most extreme case an entire scanner pass yields no visible clusters. There are two types of scanner passes -- the A/C pass and the G/T pass. If your index pool has an A or C and a G or T at all 6 positions, you can avoid a blank scanner pass.

Is low diversity really the source of your problem though? We just completed a run where in 4 of the lanes we had run single libraries prepared with TruSeq. This meant that for the index read there was a single base at each cycle, the ultimate in low diversity. All libraries had the correct barcode read for >98% of the passed filter reads. We have also successfully run lanes with only two barcodes (I don't know what the MK breakdown was for those).

On the other hand we have had complete failure of index reads even when there was a much higher order of pooling and thus very diverse barcodes. Our best guess in these cases is that the index primer failed to anneal efficiently.

**pmiguel** · 09-24-2011, 09:27 AM

Speculation on my part: but it looked like a diversity issue to me. We are novices at this, with only a handful of Illumina runs under our belts though. But lanes with several indexes in them seem to invariably demultiplex without incident. Whereas lanes with one index in them generally have 2 or 3 cycles with all the base calls given very low quality values and all the reads end up in the "undetermined" category.

It is possible that a recent CASAVA upgrade has fixed the issue. Under v1 the instrument software would choke so badly on single indexes that tech support had us reboot the control software before doing read2 so that it would re-calibrate its focus. But as of v3 chemistry (and whatever software version went along with that), the instrument would automatically recalibrate focus before read2. So I am sure there is action behind the scenes.

I am demultiplexing our most recent run now. Will let you know how it looks.

--
Phillip

**pmiguel** · 10-21-2011, 11:19 AM

So, we had exactly the same issues I describe above with the run I mention. Lanes with low diversity in their index sequences always failed to demultiplex. Lanes with >3 indexes always succeeded.

However after going round and round with Illumina tech support about this, I think this may no longer be an issue for HiSeqs and only be an issue for HiScanSQs. Apparently HiSeqs do a separate scan for each base? I don't have a HiSeq, so I don't know for sure. The HiScanSQ definitely scans A and C together and G and T together. That might be the issue.

Anyway, I did find out that the

--use-bases-mask

parameter can be used during demultiplexing to skip bases where the instrument has clearly defocused itself and is not collecting usable data.
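
As an illustration of what such a mask might look like, here is a small sketch that builds a mask string with selected index cycles ignored. It assumes the usual bases-mask notation where `Y` keeps a read cycle, `I` keeps an index cycle, `n` ignores a cycle, and reads are separated by commas; the helper name `index_mask`, the run layout and the skipped cycle numbers are all hypothetical. Check the exact syntax against the documentation for your CASAVA version, and note that how the sample sheet index has to be adjusted for ignored cycles is not covered here.

```python
def index_mask(n_cycles, skip):
    """Build the index-read part of a bases mask, one character per cycle.

    'I' keeps a cycle as index sequence and 'n' ignores it.
    skip holds the 1-based index-read cycle numbers to ignore.
    """
    return "".join("n" if c in skip else "I" for c in range(1, n_cycles + 1))


# Hypothetical run: 101-cycle read 1, 7-cycle index read, 101-cycle read 2.
# Index cycles 3 and 4 are assumed unusable; cycle 7 is ignored as well.
mask = ",".join(["Y101", index_mask(7, {3, 4, 7}), "Y101"])
print("--use-bases-mask", mask)
```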

--
Phillip

**BIG_SNP** · 10-27-2011, 04:10 PM

We have had good success running low (or no) diversity samples on the HiSeq using the Nugen kit which uses in-line barcodes. We simply use multiple inline barcodes for each sample which tricks the machine into thinking the libraries are diverse for the first critical cycles to pass phasing, etc.

**Jon_Keats** · 10-27-2011, 07:33 PM

One thing we do to deal with this issue is, whenever possible, to create large pools and spread them across multiple lanes instead of using 2-3 samples per lane. Beyond avoiding low-complexity barcodes in small pools, there is less risk of losing all the data from a set of samples if one lane fails.

**biochembug** · 07-27-2012, 04:04 AM

Hi folks,

How does the following multiplexing look (barcodes 2, 3, 4, 5, 18, 19, 12, 13; 8 libraries) in a lane for TruSeq? If the problem only arises when pooling 2, 3 or 4 libraries, I may be safe.

B.No. is barcode number.
B.No. --------Composition----------
18 G T C C G C
19 G T G A A A
12 C T T G T A
13 A G T C A A
2 C G A T G T
3 T T A G G C
4 T G A C C A
5 A C A G T G
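
For a pool like this, the per-position check can be sketched as follows. This is only the A/C versus G/T ("MK") criterion from the first post, applied to the eight sequences in the table above; it says nothing about whether a given instrument will actually demultiplex the lane. The function name `position_report` is made up for illustration.

```python
# Index sequences as posted in the table above (TruSeq index number -> sequence).
pool = {18: "GTCCGC", 19: "GTGAAA", 12: "CTTGTA", 13: "AGTCAA",
        2: "CGATGT", 3: "TTAGGC", 4: "TGACCA", 5: "ACAGTG"}


def position_report(pool):
    """For each index position, list the bases seen in each scanner pass."""
    report = []
    for column in zip(*pool.values()):
        m = sorted({b for b in column if b in "AC"})  # A/C pass ("M")
        k = sorted({b for b in column if b in "GT"})  # G/T pass ("K")
        report.append((m, k))
    return report


for pos, (m, k) in enumerate(position_report(pool), start=1):
    status = "ok" if m and k else "BLANK PASS"
    print(pos, "M:", "".join(m) or "-", "K:", "".join(k) or "-", status)
```

Every position in this pool has at least one A or C and at least one G or T, although position 2 relies on index 5 alone for its A/C signal.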

Biochembug

**BIG_SNP** · 12-03-2013, 03:23 PM

question

Phillip

Many of the combinations you have listed require index 17:

for example...

A and B indexes:
7 17
3 17 20
4 17 24
5 17 25
6 10 17
17 18 19

but from the list you stated of what is included in Box A and Box B there is no index 17 included. Could you please help.

Thank you!

Originally posted by pmiguel View Post

I find it maddening when our Illumina/CASAVA chokes on a TruSeq index pool and throws all the reads into the "unknown" directory. How does one avoid this?

Disclaimer: we have not tested these out yet. Your mileage may vary, etc...

Illumina sequencers do not handle low diversity sequence well. Using a low number of indexes (bar codes) in a single lane is likely to give you low diversity. In the most extreme case an entire scanner pass yields no visible clusters. There are two types of scanner passes -- the A/C pass and the G/T pass. If your index pool has an A or C and a G or T at all 6 positions, you can avoid a blank scanner pass.

The IUPAC code for A or C is "M" while the code for G or T is "K". So I like to think of a given index pool as "MK compatible" if there will be an M and a K at each of the 6 bases of the index. Which indexes are MK compatible?
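
A minimal sketch of that definition, for anyone who wants to test their own pools. This is not Dave's script, just one way to express the rule: a pool passes if every position has at least one A/C and at least one G/T. The names `INDEXES` and `is_mk_compatible` are illustrative, and the sequences are limited to the eight TruSeq indexes posted in a table earlier in this thread.

```python
from itertools import combinations

# Subset of TruSeq index sequences, copied from the table earlier in this thread.
INDEXES = {2: "CGATGT", 3: "TTAGGC", 4: "TGACCA", 5: "ACAGTG",
           12: "CTTGTA", 13: "AGTCAA", 18: "GTCCGC", 19: "GTGAAA"}


def is_mk_compatible(seqs):
    """True if every position has at least one A/C (M) and one G/T (K)."""
    return all(
        any(b in "AC" for b in column) and any(b in "GT" for b in column)
        for column in zip(*seqs)
    )


print(is_mk_compatible([INDEXES[5], INDEXES[19]]))   # True
print(is_mk_compatible([INDEXES[18], INDEXES[19]]))  # False

# Search all two-index pools within this subset.
pairs = [(a, b) for a, b in combinations(sorted(INDEXES), 2)
         if is_mk_compatible([INDEXES[a], INDEXES[b]])]
print(pairs)  # [(5, 19)]
```

Indexes 18 and 19 both start with G and T, so that pair has no A/C signal at the first two positions.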

A programmer in the lab, Dave, whipped up a script to analyze some groupings of indexes to find MK compatible pools within them. If you have a TruSeq RNA or DNA library prep kit 'version 2', you have 12 indexes. There are two types of kits, the "A" and the "B" kits, distinguished only by the indexes they use:
Box A:
2, 4, 5, 6, 7, 12, 13, 14, 15, 16, 18, 19
Box B:
1, 3, 8, 9, 10, 11, 20, 21, 22, 23, 25, 27

If Dave has it right, then these are the groupings that should minimize issues. That is, if you only have 2 or 3 libraries going into a single lane, here are the ones that, when pooled by row, are MK compatible. (For example, if you mix indexes 5 and 19 in a lane, every position will have an M and a K. Get it? Each row is a good pool.)

A indexes only:
5 19
6 12
2 4 13
5 7 18

B indexes only:
3 11 21
8 20 25
9 10 21
10 22 25
10 25 27

A and B indexes:
5 19
6 12
7 17
18 25
3 5 8
1 5 9
1 4 12
2 3 12
7 10 12
1 3 13
2 4 13
8 9 13
2 11 14
7 9 14
2 11 15
7 9 15
5 11 16
10 13 16
6 10 17
1 14 18
1 15 18
5 7 18
17 18 19
3 17 20
3 11 21
4 18 21
8 16 21
9 10 21
12 14 21
12 15 21
5 6 22
12 19 22
2 5 23
8 12 23
14 16 23
15 16 23
4 17 24
14 22 24
15 22 24
19 21 24
5 17 25
7 19 25
8 20 25
10 22 25
1 11 26
2 18 26
7 23 26
8 10 26
9 16 26
13 21 26
20 22 26
5 6 27
10 25 27
12 19 27
14 24 27
15 24 27
20 26 27

How about if you have the Small RNA kit with all 48 indexes? There are a lot of possibilities. I'll just give you the 2 index MK compatible pools:

4 35
5 19
6 12
7 17
10 39
18 25
18 33
20 30
21 29
22 45
24 31
26 42
27 45
37 45

One final caveat. I did not mention to Dave that Illumina specifies reading 7 cycles for the index read -- so it can do CAFIE corrections on the first 6. So that throws a bit of a wrench in the mix...

--
Phillip

**pmiguel** · 12-04-2013, 08:58 AM

Originally posted by BIG_SNP View Post

Phillip

Many of the combinations you have listed require index 17:

for example...

A and B indexes:
7 17
3 17 20
4 17 24
5 17 25
6 10 17
17 18 19

but from the list you stated of what is included in Box A and Box B there is no index 17 included. Could you please help.

Thank you!

Hi Big_SNP,
Yes, I can help -- pool any indexes you like together. It doesn't matter any more. Illumina fixed this issue that caused low % demultiplexing due to unequal base representation around the time they got around to mentioning in the manuals it was a problem. Now we have manuals that warn against a non-existent problem.
Ah well...

--
Phillip

**HeinKey** · 12-11-2013, 06:56 AM

Hi Phillip,
For MiSeq runs I agree, but is RTA for HiSeq also changed? I was told the improvement for biased libraries was only for MiSeq RTA.

Hein

**pmiguel** · 12-11-2013, 08:52 AM

Originally posted by HeinKey View Post

Hi Phillip,
For MiSeq runs I agree, but is RTA for HiSeq also changed? I was told the improvement for biased libraries was only for MiSeq RTA.

Hein

I don't understand what you mean. This is not a library bias issue, is it? We are talking index reads, right?

Our HiSeq has never cared about index balance. It just isn't an issue. With other HiSeqs? I don't know.

Our previous sequencer: HiScanSQ -- there index balance mattered. But not for HiSeq or MiSeq. Not ever, in my experience.

--
Phillip

**GenoMax** · 12-11-2013, 08:59 AM

The only time problems with indexes show up on HiSeq is when one over-clusters samples. Regular reads are resistant, but the index reads tend to start accumulating N's (when samples are over-clustered), leading to losses.


Which indexes to pool.
