[Note added 12/12/2013: Illumina fixed this issue, at least for HiSeqs and MiSeqs, and probably HiScanSQs as well. The info below is, at best, of historical interest. Even the worst-case scenario of a single index in a HiSeq lane generally results in >95% demultiplexing. So, if most of your reads got thrown into the "unknown" folder, there is probably some problem other than "color balance" that caused it.]
I find it maddening when our Illumina/CASAVA chokes on a TruSeq index pool and throws all the reads into the "unknown" directory. How does one avoid this?
Disclaimer: we have not tested these out yet. Your mileage may vary, etc...
Illumina sequencers do not handle low-diversity sequence well. Using a low number of indexes (bar codes) in a single lane is likely to give you low diversity. In the most extreme case an entire scanner pass yields no visible clusters. There are two types of scanner passes -- the A/C pass and the G/T pass. If your index pool has an A or C and a G or T at all 6 positions, you can avoid a blank scanner pass.
The IUPAC code for A or C is "M" while the code for G or T is "K". So I like to think of a given index pool as "MK compatible" if there will be an M and a K at each of the 6 bases of the index. Which indexes are MK compatible?
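To make the definition concrete, here is a minimal sketch of an MK-compatibility check in Python. This is not Dave's script; the function name is made up for illustration, and the two sequences used (for indexes 5 and 19) are filled in as I understand them, so check them against Illumina's adapter sequence documentation before relying on them.

```python
M_BASES = set("AC")  # IUPAC M: A or C
K_BASES = set("GT")  # IUPAC K: G or T

def is_mk_compatible(index_seqs):
    """Return True if every position has at least one M base and one K base."""
    seqs = [s.upper() for s in index_seqs]
    if not seqs or len({len(s) for s in seqs}) != 1:
        raise ValueError("need one or more index sequences of equal length")
    for column in zip(*seqs):
        bases = set(column)
        # A position fails if the pool shows only A/C or only G/T there.
        if not (bases & M_BASES and bases & K_BASES):
            return False
    return True

# Assumed sequences for TruSeq indexes 5 and 19 (verify before use).
print(is_mk_compatible(["ACAGTG", "GTGAAA"]))  # True
print(is_mk_compatible(["ACAGTG"]))            # False: a lone index cannot have both
```

Note that a single index can never pass this check, since each position holds exactly one base.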
A programmer in the lab, Dave, whipped up a script to analyze some groupings of indexes to find MK compatible pools within them. If you have a TruSeq RNA or DNA library prep kit 'version 2', you have 12 indexes. There are two types of kits, the "A" and the "B" kits, distinguished only by the indexes they use:
Box A:
2, 4, 5, 6, 7, 12, 13, 14, 15, 16, 18, 19
Box B:
1, 3, 8, 9, 10, 11, 20, 21, 22, 23, 25, 27
If Dave has it right, then these are the groupings that should minimize issues. That is, if you only have 2 or 3 libraries going into a single lane, here are the ones that, when pooled by row, are MK compatible. (For example, if you mix indexes 5 and 19 in a lane, every position will have an M and a K. Get it? Each row is a good pool.)
A indexes only:
5 19
6 12
2 4 13
5 7 18
B indexes only:
3 11 21
8 20 25
9 10 21
10 22 25
10 25 27
A and B indexes:
5 19
6 12
7 17
18 25
3 5 8
1 5 9
1 4 12
2 3 12
7 10 12
1 3 13
2 4 13
8 9 13
2 11 14
7 9 14
2 11 15
7 9 15
5 11 16
10 13 16
6 10 17
1 14 18
1 15 18
5 7 18
17 18 19
3 17 20
3 11 21
4 18 21
8 16 21
9 10 21
12 14 21
12 15 21
5 6 22
12 19 22
2 5 23
8 12 23
14 16 23
15 16 23
4 17 24
14 22 24
15 22 24
19 21 24
5 17 25
7 19 25
8 20 25
10 22 25
1 11 26
2 18 26
7 23 26
8 10 26
9 16 26
13 21 26
20 22 26
5 6 27
10 25 27
12 19 27
14 24 27
15 24 27
20 26 27
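A search of this kind can be sketched as a brute-force pass over every combination of a given size. This is only a guess at the approach, not Dave's actual script. The function name and the small lookup table are hypothetical, and the four sequences are filled in as I understand them; verify them against Illumina's documentation.

```python
from itertools import combinations

def mk_pools(indexes, size):
    """Return tuples of index numbers whose pooled sequences are MK compatible.

    `indexes` maps index number -> index sequence (all the same length).
    """
    pools = []
    for combo in combinations(sorted(indexes), size):
        # Look at the pool one position (column) at a time.
        columns = zip(*(indexes[i] for i in combo))
        if all(set(col) & set("AC") and set(col) & set("GT") for col in columns):
            pools.append(combo)
    return pools

# Assumed sequences for four Box A indexes (verify before use).
example = {5: "ACAGTG", 6: "GCCAAT", 12: "CTTGTA", 19: "GTGAAA"}
print(mk_pools(example, 2))  # [(5, 19), (6, 12)]
```

With a full table of sequences, calling it with `size=3` would enumerate the three-index pools the same way.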
How about if you have the Small RNA kit with all 48 indexes? There are a lot of possibilities. I'll just give you the 2-index MK compatible pools:
4 35
5 19
6 12
7 17
10 39
18 25
18 33
20 30
21 29
22 45
24 31
26 42
27 45
37 45
One final caveat. I did not mention to Dave that Illumina specifies reading 7 cycles for the index read -- so it can do CAFIE corrections on the first 6. So that throws a bit of a wrench in the mix...
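To see how a seventh cycle could change the picture, the check can be run over whatever read length is actually sequenced and report which cycles lack an M or a K. A rough sketch, with a made-up function name; the seventh base appended to each read below is a placeholder for illustration only, not something taken from Illumina's documentation.

```python
def blank_pass_cycles(index_reads):
    """Return 1-based cycles where the pool lacks an M (A/C) or a K (G/T) base."""
    failing = []
    for cycle, column in enumerate(zip(*index_reads), start=1):
        bases = set(column)
        if not (bases & set("AC")) or not (bases & set("GT")):
            failing.append(cycle)
    return failing

# First six bases: assumed sequences for indexes 5 and 19.
# Seventh base: a placeholder "A" on both reads (hypothetical).
print(blank_pass_cycles(["ACAGTGA", "GTGAAAA"]))  # [7]
print(blank_pass_cycles(["ACAGTG", "GTGAAA"]))    # []
```

So a pool that is MK compatible over 6 bases may still hit a blank pass on cycle 7, depending on what base follows each index.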
--
Phillip