I am still new to this area, and this is the first task I was given: extracting a list of 100% matched reads from a self-generated database. However, the read names are not formatted in the usual way, and I assume that is the cause of the problem I am running into.
Below is a list of my read names.
This is part of my entry_batch input file, ID.txt:
'M00344:4:000000000-A5RU9:1:2119:17016:21751 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2119:6591:19854 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2119:11445:14212 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2119:22676:7504 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2119:13009:4084 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2119:14454:4004 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2118:11021:19828 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2118:14025:16724 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2118:25864:15172 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2118:13018:13673 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2118:5760:11441 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:24461:19844 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:17300:18233 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:4137:17412 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:2789:15268 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:25164:15029 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:16039:7681 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2117:8713:5016 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2116:13795:20195 2:N:0:10'
'M00344:4:000000000-A5RU9:1:2116:6977:17108 2:N:0:10'
I used the command below:
$ blastdbcmd -db seqs.fasta -dbtype nucl -entry_batch ID.txt -out miseq.read.fasta
Error messages:
Error: 'M00344:4:000000000-A5RU9:1:1104:13049:19775: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:13044:19758: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:13062:19751: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:11099:18531: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:11118:18521: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:17175:17791: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:17452:17720: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:16737:13751: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:16726:13733: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:19339:9296: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:17187:8943: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:14936:7801: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:21379:6845: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:23493:5643: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:26299:4746: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:23691:4053: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1104:15699:3766: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1103:18377:16637: OID not found
Error: 'M00344:4:000000000-A5RU9:1:1103:16030:10176: OID not found
I tried changing the white space to \s and adding ' before and after each ID, but neither helped at all. The blastdbcmd program seems to treat everything before the space as the ID. Does anyone have an idea how to do this? Or am I heading in totally the wrong direction?
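For reference, here is a minimal sketch of the clean-up I could try on ID.txt. It rests on two assumptions I have not confirmed: that the database was built so that each read's identifier is the first whitespace-delimited token of its header, and that the leading ' shown in the error messages is being read as part of the ID. The function name clean_ids and the output file ID.clean.txt are made up for illustration.

```shell
# Hypothetical helper: reduce each line of an ID file to a bare read name.
# Assumption: the database identifier is only the first whitespace-delimited
# token, so the trailing " 2:N:0:10" part and the quotes must be dropped.
clean_ids() {
    # 1. remove any single quotes
    # 2. trim leading white space
    # 3. delete everything from the first remaining white space to end of line
    sed -e "s/'//g" \
        -e 's/^[[:space:]]*//' \
        -e 's/[[:space:]].*$//' "$1"
}

# Intended usage (same command as above, only the ID file changes):
#   clean_ids ID.txt > ID.clean.txt
#   blastdbcmd -db seqs.fasta -dbtype nucl -entry_batch ID.clean.txt -out miseq.read.fasta
```

If the identifiers in the database really do include the part after the space, this will not help, so it is only something to test.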
Eddi