I have sequence data that was generated using several different primers, and the samples vary widely in their sequences depending on the primer used. I therefore want to randomly select 1000 sequences and write them to a new FASTA file.
My strategy is to read the sequences into a hash, splitting the file on '>'. I would then assign each sequence a number, randomly select from those numbers, and write the corresponding sequences to a new FASTA file.
I haven't gotten this to work yet, but does anyone know a simpler method for doing this?
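For reference, here is a minimal sketch of the strategy described above, written in Python rather than with a hash-based script. The function names (read_fasta, sample_fasta), the seed parameter, and the file names in the usage note are all hypothetical, and it assumes the whole file fits in memory. Instead of assigning numbers by hand, it keeps the records in a list and lets random.sample pick entries without replacement.

```python
import random


def read_fasta(path):
    """Parse a FASTA file into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    with open(path) as handle:
        for line in handle:
            line = line.rstrip("\n")
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    records.append((header, "".join(chunks)))
                header, chunks = line[1:], []
            elif header is not None and line:
                # Sequence lines may be wrapped, so collect and join them.
                chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def sample_fasta(in_path, out_path, n=1000, seed=None):
    """Write up to n randomly chosen records to out_path; return the count."""
    records = read_fasta(in_path)
    rng = random.Random(seed)  # pass a seed to make the selection repeatable
    # Sampling is without replacement; n is capped at the number of records.
    chosen = rng.sample(records, min(n, len(records)))
    with open(out_path, "w") as out:
        for header, seq in chosen:
            out.write(f">{header}\n{seq}\n")
    return len(chosen)
```

Calling sample_fasta("all_reads.fasta", "subset.fasta", n=1000) would write the subset, with each sequence on a single line. If the input has fewer than 1000 sequences, all of them are written in random order.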