I suppose you could do two things together: trim the 1st base from all reads (if it is always N, which is kind of odd, see below) and remove the leading C from each barcode in your barcode file.
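A rough sketch of those two steps in Python, in case it helps. It assumes plain 4-line FASTQ records already read into a list of lines, and barcodes held as a list of strings; the function names and the example reads and barcodes are made up for illustration and are not from any particular tool.

```python
def all_start_with_n(fastq_lines):
    """Return True if every read sequence (2nd line of each record) starts with N."""
    return all(fastq_lines[i + 1].startswith("N")
               for i in range(0, len(fastq_lines) - 3, 4))


def trim_first_base(fastq_lines):
    """Drop the first base and its quality character from each 4-line record."""
    out = []
    for i in range(0, len(fastq_lines) - 3, 4):
        header, seq, plus, qual = fastq_lines[i:i + 4]
        out.extend([header, seq[1:], plus, qual[1:]])
    return out


def strip_leading_c(barcodes):
    """Remove one leading C from each barcode; barcodes without it are left as-is."""
    return [b[1:] if b.startswith("C") else b for b in barcodes]


# Hypothetical example data, only to show the calls.
reads = ["@r1", "NGTACCA", "+", "#IIIIII",
         "@r2", "NGTTTGA", "+", "#IIIIII"]
barcodes = ["CACGT", "CTTGA"]

if all_start_with_n(reads):  # only trim if the first base really is always N
    reads = trim_first_base(reads)
    barcodes = strip_leading_c(barcodes)
```

Checking that every read starts with N before trimming guards against shifting reads that were called correctly. Most read trimmers can also remove a fixed number of bases from the 5' end, so a script like this is only one option.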
Hypothesis: The first base is probably an N because every sequence in this case actually starts with C (followed by GT). I am surprised that base calling worked from the 2nd base onwards. Having low nucleotide diversity like this at the start of the reads is not recommended.