Hi All
I am trying to split multiplexed 454 data using sfffile and trim the MIDs. The core facility used the same MID sequence at both the 5' and 3' ends, but when I run sfffile, it removes only the 5' MID.
For example, the sequence below:
>G164TNN01B12AF rank=0007978 x=726.0 y=1925.0 length=80
ACGCTCGACAGAGGCTGCCTCTGATCCCAGCTACTCAGAAGGTTGTAGGTGGGAGACTCAGCTTGCCAGGACGCTCGACA
contains the MID "ACGCTCGACA" at both ends, but after running sfffile, the output sequence is:
>G164TNN01B12AF length=70 xy=0726_1925 region=1 run=R_2011_05_05_15_47_37_
GAGGCTGCCTCTGATCCCAGCTACTCAGAAGGTTGTAGGTGGGAGACTCAGCTTGCCAGGACGCTCGACA
which still contains the MID at its 3' end.
I looked at the "sfffile" command parameters, but couldn't find any parameter that tells the program to search for the MID at "both" ends.
Any ideas on how I can trim the MID from both ends?
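To make the goal concrete, this is roughly the result I am after, written as a minimal Python sketch on the FASTA sequence above. It is not an sfffile feature: the function name trim_mid is made up for illustration, and it assumes an exact, error-free copy of the MID at each end (no mismatches, no reverse complement).

```python
def trim_mid(seq, mid):
    """Strip one exact copy of `mid` from the 5' end and from the 3' end of `seq`."""
    # Hypothetical helper for illustration only; not part of sfffile.
    if mid and seq.startswith(mid):
        seq = seq[len(mid):]      # drop the 5' MID
    if mid and seq.endswith(mid):
        seq = seq[:-len(mid)]     # drop the 3' MID
    return seq


# The read from the example above, with the MID at both ends.
read = ("ACGCTCGACAGAGGCTGCCTCTGATCCCAGCTACTCAGAAGGTTGTAGGTGGGAGACTCAGC"
        "TTGCCAGGACGCTCGACA")
mid = "ACGCTCGACA"

trimmed = trim_mid(read, mid)
print(len(trimmed), trimmed)
```

For this read the trimmed sequence should be 60 bases long, i.e. the 80-base read minus the two 10-base MIDs.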
Thanks
Mali