06-28-2015, 09:46 PM   #1
vinnie_assemble
Help with sff file processing in amplicon analysis with mothur

Hi,

I am using mothur for OTU analysis of my 454 data, and I am confused by what I see after converting my sff file to fasta. Sequencing details: platform 454 FLX; flow pattern TACG; barcode (AAAAAAAC), already removed by the sequencing center; primer GAGTTTGATCNTGGCTCAG.

However, I have trouble understanding one section of the sequence (bases 5 to 12). The primer starts at base 13. Below is the fasta output from the different toolkits.

sff_extract (from seq_crumbs toolkit) with clipping:

GAGTTTGATCCTGGCTCAGATTGAACGCTGG....

sff_extract (from seq_crumbs toolkit) without clipping:

tcagagagcgaaGAGTTTGATCCTGGCTCAGATTGAACGCTGG...

mothur output after denoising:

AGAGCGAAGAGTTTGATCCTGGCTCAGATTGAACGCTGG...
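
To make the positions concrete, here is a minimal Python sketch that finds the primer in the unclipped read and reports what sits in front of it. It is only an illustration, not part of mothur or seq_crumbs: the function names `primer_regex` and `split_read` are made up, and it assumes the leading lowercase `tcag` is the standard 454 key sequence and that the `N` in the primer can match any base.

```python
import re

# IUPAC ambiguity codes expanded to regex character classes
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def primer_regex(primer):
    """Compile a degenerate primer into a regular expression (hypothetical helper)."""
    return re.compile("".join(IUPAC[base] for base in primer.upper()))


def split_read(read, primer, key="TCAG"):
    """Split a read into key, unexplained segment, and primer (hypothetical helper).

    Assumes the read may start with the 454 key sequence; returns None
    when the primer is not found.
    """
    seq = read.upper()
    match = primer_regex(primer).search(seq)
    if match is None:
        return None
    key_len = len(key) if seq.startswith(key) else 0
    return {
        "key": seq[:key_len],
        "unknown": seq[key_len:match.start()],
        "primer_start": match.start() + 1,  # 1-based position
        "primer": match.group(0),
    }


# Unclipped read from sff_extract, truncated as in the post
read = "tcagagagcgaaGAGTTTGATCCTGGCTCAGATTGAACGCTGG"
parts = split_read(read, "GAGTTTGATCNTGGCTCAG")
print(parts)
```

On the read above, this separates the 4-base key from the 8 bases in question and puts the primer start at base 13, which matches what I see by eye.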

Can anyone help me understand the agagcgaa part of the sequence? Based on the information from the sequencing center, it is not the barcode. How should I deal with it? For example, is there a way to remove this region in mothur? Thank you!