Dear all,
I have to analyze a set of 26 samples of 16S amplicon data from 250 nt paired-end Illumina HiSeq reads. When I received the sequences they were already demultiplexed, merged and converted to FASTA format. I have no access to the barcode and primer sequences, because the commercial provider that performed the sequencing refuses to provide them (they say it is confidential information).
After extensively reading the QIIME documentation and multiple forum questions about how to analyze this kind of sequence data, I'm afraid my case is one step beyond them in difficulty (or I'm one step behind, by not understanding what I read... we will see).
I face 2 main problems:
1) The FASTA header of the sequences.
The current header has this format:
>Sample_Name tagX (Where X is the number of each consecutive tag from 1 to N)
After reading the add_qiime_labels documentation (http://qiime.org/scripts/add_qiime_labels.html) I understand that my header is completely different from that in the examples:
>Sample.1_0 FLP3FBN01ELBSX length=250 xy=1766_0111 region=1 run=R_2008_12_09_13_51_01_ AACAGATTAGACCAGATTAAGCCGAGATTTACCCGA
And I have no means of obtaining the information that is missing from my headers.
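To make the header question concrete, here is a minimal Python sketch of the kind of relabeling I imagine might be needed. It assumes that only the leading `SampleID_N` part of the QIIME-style header matters and that the rest is a free-text comment; it also assumes that sample IDs should contain only letters, digits and periods, so the underscore in `Sample_Name` is replaced with a period. The function name `relabel_headers` and the example sample names are hypothetical, not part of QIIME.

```python
import re
from collections import defaultdict


def relabel_headers(lines):
    """Yield FASTA lines with headers rewritten as '>SampleID_N original_tag'.

    Assumption: input headers look like '>Sample_Name tagX'.
    N is a per-sample counter starting at 0.
    """
    counts = defaultdict(int)
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            if not parts:
                raise ValueError("empty FASTA header")
            # Assumption: replace anything that is not alphanumeric or '.'
            # so the underscore is left only as the ID/counter separator.
            sample = re.sub(r"[^A-Za-z0-9.]", ".", parts[0])
            rest = parts[1] if len(parts) > 1 else ""
            n = counts[sample]
            counts[sample] += 1
            yield f">{sample}_{n} {rest}".rstrip()
        else:
            # Sequence lines are passed through unchanged.
            yield line


# Hypothetical input in the '>Sample_Name tagX' format described above.
example = [
    ">Sample_A tag1", "ACGT",
    ">Sample_A tag2", "GGCC",
    ">Sample_B tag1", "TTAA",
]
relabeled = list(relabel_headers(example))
```

With this sketch the first header would become `>Sample.A_0 tag1`; whether QIIME accepts headers without the extra length/region/run fields is exactly what I am unsure about.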
2) How to create a functional mapping file for QIIME, given my current FASTA headers.
I guess this second issue can be fixed easily once the first one is fixed.
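For the mapping file, the following is only a sketch of what I think a minimal version could look like. It assumes a tab-separated file with the columns `#SampleID`, `BarcodeSequence`, `LinkerPrimerSequence` and `Description`, and it leaves the barcode and primer columns empty because I do not have them; whether QIIME accepts empty values there is an assumption I have not verified. The helper `build_mapping` and the sample names are hypothetical.

```python
import re


def build_mapping(sample_names):
    """Return the text of a minimal tab-separated mapping file.

    Assumption: barcode and primer columns are left blank because the
    data are already demultiplexed and the sequences are unknown.
    """
    header = ["#SampleID", "BarcodeSequence", "LinkerPrimerSequence", "Description"]
    rows = ["\t".join(header)]
    for name in sample_names:
        # Same sanitizing rule as for the FASTA labels: keep only
        # alphanumerics and periods in the sample ID.
        sample_id = re.sub(r"[^A-Za-z0-9.]", ".", name)
        # The original provider name is kept in the Description column.
        rows.append("\t".join([sample_id, "", "", name]))
    return "\n".join(rows) + "\n"


# Hypothetical sample names; the real file would list all 26 samples.
mapping_text = build_mapping(["Sample_A", "Sample_B"])
```

The sample IDs in this file would have to match the part of each FASTA header before the final underscore.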
Thanks in advance.
JL