Hi
I'm wondering if I could get some advice from the experts on this forum. I am working with an Illumina metagenomic dataset and using MEGAN to determine the taxonomic diversity.
In summary I have done this:
- Assembled the reads into contigs (82.5% of reads aligned to contigs >500 bp)
- Predicted CDS on contigs >500 bp
- Run BLASTP on the CDSs against the NCBI nr database; only 1 BLAST hit per gene was kept
- Uploaded BLASTP results to MEGAN for analysis
My understanding is that MEGAN needs more than one BLAST hit per query for the LCA algorithm to work properly. Since I have only 1 hit per query, the sequences are not assigned to a taxon using LCA. My question is: is the method I used for taxonomic classification correct? I am at the stage of writing the manuscript and worry that I did the analysis wrong.
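To make the concern concrete, here is a minimal sketch of a naive lowest-common-ancestor assignment. It is only an illustration under my own assumptions, not MEGAN's actual implementation: the `LINEAGES` table and the `lowest_common_ancestor` function are hypothetical, the lineages are a hand-written toy example rather than the NCBI taxonomy, and score-based filtering of hits is ignored.

```python
from typing import Dict, List, Optional

# Hypothetical toy lineages, listed from root to leaf.
# A real analysis would take these from the NCBI taxonomy.
LINEAGES: Dict[str, List[str]] = {
    "Escherichia coli": [
        "Bacteria", "Proteobacteria", "Gammaproteobacteria",
        "Enterobacterales", "Enterobacteriaceae",
        "Escherichia", "Escherichia coli",
    ],
    "Salmonella enterica": [
        "Bacteria", "Proteobacteria", "Gammaproteobacteria",
        "Enterobacterales", "Enterobacteriaceae",
        "Salmonella", "Salmonella enterica",
    ],
    "Pseudomonas aeruginosa": [
        "Bacteria", "Proteobacteria", "Gammaproteobacteria",
        "Pseudomonadales", "Pseudomonadaceae",
        "Pseudomonas", "Pseudomonas aeruginosa",
    ],
}


def lowest_common_ancestor(hit_taxa: List[str]) -> Optional[str]:
    """Return the deepest taxon shared by the lineages of all hits."""
    lineages = [LINEAGES[taxon] for taxon in hit_taxa]
    if not lineages:
        return None  # no hits, so nothing to assign
    lca = None
    # Walk down from the root while every hit agrees at that level.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        lca = level[0]
    return lca


if __name__ == "__main__":
    # One hit per query: the "LCA" is just the taxon of that hit.
    print(lowest_common_ancestor(["Escherichia coli"]))
    # Several hits: the assignment moves up to the shared ancestor.
    print(lowest_common_ancestor(["Escherichia coli", "Salmonella enterica"]))
    print(lowest_common_ancestor(list(LINEAGES)))
```

In this sketch, a query with a single hit is always placed on the taxon of that hit, however weak or ambiguous the match is, whereas a query with several hits is placed more conservatively on their shared ancestor. That difference is what I am worried about.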
Thank you in advance for all answers, advice and criticism!
Camila