After analyzing a metagenomic sample with MG-RAST, you typically go to the Metagenome Overview page to see how it performed.
In the 'Metagenome Summary' and 'Analysis Flowchart' sections you get some first numbers that give you an idea of the process. The thing is, they are close but do not match exactly. For example:
1) "7,739,035 sequences (48.5%) contain predicted proteins with known functions"
2) "a total of 7,662,611 predicted protein coding regions"
What is the difference between these two sections? Which alignment parameters are behind these statistics? I mean identity (%), minimum alignment length and maximum E-value.
Then, at 'Analysis Flowchart', it says:
"2,812,269 features (84.9% of annotated features) were assigned to functional categories"
Which functional categories? SEED? And should that number match the Subsystems matrix I downloaded using 60% identity, 15 bp min-length and 10e-05 E-value (the defaults)?
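To make the comparison concrete, here is a rough Python sketch that backs out the totals implied by the quoted percentages. The helper name `implied_total` is made up for illustration, and since the page rounds percentages to one decimal, the results are only approximate.

```python
def implied_total(count: int, percent: float) -> float:
    """Return the total of which `count` is `percent` percent."""
    return count / (percent / 100.0)


# Figures quoted from the Metagenome Overview page above.
known_function_seqs = 7_739_035  # reported as 48.5%
functional_features = 2_812_269  # reported as 84.9% of annotated features

# Approximate denominators behind each percentage (assumes the
# percentages are simple count/total ratios, which is my guess).
total_sequences = implied_total(known_function_seqs, 48.5)
annotated_features = implied_total(functional_features, 84.9)

print(f"implied total sequences:    ~{total_sequences:,.0f}")
print(f"implied annotated features: ~{annotated_features:,.0f}")
```

If those implied totals are right, the two statements seem to be counting different things (sequences vs. features), which may be part of why the numbers do not line up.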
Cheers!