Hi,
I'm working on an RNA-seq experiment in a non-model organism. I assembled a transcriptome with Trinity and ran GOseq to test for GO enrichment in clusters of DE transcripts. As background genes, I used all the DE transcripts with expression > 1 TPM.
The analysis ran and returned a multitude of GO terms. However, I'm questioning its validity because many of the assembled DE transcripts (say 2/3) have no match in Swiss-Prot (even when BLASTed with loose parameters) and thus no GO annotation.
My question, therefore, is: how does GOseq deal with missing annotations? For the background genes in particular, is it acceptable to assume that the GO terms available for 2/3 of the expressed transcripts are representative of the whole set of transcripts? Is that what GOseq does?
Thanks,
Antoine