I'm a trained hematologist but am new to research and genomic work. We want to develop a classifier based on a gene and its target genes. In a cohort of tumor samples profiled for about 16,000 genes, we tested 8 gene sets of 21-130 genes each and found that many of these genes show coordinate enrichment.
Now I need to develop a classifier so we can use NanoString to categorise patients as high or low expressers of the set. Ideally we want this to become a clinical test.
This is where I am floundering. How do I select the best gene set for a classifier? Does it need to be physiological or just statistical in origin?
Any advice gratefully received. I am improving my stats knowledge but it remains rudimentary so be gentle!