SEQanswers

Old 05-31-2011, 11:54 AM   #1
Chuckytah
Member
 
Location: Barcelos, Braga, Portugal

Join Date: Mar 2011
Posts: 65
Question [NGS - analysis of gene expression data] Machine Learning + RNAseq data

Hello,

I'm a Portuguese master's student in bioinformatics.

Could you recommend some good websites/articles on these topics:
- Next generation sequencing for gene expression measurement (RNA-seq)
- data analysis challenges in RNA-Seq data
- applications of machine learning in classification of RNAseq data
- identification and critical analysis of available tools


I would like to focus more on "applications of machine learning in
classification of RNAseq data". Do you know of a practical case that I could
present to my classmates?

Thanks a lot,

Inês Martins
University of Minho, Portugal
Old 06-07-2011, 12:19 AM   #2
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269
Default

Hi Inês,

Here is a recent review about RNA-seq challenges in bioinformatics:
http://www.nature.com/nmeth/journal/...d=NMETH-201106

However, no real "machine learning" method for RNA-seq comes to mind. I do not recall Cufflinks or Scripture being trained on a learning dataset. Are they?
Old 06-07-2011, 12:30 AM   #3
altodor
Member
 
Location: Sweden

Join Date: Nov 2009
Posts: 12
Default Storey paper

There are excellent references in this paper:

http://www.nature.com/nbt/journal/v2..._id=NBT-201104

John Storey

H. Craig Mak
Nature Biotechnology 29, 331–333 (2011) doi:10.1038/nbt.1831
Published online 08 April 2011
John Storey provides his take on the importance of new statistical methods for high-throughput sequencing.

Good luck!
Old 06-13-2011, 01:27 PM   #4
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 991
Default

By "machine learning", do you mean clustering and classification? To my knowledge, not much has been done there so far. This is because you typically want large data sets with tens, or better hundreds, of samples to put ML techniques to fruitful use, and at that scale microarrays are still preferred because they are still cheaper. So, have a look at what people have done in microarray studies. I'm sure people will start thinking about how to adapt these methods to RNA-Seq pretty soon, so stay tuned.
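
To make "classification" concrete, here is a minimal sketch of a nearest-centroid classifier, one of the simpler approaches applied to expression profiles. It is only an illustration under stated assumptions: the counts and the class labels "A" and "B" are invented toy values, the log transform of counts per million is one possible preprocessing choice rather than a recommendation, and the function names are hypothetical.

```python
import math


def log_cpm(counts):
    """Log2 of counts per million (plus 1) for one sample's gene counts."""
    total = sum(counts)
    return [math.log2(c * 1e6 / total + 1.0) for c in counts]


def centroid(profiles):
    """Gene-wise mean of a list of expression profiles."""
    n = len(profiles)
    return [sum(column) / n for column in zip(*profiles)]


def train(profiles, labels):
    """Compute one centroid per class label."""
    by_label = {}
    for profile, label in zip(profiles, labels):
        by_label.setdefault(label, []).append(profile)
    return {label: centroid(group) for label, group in by_label.items()}


def predict(centroids, profile):
    """Return the label whose centroid is closest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], profile))


# Toy read counts for three genes in four training samples (invented values).
train_counts = [[900, 50, 50], [800, 100, 100], [100, 450, 450], [150, 400, 450]]
train_labels = ["A", "A", "B", "B"]

centroids = train([log_cpm(c) for c in train_counts], train_labels)

# Two new samples (also invented) to classify.
pred_first = predict(centroids, log_cpm([850, 80, 70]))
pred_second = predict(centroids, log_cpm([120, 500, 380]))
```

With only a handful of samples a classifier like this cannot be validated in any meaningful way, which is the sample-size point made above.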
Old 06-13-2011, 04:21 PM   #5
Chuckytah
Member
 
Location: Barcelos, Braga, Portugal

Join Date: Mar 2011
Posts: 65
Default

Thank you all, guys.
Old 03-03-2012, 09:27 AM   #6
Chuckytah
Member
 
Location: Barcelos, Braga, Portugal

Join Date: Mar 2011
Posts: 65
Default

On the topic: Next generation sequencing for gene expression measurement

I'm computing RA (relative abundances)... I take the number of ESTs in one gene and divide it by the total number of ESTs in the sample... is this right?
Old 03-05-2012, 01:11 AM   #7
sphil
Senior Member
 
Location: Stuttgart, Germany

Join Date: Apr 2010
Posts: 192
Default

It's not enough to normalise by the number of ESTs in the sample. Let me refer you to RPKM (PMID: 18516045), which also accounts for transcript length, or to packages like DEGseq and RNA-SeQC.
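
As a rough sketch of the difference, the snippet below computes both the plain relative abundance and RPKM (reads per kilobase of transcript per million mapped reads) for the same genes. The gene names, counts and lengths are invented toy values, the function names are hypothetical, and the total number of mapped reads is assumed to be the sum of the counts shown. This is not how any particular package implements it.

```python
def relative_abundance(counts):
    """Fraction of all counted reads that fall in each gene."""
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


def rpkm(counts, lengths_bp):
    """RPKM = 1e9 * reads_in_gene / (gene_length_bp * total_reads)."""
    total = sum(counts.values())
    return {
        gene: n * 1e9 / (lengths_bp[gene] * total)
        for gene, n in counts.items()
    }


# Toy numbers, invented for illustration only.
counts = {"geneA": 500, "geneB": 500, "geneC": 1000}
lengths_bp = {"geneA": 1000, "geneB": 4000, "geneC": 2000}

ra = relative_abundance(counts)
values = rpkm(counts, lengths_bp)

for gene in counts:
    print(gene, ra[gene], values[gene])
```

In this toy case geneA and geneB have the same relative abundance, but geneB is four times longer, so its RPKM is four times lower.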
Old 03-05-2012, 03:16 AM   #8
steven
Senior Member
 
Location: Southern France

Join Date: Aug 2009
Posts: 269
Default

Quote:
Originally Posted by Chuckytah
On the topic: Next generation sequencing for gene expression measurement

I'm computing RA (relative abundances)... I take the number of ESTs in one gene and divide it by the total number of ESTs in the sample... is this right?

I am confused by the terminology. What exactly is your data/experiment? ESTs are not reliable and should not be used to infer expression values. If you are doing a Digital Gene Expression or SAGE experiment, then it is possible. It is possible with full RNA-seq too, but the methods differ.

With SAGE tags you do not expect the length of the transcript to matter, as the goal is for all the reads of a given transcript to originate from a single position. For whole-transcriptome sequencing, RNA-seq reads should originate from all over the transcripts. The number of reads is therefore expected to correlate with the length of the transcript, so a normalization may be required (see RPKM above).

Note that if you are comparing the expression of the same gene between different conditions, this length normalization is not required, since the transcript length is the same in both conditions and cancels out.
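
A small numerical sketch of that last point: for a single gene, the fold change between two conditions is the same whether or not the counts are divided by the transcript length. All numbers are invented, the function names are hypothetical, and the sketch assumes each sample is still scaled by its own total number of reads.

```python
def per_million(count, total_reads):
    """Reads per million: corrects for sequencing depth only."""
    return count * 1e6 / total_reads


def per_kb_per_million(count, length_bp, total_reads):
    """RPKM-style value: corrects for depth and transcript length."""
    return count * 1e9 / (length_bp * total_reads)


# One gene, two conditions (toy values).
length_bp = 3000
count_a, total_a = 300, 2_000_000
count_b, total_b = 900, 3_000_000

fc_depth_only = per_million(count_b, total_b) / per_million(count_a, total_a)
fc_with_length = (
    per_kb_per_million(count_b, length_bp, total_b)
    / per_kb_per_million(count_a, length_bp, total_a)
)

print(fc_depth_only, fc_with_length)
```

The length appears in both the numerator and the denominator of the ratio, so it drops out. Comparing two different genes within one sample is where the length correction matters.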

Tags
data analysis, machine learning, ngs, rnaseq
