SEQanswers

Old 10-11-2011, 04:04 AM   #1
Kashliks
Junior Member
 
Location: Latvia

Join Date: Oct 2011
Posts: 6
Data analysis A-Z

Hi,
This is a wonderful forum, but I wonder why there isn't (at least I can't find one) an "NGS data analysis from A to Z for dummies" thread.
Something like:
1. If you're using 454, then align reads using XX software and XX reference sequence with settings x y z
example command: samtools mpileup [-EBug] [-C capQcoef] [-r reg] [-f in.fa] [-l list] [-M capMapQ] [-Q minBaseQ] [-q minMapQ] in.bam

2...
3...


Z. And here is your genome sequence with genetic variant annotation and all the common QC steps done, ready for case-control (familial, etc.) studies (or whatever you do with it)

Any help with this dream of mine would be appreciated, because I am a noob in this NGS thing but have to learn.

Thank you!
Old 10-11-2011, 05:29 AM   #2
Simon Anders
Senior Member
 
Location: Heidelberg, Germany

Join Date: Feb 2010
Posts: 994

If you want an honest but maybe unwelcome answer: Because good data analysis cannot be done by following a cookbook recipe. There is no single correct way to do it, because two different analysis tasks are hardly ever the same. There is a reason that most tools offer you so many options and tuning parameters.

To be frank, it distresses me a bit how many new users stumble into this forum and hope that they can analyze their data just by googling, without having to read papers, reviews and textbooks.

I am not a wetlab person, but I imagine there is also no single explanation of "how to perform a chromatin immunoprecipitation for dummies". I imagine you may need very different procedures depending on what kind of sample and what antibody you are working with, what you hope to find, etc. And, not to forget, you need to know how to check that everything went well. Hence, you will not get around reading a lot before starting.

Once a technique is decades old, there might be standard approaches (typically, I imagine: buy some kit, put it into tube with sample, shake) but high-throughput sequencing is still under active development and recommended practices change monthly.

So, please don't take this personally, but as advice to you as a newcomer to the field: Please understand that this is a subject as complex and as much in need of good planning as any other part of an experiment.
Old 10-11-2011, 06:20 AM   #3
Kashliks
Junior Member
 
Location: Latvia

Join Date: Oct 2011
Posts: 6

Thanks for the honesty.

Still, most commercial packages offer a "fast analysis" with default settings for cases where there are no quality problems, which gives novices a starting point.
And I believe that in your projects there are settings that you use for more than 50% of your samples, changing only the parameters that depend on the computing power available and the number of samples to be analysed. (Excuse me for making these assumptions, but my experience tells me that this is the case with most wet-lab methods and with many kinds of statistical data analysis.)

I don't do ChIP, but here is a starting point:

http://mcardle.oncology.wisc.edu/sug...%20Dummies.pdf


Excuse me for my ignorance.
Old 10-11-2011, 06:44 AM   #4
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

That ChIP protocol is essentially the old Farnham protocol. While it has worked well for a lot of experiments, there have been significant improvements made. I am also sure that different protocols work better for some epitopes than others. IgG used to be the 'control' for ChIP, and now people use relative enrichment compared to a negative locus, normalized to input. That being said, ChIP has been around for a long time, so there are some pretty good kits out there.

Peak calling for ChIP has matured pretty well, and you could get away with a cookie-cutter approach for most experiments. But expect that to keep changing. And if you want to get more creative with your analysis, you're stuck.

But that is just ChIP. The samples sequenced by next-gen sequencing come not just from a bunch of different techniques but also from vastly different fields of biology. And as already mentioned, things are moving forward at the speed of light right now. Look at all the file formats. It's a big mess. Everybody wants something different.

Companies like CLC are working on making nice user-friendly programs, but you are going to be a step behind the curve and pretty inflexible with your analysis if you limit yourself to such a program.
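
To make that normalization concrete, here is a minimal sketch, assuming a ChIP-qPCR readout and roughly 100% PCR efficiency (neither is stated in this thread). The function names, the Ct values and the 1% input fraction are hypothetical and chosen only for illustration. It computes the percent of input recovered at a target locus and at a negative locus, then takes their ratio as the relative enrichment.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction):
    """Percent of input chromatin recovered in the IP, from qPCR Ct values.

    input_fraction is the fraction of chromatin set aside as input
    (e.g. 0.01 for a 1% input). Assumes one doubling per PCR cycle.
    """
    # Shift the input Ct to what it would be for 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def relative_enrichment(target_pct_input, negative_pct_input):
    """Enrichment at the target locus relative to a negative (unbound) locus."""
    return target_pct_input / negative_pct_input


if __name__ == "__main__":
    # Made-up Ct values, for illustration only.
    target = percent_input(ct_ip=24.0, ct_input=25.0, input_fraction=0.01)
    negative = percent_input(ct_ip=27.0, ct_input=25.0, input_fraction=0.01)
    print(f"target: {target:.2f}% of input")
    print(f"negative locus: {negative:.2f}% of input")
    print(f"relative enrichment: {relative_enrichment(target, negative):.1f}-fold")
```

With these made-up numbers the target comes out at about 2% of input and the negative locus at about 0.25%, so the enrichment is about 8-fold. Real experiments may need an efficiency correction per primer pair, which this sketch leaves out.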
__________________
--------------
Ethan
Old 10-11-2011, 06:52 AM   #5
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

There are some guides around here if you look. You do have a point: it would be good if these more informative threads lived in a special place where they didn't get lost among threads about an error someone got running some program.

http://seqanswers.com/forums/showthread.php?t=7068

http://seqanswers.com/forums/showthread.php?t=14038
__________________
--------------
Ethan
Old 10-11-2011, 09:18 AM   #6
Joann
Senior Member
 
Location: Woodbridge CT

Join Date: Oct 2008
Posts: 231
Kudos to Simon Anders, and Welcome Kashliks!

Simon Anders' point is a very important one and needs to be thoroughly understood. While a sequence is a sequence, the role of bioinformatics in Next Gen sequence analysis is inseparable from the rest of the experiment; it makes or breaks its credibility. Biologists will soon be choking on sequence data (if they are not already), so dry-lab-based scientists must be allowed to contribute their innovations within this research endeavor, all the more so at the leading edges.

A novice wet lab scientist may apply some generalized HTS methodology to obtain a result, but ought not to claim novelty beyond the limitations of the methods employed. That restraint is impossible unless the system limitations are appreciated, and this is where the bioinfo side is critical.

It's really a new species of collaboration.
Old 10-11-2011, 09:46 AM   #7
westerman
Rick Westerman
 
Location: Purdue University, Indiana, USA

Join Date: Jun 2008
Posts: 1,104

Going back to the original post: The wiki, for all of its weaknesses, is the place to look:

http://seqanswers.com/wiki/How-to

Contribute to it as you can. The wiki is sort of cookbook-like but, as Anders implied, doing analysis via cookbook is limited.
Old 10-11-2011, 10:10 AM   #8
biznatch
Senior Member
 
Location: Canada

Join Date: Nov 2010
Posts: 124

Partek has a "Step 1., Step 2., Step 3..." kind of interface. I haven't really used it, just been to info sessions for it. I've only used the various command line/open source programs, but our genomics core has a copy and a number of labs use it. Besides cost, what would be the main drawbacks of using Partek?
Old 10-11-2011, 10:58 AM   #9
ETHANol
Senior Member
 
Location: Western Australia

Join Date: Feb 2010
Posts: 308

I wish I could add Partek to my options. As a biologist, I think it sounds easy. But then again, I tried CLC when I was getting going and didn't really like it; I'm sure it is better now. But I wouldn't want to spend the money and be limited to Partek or CLC. The more tools you have in your toolbox that you know how to use, the easier it will be and the better you'll be able to answer the questions you want to ask. And having more tools will probably even help you ask better and more interesting questions.

In response to what Joann says, I think the bench scientist who doesn't learn how to analyze their next-gen sequencing data is destined for the back seat of scientific discovery. It's not 1999 anymore, when bioinformaticians were often more like support staff whose authorship on papers was somewhere in the middle of the list. There is real innovation and discovery in the analysis of data going on today. It used to be more like bioINFORMATICIAN; now it's more like BIOinformatician. There are real biologists that work with computers now.
__________________
--------------
Ethan
Old 10-13-2011, 12:40 AM   #10
Kashliks
Junior Member
 
Location: Latvia

Join Date: Oct 2011
Posts: 6

I sincerely thank everyone for their input.
And especially for the links.
And I totally agree that, in today's stream of data, bioinformaticians are key players in biological discovery.
Old 10-15-2011, 01:10 PM   #11
Nomijill
Member
 
Location: Southwest Florida

Join Date: Sep 2009
Posts: 24

Hi Kashliks,

If you want to try the trial license for CLC bio's Genomics Workbench, it includes access to a variety of tutorials that are laced with references to foundational publications. It would be a nice orientation for you.

Best of luck with your analysis and research.

Naomi
Old 10-17-2011, 04:43 PM   #12
phoss
Member
 
Location: Beltsville, MD

Join Date: Aug 2011
Posts: 12

Hi Kashliks,
We were all beginners in the world of NGS at some point.

My 2c: I'd encourage you to work out what the goal of NGS is and why it would fit your project goals. As Simon Anders and others noted earlier, technologies come and go, just like computer languages. If you learn what they all have in common, you can easily adapt to newer techs.
As Simon Anders noted above, the field is increasingly complex, and 'quick start' guides may not cover the groundwork needed to carry out strong, solid research. Keep up with new manuscripts, tools and conferences; all of this combined will prove powerful.
Best,
All times are GMT -8.