SEQanswers

Old 07-08-2014, 05:38 PM   #1
seqador
Junior Member
 
Location: Brazil

Join Date: Aug 2013
Posts: 9
Little desperate and alone (Help Me)

Hello world!
Well, I am a master's degree student and I need your help. I think this forum can help me find more resources on the web.
I did an experiment and extracted total RNA from mouse hearts. I sent this material to BGI for sequencing on an Illumina HiSeq, for 10M reads.
I have been studying the pipeline, from the raw sequences through to visualization, but I do not have the computer skills needed to perform each step. For example:
I have raw sequences and I need to trim them and then align them to the reference genome, but what are the Linux commands and the bioinformatics tools for that?
I would greatly appreciate references, books, sites, articles, or any little help.
Thank you very much in advance, and sorry for my poor English.
Old 07-08-2014, 07:52 PM   #2
N311V
Member
 
Location: Australia

Join Date: Jul 2013
Posts: 34

Maybe start here, but remember this is only a guide. The appropriate workflow and analysis may differ for your data.

http://www.broadinstitute.org/gatk/g...ices?bpm=index
Old 07-08-2014, 08:03 PM   #3
seqador
Junior Member
 
Location: Brazil

Join Date: Aug 2013
Posts: 9

Quote:
Originally Posted by N311V View Post
Maybe start here, but remember this is only a guide. The appropriate workflow and analysis may differ for your data.

http://www.broadinstitute.org/gatk/g...ices?bpm=index
Thank you very much.
Old 07-08-2014, 08:04 PM   #4
blancha
Senior Member
 
Location: Montreal

Join Date: May 2013
Posts: 367

I've never used it myself, but the Galaxy project may be the best answer for you.
They offer a web-based interface to all these command line tools.
I don't know what the wait time is on their public server, though.
https://usegalaxy.org

I would not run the analysis on a personal desktop, unless you have a lot of RAM and hard drive space.
To run the analysis on your own Linux server, you'll need to install the following tools.
Trimming: Trimmomatic
Alignment: TopHat
Gene quantification: Cufflinks (or htseq-count and DESeq2, but these tools are a bit harder to use).

You'll also need the genome and its annotation.
I would recommend downloading the Illumina iGenomes package for the mouse, which bundles the genome sequence, aligner indexes, and gene annotation.

You can easily find all these resources by googling them.
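
As a rough sketch of how the three tools listed above chain together, the short Python script below only builds and prints the command lines; it does not run anything. Every file name, the jar path, the thread count, and the choice of single-end mode with Phred+33 qualities are assumptions made for illustration, so check them against your own data and each tool's manual before running the printed commands.

```python
import shlex

# Hypothetical file names and paths: replace them with your own.
RAW_READS = "heart_sample.fq.gz"
TRIMMED_READS = "heart_sample.trimmed.fq.gz"
TRIMMOMATIC_JAR = "trimmomatic.jar"      # the real jar name includes a version number
ADAPTERS = "TruSeq3-SE.fa"               # adapter file shipped with Trimmomatic
BOWTIE2_INDEX = "Bowtie2Index/genome"    # index prefix, e.g. from the iGenomes download
ANNOTATION = "genes.gtf"                 # gene annotation, e.g. from the iGenomes download
THREADS = 4


def build_pipeline():
    """Return (step name, command) pairs for trimming, alignment, and quantification."""
    # Assumes single-end reads with Phred+33 quality scores.
    trim = [
        "java", "-jar", TRIMMOMATIC_JAR, "SE", "-phred33",
        RAW_READS, TRIMMED_READS,
        f"ILLUMINACLIP:{ADAPTERS}:2:30:10",
        "LEADING:3", "TRAILING:3", "SLIDINGWINDOW:4:15", "MINLEN:36",
    ]
    # TopHat aligns the trimmed reads and writes accepted_hits.bam into its output directory.
    align = [
        "tophat", "-p", str(THREADS), "-G", ANNOTATION,
        "-o", "tophat_out", BOWTIE2_INDEX, TRIMMED_READS,
    ]
    # Cufflinks estimates expression for the annotated genes from the alignments.
    quantify = [
        "cufflinks", "-p", str(THREADS), "-G", ANNOTATION,
        "-o", "cufflinks_out", "tophat_out/accepted_hits.bam",
    ]
    return [("trim", trim), ("align", align), ("quantify", quantify)]


if __name__ == "__main__":
    for name, command in build_pipeline():
        print(f"# {name}")
        print(shlex.join(command))
```

Printing the commands first lets you read and adjust each one before pasting it into a terminal. Paired-end data would need Trimmomatic's PE mode and two read files per sample instead.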

If the wait time on the Galaxy public server is too long, you're probably better off finding a bioinformatician with access to a Unix server to help you.

I suppose it is your supervisor's idea that you do the analysis yourself. Time and again, I've seen principal investigators overestimate the ability of their wet lab students to use the Unix command line, as well as overestimate their students' knowledge of the basic statistics required to understand the results of a differential expression analysis.
Old 07-09-2014, 08:28 AM   #5
NextGenSeq
Senior Member
 
Location: USA

Join Date: Apr 2009
Posts: 482

The simple way to proceed is to download a demo version of CLC Bio or NextGene.

10 million reads is not very many. A decent desktop computer can align this.
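
To put a rough number on "not very many", here is a back-of-the-envelope estimate of the uncompressed FASTQ size. The read length (100 bp), the header length (60 characters), and single-end layout are assumptions, since the thread does not state them.

```python
def fastq_size_bytes(n_reads, read_length, header_length):
    """Approximate uncompressed FASTQ size in bytes for single-end reads."""
    # Each record has four lines, each ending in a newline:
    # header, bases, a lone "+", and one quality character per base.
    record = (header_length + 1) + (read_length + 1) + 2 + (read_length + 1)
    return n_reads * record


if __name__ == "__main__":
    # Assumed values: 100 bp reads and 60-character headers.
    size = fastq_size_bytes(n_reads=10_000_000, read_length=100, header_length=60)
    print(f"{size / 1e9:.2f} GB uncompressed")  # prints "2.65 GB uncompressed"
```

Under those assumptions the raw file is a few gigabytes, which fits easily on an ordinary desktop disk. The memory needed by the aligner is a separate question that this estimate does not address.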
Old 07-09-2014, 08:38 AM   #6
jwfoley
Senior Member
 
Location: Stanford

Join Date: Jun 2009
Posts: 179

Quote:
I suppose it is your supervisor's idea that you do the analysis yourself. Time and again, I've seen principal investigators overestimate the ability of their wet lab students to use the Unix command line, as well as overestimate their students' knowledge of the basic statistics required to understand the results of a differential expression analysis.
Too true. "Computer literacy" is an excellent metaphor; if you're not already proficient with statistics and the Unix-like command line, then your PI's asking you to learn those things just to analyze one sequencing run is like asking you to learn Ancient Greek just to translate one document. I do encourage every scientist who works with large datasets to learn these things, but don't hold up your whole project for it. Even aside from the huge delay while you start your education from scratch, you're inevitably going to make mistakes and get wrong results (possibly without knowing it) the first time. At least get an expert to do the analysis for you and then go over her scripts to understand how they work.

Last edited by jwfoley; 07-09-2014 at 08:53 AM.
Old 07-11-2014, 07:18 PM   #7
seqador
Junior Member
 
Location: Brazil

Join Date: Aug 2013
Posts: 9

Thanks to everybody who helped me!
Old 08-01-2014, 09:17 AM   #8
jwag
Member
 
Location: USA

Join Date: Apr 2013
Posts: 42

You could check out Practical Computing for Biologists. It gives a pretty good intro to using command-line interfaces, setting up environments, etc.
Old 08-02-2014, 06:36 AM   #9
Zapages
Member
 
Location: NJ

Join Date: Oct 2012
Posts: 94

I would recommend iplantcollaborative.org, as it has a lot of useful tools and guides on what to do and how to do it, including visualization of the results.
Old 09-26-2014, 01:16 AM   #10
tomc
Member
 
Location: Oregon

Join Date: Feb 2011
Posts: 29

Greg Wilson's Software Carpentry is designed to help people in your position.
Contact them and convince your university to invite them down for a bootcamp.
In the meantime, they have their teaching materials online:

http://software-carpentry.org/lessons.html

N.B. I have no competing interests, just respect for Greg's work.

Last edited by tomc; 09-26-2014 at 01:20 AM. Reason: add disclaimer
Old 06-04-2015, 09:53 PM   #11
QazSeDc
Junior Member
 
Location: Hong Kong

Join Date: Jun 2015
Posts: 7

Quote:
Originally Posted by jwfoley View Post
Too true. "Computer literacy" is an excellent metaphor; if you're not already proficient with statistics and the Unix-like command line, then your PI's asking you to learn those things just to analyze one sequencing run is like asking you to learn Ancient Greek just to translate one document. I do encourage every scientist who works with large datasets to learn these things, but don't hold up your whole project for it. Even aside from the huge delay while you start your education from scratch, you're inevitably going to make mistakes and get wrong results (possibly without knowing it) the first time. At least get an expert to do the analysis for you and then go over her scripts to understand how they work.
I agree with jwfoley. If you don't have any bioinformatics skills, you'd better ask someone to do it for you for now. I think BGI does provide data analysis plans, but of course you'll have to pay extra.