SEQanswers

Old 11-29-2010, 02:06 PM   #1
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default Free & Open Environment for RNA-seq analysis: Galaxy (http://usegalaxy.org)

The Galaxy Team is excited to announce that the first free public resource for RNA-seq analysis is now available through the Galaxy public server at http://usegalaxy.org

Galaxy now supports both Tophat and Cufflinks and also provides useful utilities for manipulating and visualizing GTF files, which are common outputs for a Tophat-Cufflinks pipeline.

Here is an exercise for learning about how to use Galaxy for RNA-seq analysis.

This addition brings Galaxy's current NGS offerings to:

1. NGS QC and manipulation - contains a variety of tools for dealing with all flavors of fastq datasets as well as outputs of SOLiD and 454 instruments.
2. NGS Mapping - currently includes bowtie (Illumina & SOLiD), BWA (Illumina), and lastz (454) mappers. PerM (SOLiD) is on the way and more will be added in the coming months. Transcriptome tools (e.g., Tophat) are now available; see item 4.
3. NGS SAMTools - includes a variety of utilities for SAM/BAM manipulation. Some are based on the samtools library, some are written by the Galaxy team.
4. NGS RNA-seq tools - includes Tophat, Cufflinks, and useful utilities for manipulating and viewing GTF files.

Galaxy is an open and free web-based platform for performing accessible, reproducible, and transparent NGS analyses. Users can start using Galaxy by going to http://usegalaxy.org ; alternatively, Galaxy can be downloaded and run on any *NIX machine: http://bitbucket.org/galaxy/galaxy-c...wiki/GetGalaxy or run on cloud computing resources such as Amazon: http://usegalaxy.org/cloud

Here is the previous SEQAnswers announcement about Galaxy's initial NGS offerings.

Enjoy and please send us feedback!

The Galaxy Team
Old 12-16-2010, 12:13 AM   #2
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default

I have a problem running RNA-seq on Galaxy: I cannot save the BAM file (it saves as a BAM index by default) or SAM files. Also, do you have any plans to integrate DESeq into Galaxy, or is that not necessary?
Thanks.
Old 12-16-2010, 06:21 AM   #3
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

Hello,

(1) Clicking on the save icon (the disk) rather than the arrow will download the BAM file rather than the index. (This is a recent UI bug, and we've fixed it in our codebase; you'll see this fix when we update our main server.)

(2) I'm not sure why you wouldn't be able to save SAM files -- perhaps the size is very large and your browser times out or you're not waiting long enough for the file to download? Can you provide more details about the problem that you're having?

(3) DESeq could certainly be integrated into Galaxy, but we--the Galaxy team--are not currently working on it. Galaxy has many R-based tools already available and we both welcome and try to support submissions from the community for new tool wrappers.

Finally, Galaxy usage issues/questions are best sent to either [email protected] or [email protected]. These lists go to the entire Galaxy team and, in the case of galaxy-user, to the user community, and you should be able to get help more easily/quickly when you post there.

Best,
J.
Galaxy Team
Old 04-30-2011, 10:49 PM   #4
neoanderson
Junior Member
 
Location: australia

Join Date: Apr 2010
Posts: 6
Default

I'm not sure if this is the best place to post this, but here goes.
We recently obtained an RNA-seq dataset from which we want to derive differential expression lists.
Being new to this, I evaluated the Galaxy platform and found it very useful and interesting. I used the QC and mapping programs in Galaxy to obtain mapped BAM/SAM files. I recently stumbled across the rQuant package for Galaxy but am unable to install it. I have also downloaded the BAM files from the Galaxy server. I am trying hard to understand how to get from these BAM files to lists of up- or down-regulated genes for the condition tested. Thanks in anticipation.
Old 05-01-2011, 05:20 AM   #5
honey
Senior Member
 
Location: Pittsburgh

Join Date: Feb 2010
Posts: 151
Default question

Thanks for sharing your experience with Galaxy; you may also want to send this message to the galaxy-user list. You need to follow the RNA-seq workflow and run Cufflinks/Cuffdiff. The problem is that I am not sure whether you can really get to the point in Galaxy where you obtain a differential expression list of transcripts, genes, isoforms, or splice junctions. However, you can certainly take these BAM/SAM files and run further analysis outside Galaxy. There is also a nice tutorial on working with RNA-seq data (search the galaxy-user list). Jeremy Goecks may add more information about differential expression if I am missing something.
Old 05-01-2011, 09:39 AM   #6
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

@neo,

rQuant was developed by Gunnar Ratsch's Lab and is available via their public Galaxy instance at http://galaxy.tuebingen.mpg.de/ Questions should be directed towards them (Help menu --> Email questions) rather than the galaxy-user mailing list.

The galaxy-user mailing list (see my previous post) is the best place to ask questions about using Tophat/Cufflinks/compare/diff in Galaxy.

@honey,

Many users have gotten a functioning Tophat/Cufflinks/compare/diff pipeline working in Galaxy and have produced Cuffdiff quantitation and differential expression datasets. I think the Galaxy team has managed to address most of the big issues with this pipeline, but we're happy to help solve any particular problems that you may be having.
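For anyone who wants to see the shape of that pipeline outside the web interface, here is a rough Python sketch that only builds the command lines; it does not run anything. It assumes the usual calling conventions (`tophat -o <dir> <index> <reads>`, `cufflinks -o <dir> <hits.bam>`, `cuffcompare -o <prefix> <gtf>...`, `cuffdiff -o <dir> <gtf> <bam>...`) and the default output names `accepted_hits.bam`, `transcripts.gtf`, and `<prefix>.combined.gtf`. The function name, sample names, and directory layout are made up for illustration, and this is not necessarily how Galaxy invokes the tools.

```python
from pathlib import PurePosixPath


def cuffdiff_pipeline(index, reads_by_sample, workdir="rnaseq_out"):
    """Return argv lists for a Tophat -> Cufflinks -> Cuffcompare -> Cuffdiff run.

    `index` is a Bowtie index prefix and `reads_by_sample` maps a sample name
    to its FASTQ file. All directory names below are hypothetical.
    """
    work = PurePosixPath(workdir)
    commands, bams, gtfs = [], [], []
    for sample, reads in reads_by_sample.items():
        tophat_dir = work / f"{sample}_tophat"
        cufflinks_dir = work / f"{sample}_cufflinks"
        bam = tophat_dir / "accepted_hits.bam"  # assumed Tophat output name
        # Map the reads, then assemble transcripts for this sample.
        commands.append(["tophat", "-o", str(tophat_dir), index, reads])
        commands.append(["cufflinks", "-o", str(cufflinks_dir), str(bam)])
        bams.append(str(bam))
        gtfs.append(str(cufflinks_dir / "transcripts.gtf"))
    # Combine the per-sample assemblies, then test for differential expression.
    prefix = work / "cuffcmp"
    commands.append(["cuffcompare", "-o", str(prefix), *gtfs])
    commands.append(
        ["cuffdiff", "-o", str(work / "cuffdiff"), f"{prefix}.combined.gtf", *bams]
    )
    return commands


if __name__ == "__main__":
    samples = {"control": "control.fastq", "treated": "treated.fastq"}
    for argv in cuffdiff_pipeline("genome_index", samples):
        print(" ".join(argv))
```

Printing the commands first makes it easy to check paths before handing each list to something like `subprocess.run`.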

Best,
J.
Old 05-01-2011, 03:17 PM   #7
neoanderson
Junior Member
 
Location: australia

Join Date: Apr 2010
Posts: 6
Default

@honey, @jgoecks
Thank you for the quick reply. I really appreciate it.
I will write to the rQuant developers about the issues. One problem is that our Galaxy login details don't work at http://galaxy.tuebingen.mpg.de/ and it would be much easier if the SAM/BAM files generated by read mapping with Bowtie in Galaxy were available at http://galaxy.tuebingen.mpg.de/
thank you both once again.
cheers,
Neo
Old 05-02-2011, 06:30 AM   #8
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

@neo

Galaxy makes it relatively easy to move files from one instance to another:

(1) for the dataset that you want to move, right click on the save (disk) icon next to the dataset and copy its URL;
(2) for the instance that you want to copy the dataset to, paste the URL into the upload form.

Galaxy will then copy the dataset from one instance to another without you having to download it to your local computer.

Complete histories can be imported and exported as well, but this functionality is still in beta.

Best,
J.
Old 05-31-2011, 11:59 AM   #9
Sakti
Junior Member
 
Location: NY

Join Date: May 2011
Posts: 6
Default

Hi Jeremy,

This is an awesome tool. I'm new to RNA-seq and was getting dizzy by reading the tons of reports using different programs. I'm glad Galaxy simplified a fairly complicated analysis pipeline into such a simple one.

My only request is that you provide answers to the questions posed in the tutorial, so that I can check that my results agree with yours and feel more confident in my reasoning.

Thank you so much, this was very interesting.
Old 08-10-2011, 01:38 PM   #10
fangquan
Member
 
Location: Boston

Join Date: Jul 2011
Posts: 18
Default

Hi friend,
I have two questions.

(1) Is there a way to see the command lines run behind Galaxy's web interface?
(2) A few jobs are still waiting to run. If I shut down my PC, will they still run?

Thanks,
Quan
Old 08-10-2011, 02:07 PM   #11
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

Quan,

The answers to your questions depend on whether you're using our public server or running Galaxy locally/on the cloud.

If you're using our public server (main.g2.bx.psu.edu):

(1) you cannot see the command lines run by Galaxy;
(2) waiting jobs will be run even if you turn off your computer.

If you're running locally/on the cloud:

(1) you can see the command lines by viewing Galaxy's logs;
(2) waiting jobs will not be run unless your Galaxy is running.

Questions like this are best directed to one of our mailing lists: http://wiki.g2.bx.psu.edu/Support

Best,
J.
Old 08-11-2011, 05:11 AM   #12
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,583
Default

Quote:
Originally Posted by jgoecks
The Galaxy Team is excited to announce that the first free public resource for RNA-seq analysis is now available through the Galaxy public server at http://usegalaxy.org

Jeremy,

I am not sure if it was advertised before but galaxy now has a disk quota for user files on the public instance (I understand the reason for the decision).

I learned this from a galaxy mailing list answer yesterday. I feel that this should be pointed out as a footnote for this post and mentioned on the main page of galaxy.

Thanks for the great work you all do!
Old 08-16-2011, 05:46 AM   #13
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

GenoMax,

Quotas on our public instance are a new feature (within the last couple of weeks) and are being phased in slowly. Moreover, we're still in the process of determining what the quotas should be; currently they are:

(a) 50 GB per dataset;
(b) 200 GB per history;
(c) 4 concurrent NGS jobs.

Once we've determined what these will be, I'll update my initial post and we'll ensure that this information is prominently featured on the public site.

Best,
J.
Old 07-16-2012, 09:19 AM   #14
SilviaBCE
Junior Member
 
Location: rome

Join Date: Jun 2012
Posts: 3
Question Nucleotide bias in a specific position of the reads- Galaxy analysis

Hi, I'm analyzing my small-RNA-seq data (Illumina 1.9 quality scores) and using Galaxy for the preliminary QC tasks. I find it a great and easy tool! I'd like to ask how to interpret a graph: the nucleotide distribution chart produced after grooming the sample and trimming the 3' adapter. I attach it here so anybody can see it. So far I've loaded two samples into Galaxy, and both show this kind of bias at the 3rd nucleotide of the reads. What does it mean? Would you suggest eliminating all reads that contain an "N" in the 3rd position?
Any suggestion would be appreciated! Thanks a lot.
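
To make concrete what a nucleotide distribution chart summarizes, here is a small Python sketch that computes the per-position base fractions and the share of reads carrying an "N" at a given position. The function names and the toy reads are made up for illustration; this is not the code Galaxy uses to draw its chart.

```python
from collections import Counter


def base_composition(sequences):
    """Return one {base: fraction} dict per read position (0-based)."""
    counts = []
    for seq in sequences:
        for i, base in enumerate(seq.upper()):
            if i == len(counts):
                counts.append(Counter())  # first read that reaches this position
            counts[i][base] += 1
    return [
        {base: n / sum(column.values()) for base, n in column.items()}
        for column in counts
    ]


def fraction_with_n_at(sequences, position):
    """Fraction of reads covering `position` (0-based) that have an N there."""
    covered = [seq for seq in sequences if len(seq) > position]
    if not covered:
        return 0.0
    return sum(seq[position].upper() == "N" for seq in covered) / len(covered)


if __name__ == "__main__":
    reads = ["ACNT", "ACGT", "TGNA", "AC"]  # toy data, not real reads
    print(base_composition(reads)[2])
    print(fraction_with_n_at(reads, 2))  # the 3rd nucleotide is index 2
```

Checking how many reads would be lost before filtering on the 3rd position is probably worthwhile, since a filter on a single position can discard a large share of the data.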
Old 08-13-2012, 06:02 AM   #15
weijenc
Junior Member
 
Location: NY, USA

Join Date: Aug 2012
Posts: 7
Default Trimming Paired-End Data

Hello,

If I use a quality value < 20 to trim my Illumina dataset, which contains paired-end 100 bp reads, would both reads of a pair be removed if one of them has a base quality < 20? My worry is that when I use the trimmed dataset for de novo assembly, a program might reject the dataset as not paired-end if both reads are not removed together.

Thanks,


WJ
Old 08-13-2012, 02:59 PM   #16
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default Trimming PE reads in Galaxy

@weijenc

My suggestion for trimming paired-end reads in Galaxy is:

(1) Join them using the FASTQ joiner;
(2) Filter them using the Filter FASTQ tool;
(3) Split them using the FASTQ splitter.
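
The point of joining before filtering is that both mates are kept or dropped together, so the two files stay in sync. Outside Galaxy, the same idea can be sketched in Python as below. This is not the Galaxy tools' code: it assumes four-line FASTQ records, Sanger (Phred+33) qualities, mates in the same order in both inputs, and a made-up rule that every base in both mates must reach the threshold.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, sequence, quality) from four-line FASTQ records."""
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if len(record) < 4:
            return  # end of input (or a truncated trailing record)
        header, seq, _plus, qual = record
        yield header, seq, qual


def min_quality(qual, offset=33):
    """Lowest Phred score in a quality string (Sanger offset assumed)."""
    return min((ord(ch) - offset for ch in qual), default=0)


def filter_pairs(forward, reverse, threshold=20):
    """Yield (read1, read2) only when BOTH mates pass the quality threshold."""
    for r1, r2 in zip(read_fastq(forward), read_fastq(reverse)):
        if min_quality(r1[2]) >= threshold and min_quality(r2[2]) >= threshold:
            yield r1, r2


if __name__ == "__main__":
    forward = ["@r1/1", "ACGT", "+", "IIII", "@r2/1", "ACGT", "+", "II#I"]
    reverse = ["@r1/2", "TTTT", "+", "IIII", "@r2/2", "GGGG", "+", "IIII"]
    for r1, r2 in filter_pairs(forward, reverse):
        print(r1[0], r2[0])
```

Because a pair is dropped as a unit, the surviving forward and reverse reads remain in matching order for downstream assemblers.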

Best,
J.
Old 08-16-2012, 08:35 PM   #17
weijenc
Junior Member
 
Location: NY, USA

Join Date: Aug 2012
Posts: 7
Default Problem in grooming

Thanks for the reply to my previous post.

I have been trying to work with the paired-end dataset (SRR131208, two files). After grooming (Solexa to FASTQ Sanger), however, the quality values are between 0 and 5. Did I do something wrong?

Thanks,


WJ
Old 08-17-2012, 09:41 AM   #18
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

Your data is almost certainly not solexa format; most newer Illumina data is already fastqsanger, in which case the groomer is not needed.

See the Wikipedia entry for FASTQ for more details:

http://en.wikipedia.org/wiki/FASTQ_format

You should be able to look at the first few reads of your datasets to determine the FASTQ format.
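
As a rough aid for that check, here is a Python sketch that guesses the encoding from the lowest quality character seen. It relies on the ASCII ranges described in the Wikipedia entry (Sanger starts at 33, Solexa at 59, Illumina 1.3+ at 64); the function name is made up, and the heuristic can be fooled by a small or uniformly high-quality sample, so treat the answer as a hint only.

```python
def guess_fastq_encoding(quality_lines):
    """Guess a FASTQ variant from the lowest ASCII code in the quality lines.

    Heuristic only: characters below ';' (59) can occur only with a +33
    offset, and characters from ';' up to '?' (59-63) only in Solexa+64.
    """
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    if not codes:
        raise ValueError("no quality characters supplied")
    lowest = min(codes)
    if lowest < 59:
        return "fastqsanger"  # Phred+33
    if lowest < 64:
        return "fastqsolexa"  # Solexa+64
    # Nothing below '@': most likely Phred+64, though a sample of uniformly
    # high-quality Phred+33 reads would look the same.
    return "fastqillumina"


if __name__ == "__main__":
    print(guess_fastq_encoding(["IIIIHHH#", "IIGGFF5+"]))
```

If Sanger-encoded reads are groomed as Solexa, the scores are read with the wrong offset, which would be consistent with the very low values reported above.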

Best,
J.
Old 11-18-2012, 11:22 AM   #19
Peppe
Member
 
Location: USA

Join Date: Nov 2012
Posts: 11
Default

Hi all,
I am new to the forum and to RNA-seq analysis.
I just got the results of my RNA sequencing and am trying to map my reads using Tophat on Galaxy. I am working on Windows 7. After the FastQC analysis, I converted my reads with FASTQ Groomer and then ran Tophat. It has been 2 days and the job hasn't started yet.

Does Tophat (in Galaxy) run on Windows 7?
How long does a mapping analysis usually take (the reference genome is about 20 Mb)?

Thanks
Old 11-19-2012, 10:40 AM   #20
jgoecks
Member
 
Location: Washington, D.C.

Join Date: Jan 2010
Posts: 28
Default

@Peppe

I assume you're using the public Galaxy server at https://main.g2.bx.psu.edu/ , yes?

If so:

(a) Galaxy will work fine on Windows, though you'll want to use Firefox or Chrome as your Web browser going forward so that you can use all of Galaxy's functionality.

(b) The server is very busy right now, so it may take a couple days for your job to start. Do not restart the job or it will go to the end of the wait list. Once your job starts, it should go quickly (4-8 hours is a good estimate) because your genome is small.

Best,
J.
Tags
cuffcompare, cuffdiff, cufflinks, galaxy, tophat
