**Nicolas** · 11-28-2012, 04:39 PM

Looking forward to seeing the results.
It may be too late, but what would also be valuable would be to survey the source of annotations people are using (RefSeq, Gencode, and so on). There are good reasons to use one or the other, but I would be curious to see the results.

**bodhisattvax** · 11-29-2012, 12:24 AM

Thanks to everyone who has responded so far!
Please keep responding ....
Nicolas - that is a great suggestion and I've included your question in the survey. Hopefully it is not too late and lots more people will answer the survey.

**syfo** · 11-29-2012, 12:54 AM

Originally posted by Nicolas

Looking forward to seeing the results.

Me too!

Originally posted by Nicolas

It may be too late, but what would also be valuable would be to survey the source of annotations people are using (RefSeq, Gencode, and so on). There are good reasons to use one or the other, but I would be curious to see the results.

"Custom" annotations would have been nice too (import your own).

**bodhisattvax** · 11-29-2012, 11:06 PM

Thanks to everyone who has answered the survey so far - already some trends are becoming evident!

However, for the results to be accurate and representative, we need more respondents. So I urge anyone who hasn't yet answered the survey to please do so - it really is a very short survey!

**RickBioinf** · 11-30-2012, 12:40 AM

For more respondents maybe also try: http://www.reddit.com/r/bioinformatics

**bodhisattvax** · 11-30-2012, 03:00 AM

Thanks Rick!
Have also put it on BioStar.

**bodhisattvax** · 12-03-2012, 02:01 AM

A quick update:
Again, thanks for all the responses so far.

I think I'm pretty satisfied with the number of responses and will start to collate the results and generate a report which I will share with everyone.

This is less trivial than I had initially thought, as there doesn't seem to be an easy way to get the responses off SurveyMonkey without paying them for it. But I hope to have all this done over the next 2-3 days.

Meanwhile, if anyone else would like to complete the survey please feel free to do so!
Cheers

**ramma** · 12-03-2012, 03:26 PM

Great idea! Building a 'standard' pipeline is an idea I've toyed with for a while myself. The furthest I've made it is writing sets of scripts that work for all types of data my lab receives. Simply execute the few scripts in order, and most everything is taken care of. The differential analysis part still needs to be implemented, but it's fairly easy to do as is.

**bodhisattvax** · 12-04-2012, 11:00 AM

The results!!!!!

Hi all

I've finally put together the results of the survey!

First of all, thanks to everyone who participated - the response has been great, with 93 people completing the survey as of today.

The respondents have been a varied bunch, including all levels of academia (pre-docs, grad students, post-docs and PIs), core bioinformaticians and bioinformatics managers, as well as many from industry. The majority of respondents appear to be based in the US and Europe, with others in China, Korea and Australia.

I provide below my own summary of the survey's findings, and I attach a document which contains all the results, including all unedited comments. As with any survey, we should probably be aware of potential biases (e.g. skews caused by people who are really annoyed with a particular tool!).

My inferences below are probably influenced by my own experiences, so feel free to rap my knuckles if you feel I am over-reaching in my inferences or have misinterpreted the data, and to air your doubts about the veracity and accuracy of the results and conclusions. I'd also like to declare here that I have no vested interests, have nothing to gain by promoting one tool over another, and have personally only used a small number of all the tools listed.

Now for the summary. Enjoy!

One of the take-home messages from the survey appears to be that the shadow of the Tuxedo Suite still looms large over the RNA-Seq analysis field. However, there is a wide diversity of opinions and experiences, and many other tools appear to be in the ascendancy, especially when it comes to read-counting and differential expression analysis.

Q1. What do you prefer to align your reads to?

Most respondents align to the genome only (47.3%), closely followed by those who align to both genome and transcriptome (39.8%). Key to their choices has been the availability and reliability of data, as well as the question being asked in the experiment. Respondents who chose to align to the genome only appear to do so for various reasons, such as the ability to discover new transcripts and splice variants. However, many respondents commented that aligning to both the genome and transcriptome offers several advantages, such as increased speed and accuracy. Thus, for a species where both a reliable genome and transcriptome are available, this might be the optimal way forward.

Q2 and 3. What is your preferred aligner? And the reasons why.

Tophat rules the roost here, taking more than two-thirds of the vote (67.9%). Reasons for this include its ease of use, proven accuracy (which has improved over time), historical popularity, and that the alternatives available have not yet warranted a change from Tophat. Another Tuxedo suite aligner, Bowtie, comes in at a distant second (17.3%). STAR (6.2%) has been noted for its speed.

Q4 and 5. What is your preferred read-counting methodology? And the reasons why.

Again, a Tuxedo suite tool, Cufflinks, took the majority of votes (57.1%). Reasons for this included its ease of use, but many respondents appear to use it because it is the logical follow-on from Tophat in the Tuxedo workflow. The second-placed HTSeq-count appears to be in the ascendancy - many respondents appear to have been dissatisfied with Cufflinks and switched to HTSeq-count. This looks to be a good candidate to topple Cufflinks from the top in the near future. Other notable tools include easyRNASeq and RSEM. Also, many respondents use bedtools, samtools or in-house tools and custom scripts.

Q6 and 7. What is your preferred methodology to estimate differential expression? And the reasons why.

Finally, a non-Tuxedo suite tool wins the vote: DESeq/DEXSeq with 44.7%. CuffDiff is not too far behind (35.5%) and edgeR (19.7%) brings up the rear. Going by the comments, we might expect usage of DESeq and edgeR to increase at the expense of CuffDiff. Results from the latter have been variously described as weird, untrustworthy, and having too many false positives, among other problems.

Q8. Which annotation resource do you use?

Ensembl (46.6%) was the clear winner. Second and third places were closely contested between RefSeq (25.9%) and UCSC (22.4%).

Q9. What software do you use for downstream analyses?

GOSeq (68.9%) is clearly very widely used. Many respondents also use the commercial options Ingenuity IPA and GeneGo MetaCore. DAVID also received an honourable mention.

P.S. Please note: the percentages quoted relate to the number of people who answered that particular question. This varies widely across questions, from all 93 respondents for the first question to 45 for Q9. Please see the attached file for all details.
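To make the point about per-question denominators concrete, here is a minimal Python sketch that converts a quoted percentage back into an approximate head count. Only the two denominators stated above (93 for Q1, 45 for Q9) are used; the helper name `implied_count` is made up for illustration, and the denominators for the other questions are not given in this post, so they are left out.

```python
def implied_count(percent, n_respondents):
    """Return the whole number of respondents implied by a rounded percentage."""
    return round(percent / 100 * n_respondents)

# Denominators stated in the post: 93 answered Q1, 45 answered Q9.
Q1_RESPONDENTS = 93
Q9_RESPONDENTS = 45

genome_only = implied_count(47.3, Q1_RESPONDENTS)        # Q1: genome only
genome_and_tx = implied_count(39.8, Q1_RESPONDENTS)      # Q1: genome + transcriptome
goseq_users = implied_count(68.9, Q9_RESPONDENTS)        # Q9: GOSeq

print(genome_only, genome_and_tx, goseq_users)
```

The same percentage therefore represents very different numbers of people depending on the question: 68.9% of the Q9 respondents is fewer people than 47.3% of the Q1 respondents.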

Attached Files

RNA-Seq survey.pdf (325.0 KB, 216 views)

**ramma** · 12-04-2012, 11:38 AM

Thanks for posting the responses. I'm glad you left all the comments in too.

**Nicolas** · 12-04-2012, 11:49 AM

I'd love to know how many people replied Cufflinks for quantif and DESeq or edgeR for DE analysis...

**pbluescript** · 12-04-2012, 12:28 PM

Originally posted by Nicolas

I'd love to know how many people replied Cufflinks for quantif and DESeq or edgeR for DE analysis...

Might be a good way to QC the survey.

**bodhisattvax** · 12-05-2012, 01:47 AM

You're welcome ramma!

Nicolas - that's a good question; good enough for me to answer despite having to go through the data manually :-)

So after doing a quick, rough count, it looks like there were ~38 people who use Cufflinks for read quant and provided a specific answer for DE methods. Of these, ~14 used DESeq/edgeR; the majority of the rest used CuffDiff.

Interestingly I found at least two examples of people using HTSeq-count and then CuffDiff!
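For anyone wanting to repeat this kind of count without going through the data by hand, the cross-tabulation can be sketched in a few lines of Python. This is only an illustration: the `responses` list below is hypothetical toy data, not the survey export, and the function names are made up for this example.

```python
from collections import Counter

# Hypothetical toy responses as (quantification tool, DE tool) pairs.
# These are NOT the survey data; they only illustrate the counting.
responses = [
    ("Cufflinks", "CuffDiff"),
    ("Cufflinks", "DESeq"),
    ("HTSeq-count", "DESeq"),
    ("HTSeq-count", "edgeR"),
    ("Cufflinks", "CuffDiff"),
    ("HTSeq-count", "CuffDiff"),
]

def cross_tabulate(pairs):
    """Count how often each (quantification, DE) combination occurs."""
    return Counter(pairs)

def count_for(pairs, quant_tool, de_tools):
    """Count respondents who used quant_tool together with any tool in de_tools."""
    return sum(1 for quant, de in pairs if quant == quant_tool and de in de_tools)

table = cross_tabulate(responses)
cufflinks_with_count_based_de = count_for(responses, "Cufflinks", {"DESeq", "edgeR"})

for (quant, de), n in sorted(table.items()):
    print(f"{quant} -> {de}: {n}")
```

With the rough figures quoted above, ~14 of ~38 works out to a little over a third of the Cufflinks users who named a DE method.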

**jian_gao** · 06-12-2013, 09:06 AM

It is a good idea.

Survey: RNA-Seq analysis for Differential Gene/Transcript Expression

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Latest Articles

ad_right_rmr

News