Talk:Publication/Paper (NAR 2012)

August 15th 2011 deadline!

Countdown to the deadline: 15 August 2011, Oxford (UK) time
(1) Write/link focus review (maybe find something already discussed in the forum; that is still fine).
(2) Finalize the content of paper
(3) Polish the language and tone

I suggest we have a conversation on MSN Messenger or Skype. --Marcowanger 10:27, 8 August 2011 (PDT)

I am just listing who has contributed to this writing so far. Marco: [email protected], time zone GMT+8

I will begin emailing the people above within the next 12 hours.


Popularity of software

I noticed someone (probably Dan) has updated the method of measuring popularity. I will fix the main text tonight. Good. We have one less future direction now!
I will think about what we can improve further. --Marcowanger 04:00, 10 August 2011 (PDT)

Wikipedia issue

I received the following message when I edited Wikipedia. --Marcowanger 09:05, 4 August 2011 (PDT)
The page SEQanswers has been speedily deleted from Wikipedia. This has been done under the criteria for speedy deletion, because the page appeared to be blatant advertising which only promotes something, and which is unlikely to be suitable for an article (or at best would need a fundamental rewrite). Wikipedia is not a medium for promotion of anything, whether a company, product, group, service, person, religious or political belief, or anything else. Please read the general criteria for speedy deletion, particularly item G11, as well as the guidelines on spam. Feel free to leave a note on my talk page if you have any questions about this. NawlinWiki (talk) 16:03, 4 August 2011 (UTC)

This is my reply to NawlinWiki

  • SEQanswers page on wikipedia is NOT an advertisement or promotion page. SEQanswers is a community maintained, freely accessible forum for bioinformaticians in life science field. The forum does not require any fee for registration. SEQanswers ( is cited multiple times in Nature ( and other top scientific journals such as PLoS. Please allow me to create the page. Thanks. Marcowanger — Preceding unsigned comment added by Marcowanger (talk • contribs) 16:10, 4 August 2011 (UTC)
  • The article was written as if it was on SEQanswers' own website -- "We hope to become the central location for next generation sequencing technology" etc. Please review WP:NPOV. Also, you need to find reliable independent sources (see WP:V) that establish that this site is notable per WP:WEB. NawlinWiki (talk) 16:12, 4 August 2011 (UTC)
  • Understood. I just created the page for edit later. In view of this, will write the content with reference before upload. Thanks. Marcowanger

I recreated the page as follows--Marcowanger 09:43, 4 August 2011 (PDT)

This page is a draft page on SEQanswers and the SEQanswers wiki, written by an independent user of the forum. This page does not promote or advertise SEQanswers. I intend to write concise descriptions of SEQanswers. Formatting issues will be fixed ASAP.

SEQanswers and the SEQanswers wiki are a community-maintained forum and wiki dedicated to next-generation sequencing communities.

SEQanswers was founded in 2007 to bridge the gap between static peer-reviewed publications and the dynamic interactions between users and developers of software packages for next-generation sequencing technologies; it facilitates rapid dissemination of both wet-lab techniques and information regarding computational tools and analyses. The forum allows new tools, techniques and pipelines to be rapidly announced, tested and benchmarked within the active scientific community.

Within two years, SEQanswers has been cited multiple times by top scientific journals, including Nature and PLoS.

Year | Publication | Lead Author | Title
2008 | Nature Biotechnology | Shendure | Next-generation DNA sequencing
2009 | Briefings in Bioinformatics | Horner | Bioinformatics approaches for genomics and post genomics applications of next-generation sequencing
2009 | Nature Biotechnology | Trapnell | How to map billions of short reads onto genomes
2011 | Nature Reviews Genetics | Nielsen | Genotype and SNP calling from next-generation sequencing data
2011 | Nature Methods | Perkel | Coding your way out of a problem

some references omitted here

  • OK, the wikipedia page is pulled down again --Marcowanger 13:04, 4 August 2011 (PDT)
  • I suggest we copy the template from Wikipedia's GenBank page, modify the content and resubmit it to Wikipedia.


1a> I think we should compare SeqWiki to Wikigenes

1b> Maybe we can relate SEQwiki to the idea of PLoS ONE: while they envision a community dialogue to measure an individual paper's merit, SEQwiki can provide a similar effect, and can be much faster when working together with the SEQanswers forum. Most importantly, the user base here is already very large!

1c> Community-driven curation and forums are already widespread. Some prominent examples include OpenWetWare (widely used in iGEM), the BioTechniques forum, and Biostar (yet another great discussion hotspot).

2> One potential figure could be a plot of the surge of bioinformatics tools over these two years, possibly categorized by type of application, e.g. genome, transcriptome, etc. (see the 1st figure)

3> The wikification of GenBank faced resistance, but GenBank is an archive of sequences and annotations, which should not be taken lightly. SEQwiki, in contrast, is a semantic wiki that serves as a rapid reference for bioinformaticians, where speed outweighs accuracy. Readers can always refer to the papers for the respective tools, and meta-analyses of tools can easily be done or obtained from the SEQanswers community.


Future direction of SEQwiki

(1) Shall we add a community-driven review system for each tool? My personal comment is that bioinformatics analysts are very often faced with hype around new proprietary tools with FANCY names whose functions, at the end of the day, are limited and not applicable to real work.


I suggest changing one of the "unprecedented" to "unique" in the first sentence of the abstract. --Mmartin 05:06, 4 August 2011 (PDT)

Before seeing your comment, I changed the second "unprecedented" to "remarkable", i.e. "remarkable challenges". If "unique" is to replace one of the two "unprecedented", which one do you think should be replaced? --Marcowanger 07:48, 4 August 2011 (PDT)

"Remarkable" is ok, just wanted to eliminate the duplication. --Mmartin 07:53, 4 August 2011 (PDT)


The wiki refers to itself as "SEQanswers wiki", according to the main page. The term "SEQwiki" is only used on Publication/Poster (AGBT 2010). If no one objects, I'll change it in the text. --Mmartin 05:20, 4 August 2011 (PDT)

I concur. Using SEQanswers and SEQwiki makes people feel they are two separate things. In fact, they are both under SEQanswers. --Marcowanger 07:50, 4 August 2011 (PDT)

Great, I'll change the text. --Mmartin 07:53, 4 August 2011 (PDT)
And in fact, I think more people know about SEQanswers. When you say SEQanswers wiki, they already know what it should be. SEQwiki may seem a bit odd.--Marcowanger 07:55, 4 August 2011 (PDT)
Why not name it in full as SEQanswers wiki first and then abbreviate it as SEQwiki? SEQanswers wiki, if it appears repeatedly in the text, may seem awkward. --Marcowanger 07:57, 4 August 2011 (PDT)
Hm, that would still make it seem as if "SEQwiki" was an official name for it. Perhaps simply "wiki" is enough, if you want to avoid the repetition. Also, maybe it's a good idea to say "database" in at least some places. Although that is also not what we would usually call it, this is still the NAR database issue. --Mmartin 08:17, 4 August 2011 (PDT)
You are right --Marcowanger 08:26, 4 August 2011 (PDT)

To Do

  • Clean up the wiki (to reduce clutter in the screenshot)
Do you mean clean up the wiki (make better formatting in the wiki), then make another more concise screenshot?--Marcowanger 07:52, 4 August 2011 (PDT)
Yes! --Mmartin 07:54, 4 August 2011 (PDT)
  • Get the "Dummy application" removed from the list on the main page
Change it to "Enter the package name", or something like that. --Marcowanger 07:52, 4 August 2011 (PDT)
I contacted one of the admins about that.--Mmartin 07:54, 4 August 2011 (PDT)
Thanks --Marcowanger 07:58, 4 August 2011 (PDT)
Didn't get a response, but it's gone now. --Mmartin 07:49, 8 August 2011 (PDT)
  • Fix the HTTP 500 (Internal Server Error) issue
  • QC the whole Wiki, may need to distribute the work
    • Should we just go alphabetically? I had a quick look at the all-purpose genome assemblers already but a general QC would be a good idea --Usad 11:24, 7 August 2011 (PDT)


"Choices without clear attributes are mentally exhaustive rather than productive"

I think this could cite Barry Schwartz' "The paradox of choice", but I suppose that is not a peer-reviewed publication. --Mmartin 08:41, 4 August 2011 (PDT)

A book is okay, but surely we would prefer a peer-reviewed journal? In fact, I am asking one of my friends (a PhD student in psychology) to help me find a decent paper on this in a top psychology journal. I need one to two days. He is on vacation now. --Marcowanger 08:44, 4 August 2011 (PDT)

Do you have friends in the psychology field? We may need some experts to help us cite reputable papers supporting the claim that "more choices may not be good". --Marcowanger 08:51, 4 August 2011 (PDT)
I glanced at a review paper today on "choices". The paper says "choices may or may not exhaust mind"; it just depends on the stance of the researcher. So we ought to consult an expert on this issue. --Marcowanger 08:51, 4 August 2011 (PDT)

Now that I'm thinking about it, perhaps this needs to be reformulated: The current argument is that it's not good to have a lot of choice, but then there's also the sentence about how there are over 400 entries for tools in the wiki.
The argument is that choices without clear attributes are worthless. That is one of the purposes of this wiki: to classify tools and to provide a searchable platform for them. So personally I think the content is fine. Maybe the language and linking need to be clearer and more concise.--Marcowanger 10:39, 4 August 2011 (PDT)
  • Found an appropriate paper for "choices, motivation and productivity". Inserted in main text.--Marcowanger 08:22, 9 August 2011 (PDT)

Things to add

Do we still have any other big ideas we want to convey in this paper?--Marcowanger 08:45, 4 August 2011 (PDT)
Have we missed any critical thing?--Marcowanger 08:46, 4 August 2011 (PDT)
I'll read the entire text and will hopefully have some comments tomorrow. --Mmartin 10:23, 4 August 2011 (PDT)
Many thanks. --Marcowanger 10:40, 4 August 2011 (PDT)
Briefly discuss the differences between the roles of a primary database (GenBank) and a wiki (meta-database)
Is there a preferred section where we should add the micro comparisons and how-tos? I reckon the content section, as hardly anything is written there yet. --Usad 11:22, 7 August 2011 (PDT)
It would go to content. It would be nice to have something there. --Marcowanger 10:22, 8 August 2011 (PDT)

Abstract ('pre-submission' enquiry to NAR)

(moved from the main page --Mmartin 05:48, 5 August 2011 (PDT))
Note: This abstract was submitted as the 'pre-submission' enquiry to NAR. On the basis of this, a full submission was invited. Perhaps this can form the outline of the full article?

In recent years, dramatic advances in sequencing technology have created unprecedented opportunities for biological discovery. At the same time, however, this rapidly advancing and complex field has created unprecedented challenges for data management and analysis. As a consequence, the development of scientific software and computational methods in the field is outpacing the speed of peer-reviewed publication and other traditional forms of information sharing.

The SEQanswers forum was founded to address this gap; it facilitates the rapid dissemination of both wet-lab techniques and information regarding computational tools and analysis. The forum allows new tools, techniques and pipelines to be rapidly announced, tested and benchmarked within the community.

SEQwiki is a Semantic MediaWiki (SMW) site that is edited and updated by the members of the SEQanswers community. The wiki provides an extensive archive of categorized high-throughput sequencing analysis tools, technologies and providers.

Wiki pages provide structured data for each tool, including data types and formats, capabilities, and provenance details as well as catalogues of links to publications and online resources. Users contribute both structured data and free text comments using a combination of standard wiki and SMW data entry. A search tool provides a simple means for finding information about particular tools, and structured data can be queried and presented as reports within the wiki.

After two years, the SEQanswers community has created pages for over 400 unique software tools, with around 350 references and 500 web links. This effort has made the SEQwiki database the most comprehensive and detailed archive of high-throughput sequencing tools anywhere on the web. This community database is an invaluable resource for high-throughput sequencing.


(moved from main page)

SEQanswers Wiki: Community Curated Database for Next Generation Genomics --Marcowanger 10:11, 4 August 2011 (PDT)

NAR suggests the database name to be the first word of the title Database
Central Platform

my suggestions:

SEQanswers Wiki: A Community-Curated Database of Software for the Analysis of High-Throughput Sequencing Data (long)

SEQanswers Wiki: A Database of Software for the Analysis of High-Throughput Sequencing Data (shorter, the name "Wiki" implies that it's community-edited)

SEQanswers Wiki: A Catalogue of Tools for the Analysis of High-Throughput Sequencing Data --Mmartin 05:57, 5 August 2011 (PDT)

Since some packages are not about "Sequencing", I suggest using "High Throughput Data" only.--Marcowanger 22:58, 5 August 2011 (PDT)

You have a point, but "high throughput data" is probably too generic since there are lots of non-sequencing fields in which people use that term (e.g. particle physics, mass spectrometry). And the wiki itself does have the "seq" in its name! Perhaps the term "sequencing-related" is more correct? --Mmartin 03:01, 8 August 2011 (PDT)
"sequencing-related" is more concise, but we would also have to include genotyping-related or microarray tools, if we still care. --Marcowanger 10:19, 8 August 2011 (PDT)

Removed from main text

This databases is an invaluable resource for the scientific community.


I have read most of the text and here are some of my comments. I'm willing to make some of the suggested changes myself, but I won't be able to work on this over the weekend.

  • The abstract and the text itself should both stand on their own. It's ok when the introduction simply repeats some of the sentences from the abstract.

I think the abstract is too long, some of it should be moved to the introduction, while the introduction assumes that the abstract has been read.

I suggest either high-throughput sequencing or second-generation sequencing.

I do agree NGS is kind of an old term now. I prefer "high-throughput sequencing", if a change is needed. This makes the wiki sustainable. In fact, some software is for PacBio, the third generation. We should not limit ourselves to only "second". --Marcowanger 23:02, 5 August 2011 (PDT)
NGS in the text is now HTS, I will update the Image of HTS provider later. We will need to update the actual Wiki page for "Next Generation Sequencing Providers" to "HTS providers", and correct the terms elsewhere. --Marcowanger 23:15, 5 August 2011 (PDT)
Great! I also like HTS better and agree that it should be changed in the wiki. --Mmartin 03:02, 8 August 2011 (PDT)
  • There are lots of buzzwords in the text. Some parts read more like an advertisement. To a degree, this is ok, but some sentences could be "toned down".
  • More future directions:
    • automate software entry
    • Move the sentence about "implement a more reliable metric to measure popularity" here
moved.--Marcowanger 23:06, 5 August 2011 (PDT)
    • (perhaps: simplify/improve the interface)
  • Perhaps add a table with statistics? No. of visits/month, total no. of articles.
Could Dan or ECO kindly provide the data or point us to a page for it? Thanks. --Marcowanger 23:16, 5 August 2011 (PDT)

Overall, I think most of the content is there, perhaps even a bit more than necessary. I also think, however, that the text needs a little more structure. In my opinion, every section should answer a single question and every paragraph should contain a single statement.

I do agree. Until now I only wrote down what came to my mind, so as not to leave out any message we want to convey. To improve, we need better topic sentences. Maybe ask a question, then answer it by providing supporting evidence.--Marcowanger 23:23, 5 August 2011 (PDT)

NAR is an English journal, so it is better to write in British English. --Marcowanger 12:04, 8 August 2011 (PDT)

Important: Give supporting evidence for major claims, for example:
Packages were announced before publication
Ray (de novo assembler) would be another example --Usad 06:32, 10 August 2011 (PDT)
Packages were actively discussed and compared after publication
How users benefit from community discussion

I suggest adding screenshots and emailing users to ask for feedback, recommendations, or whatever they can offer. I am not sure whether these can go into a supplementary file. Are they official enough?--Marcowanger 19:19, 8 August 2011 (PDT)

I am unsure whether we have to go all the way. But a good thing could be to just add that as a special site on our wiki site and reference individual software packages from there. That would also
E.g. Ray

Potentially FastQC (when was that published?)

Maybe it would also be a good idea to link the seqanswers forum thread in the wiki? (currently there is a search function only)

I concur, it is a good way to harness the power of SEQanswers forum and avoid duplication of effort.

Description in accordance with BioDBcore standards

Moved from main text
See some examples.--Marcowanger 08:46, 9 August 2011 (PDT)

Future Directions



