BACKGROUND:
As most of you are aware, three large, international sequence repositories accept submissions of DNA sequence data: DDBJ [1], ENA Sequence [2] and GenBank [3]. These three databases are organized under the umbrella of the International Nucleotide Sequence Database Collaboration (INSDC), periodically synchronize their stored data, and strive to keep the data as widely accessible as possible.
QUESTION:
Apart from idiosyncrasies in their data submission routes, there should be little, if any, reason to prefer one database over another when submitting sequence data. Yet many researchers display very noticeable preferences, and when asked point blank, many colleagues cannot provide a technical reason for consistently submitting to a specific database.
Maybe researchers base their preference on geographic vicinity ("buy local"), but maybe there are genuine technical reasons for it. Can you think of any technical reason, however minute or seemingly insignificant, for preferentially submitting DNA sequences to NCBI's *GenBank* over *ENA Sequence* (or vice versa)? For example, there may be differences in data storage or accessibility that are relevant to you.
Put differently: Short of flipping a coin, why would you (as a bioinformatics-prone end user) select one database over the other for your data submission?
Edit 1: Some respondents have tried to reassure me that I can submit my DNA sequences to either database and that "the data will be just fine there". I have submitted data to both GenBank and ENA Sequence for 10+ years myself, so this is not what this question is about. Instead, this question is about carving out genuine technical differences from a user perspective.
REFERENCES:
[1] https://www.ddbj.nig.ac.jp/index-e.html
[2] https://www.ebi.ac.uk/ena
[3] https://www.ncbi.nlm.nih.gov/genbank/