
  • Why is the growth rate of the SRA decreasing?

    Growth of the short read archive at EMBL appears to be plateauing:



    That is, the doubling time is trending upwards:



    This growth is much slower than that of raw megabases/$, which has a doubling time of around 6 months.

    Is some other archive for raw data being used? Or is the raw data simply not being submitted to archives any longer?

    --
    Phillip

  • #2
    the scale is logarithmic

    Comment


    • #3
      People are hitting ISP data caps trying to upload data

      Comment


      • #4
        I think people are slowing down a little on generating data after they realized how much it takes to analyze it.

        Or maybe just not sharing.

        Comment


        • #5
          Originally posted by NicoBxl:
          the scale is logarithmic
          Yeah, I know. But I would expect the doubling time to be similar to the doubling time for megabases/$, currently about 6 months. Instead it appears to be at 14 months and is trending upwards.
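To put the gap between those two doubling times in numbers, here is a minimal sketch, assuming steady exponential growth. The function names, the sample archive sizes, and the 42-month window are all hypothetical and chosen only for illustration; they are not taken from the actual ENA/SRA statistics.

```python
import math


def doubling_time(size_start, size_end, elapsed_months):
    """Doubling time in months implied by two size measurements,
    assuming steady exponential growth between them."""
    if size_start <= 0 or size_end <= size_start or elapsed_months <= 0:
        raise ValueError("need positive sizes, growth, and a positive interval")
    return elapsed_months * math.log(2) / math.log(size_end / size_start)


def growth_factor(doubling_months, elapsed_months):
    """Fold increase after elapsed_months at a fixed doubling time."""
    return 2 ** (elapsed_months / doubling_months)


# Hypothetical example: an archive that grows from 100 to 400 units
# in 12 months has doubled twice, so its doubling time is 6 months.
example_doubling = doubling_time(100, 400, 12)

# Illustrative 42-month window (an assumption, not data from the thread):
fold_at_6_months = growth_factor(6, 42)    # 7 doublings
fold_at_14_months = growth_factor(14, 42)  # 3 doublings

print(f"doubling time: {example_doubling:.1f} months")
print(f"6-month doubling over 42 months: {fold_at_6_months:.0f}x")
print(f"14-month doubling over 42 months: {fold_at_14_months:.0f}x")
```

On a logarithmic plot, a constant doubling time shows up as a straight line, so a curve that bends toward the horizontal means the doubling time is getting longer.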

          --
          Phillip

          Comment


          • #6
            Originally posted by GenoMax:
            People are hitting ISP data caps trying to upload data
            Or their institute's bandwidth isn't up to it.

            Some of our local sequencing providers can submit directly to the ENA/SRA on your behalf. The only tricky bit is providing the metadata.

            Comment


            • #7
              Perhaps people are realizing there isn't sufficient value in archiving every scrap of raw sequence data produced to justify the cost. I think there is an argument to be made that as the cost for each Gbp of sequence decreases, so does the value. Not long ago, when producing even a Mbp of sequence meant a substantial investment in both dollars and person-hours, you made sure that every bp of DNA you sequenced was meaningful and protected that investment by having your data safely stored for posterity. Now one can produce hundreds of Gbp for orders of magnitude less effort and money, so researchers are somewhat less choosy about what and how much they sequence.

              Let's be honest, how much raw sequence is ever downloaded from the ENA or SRA for research purposes? I agree with the NCBI's current stance on submission of raw sequence to the SRA. They will accept submissions of raw sequence that are directly reported on in a publication or that correlate to an analyzed data set in some other repository at NCBI (e.g. GEO, Genome, etc.).

              Comment


              • #8
                We have been sequencing like this for three to four years (see the jump in 2008), and that's about as long as most PhDs and post-docs work on a project before moving on. Maybe everyone is enjoying a long summer after a crazy time in the lab and before writing all this data up!

                Comment


                • #9
                  A growing portion of sequencing capacity is occupied by human disease studies (cancer, diabetes, etc.) and private medical/pharma sequencing. The former is not exchanged between archives due to differences in privacy laws; the latter stays private.

                  Comment


                  • #10
                    "differences in privacy laws".

                    How come there aren't many public cancer data sets?

                    Are there laws preventing people from making their genome public? I imagine the motivation to help others suffering from a disease that is killing them might be pretty strong.

                    If ethics is the problem, perhaps the ethics need to be overhauled. A thousand eyeballs looking at some of the problems in cancer might bring a lot of solutions, particularly if patients are willing to let their genomes out.

                    Comment


                    • #11
                      A researcher can get access to cancer data through an application process, where he/she is effectively promising not to use it for non-consented research. The data is public, but with consent-based limitations.
                      What I was pointing out is that the data is not exchanged between archives due to different application processes, which are in turn due to differences in privacy laws. So ENA does not count NCBI cancer data and vice versa. As a result, it is hard to calculate how much data is currently produced and archived.

                      Comment


                      • #12
                        People just aren't submitting to the SRA because it's a pain in the ass, honestly. Sequences are being generated in such huge volumes and at such speed that I think it's hard for users to keep up with submissions.

                        Comment


                        • #13
                          We saw the same trend (slowing down of SRA growth) -


                          There are three possibilities -

                          i) The exponential spike was due to US stimulus spending. Now we are seeing Tea Party decline.

                          ii) SRA scared people about shutting down early last year, and that may have forced some to change submission style.

                          iii) Everyone has too much data and they are down to analysis.
                          http://homolog.us

                          Comment
