  • Pulling Data from NCBI Based on Simple Criteria

    I have spent a good hour reading about NCBI BioSamples and BioProjects and using their searches to find 16S rRNA gene amplicon sequencing datasets whose metadata describe "polar" and "marine" environments.

    I'm posting this thread because I have had very little success finding useful datasets. I expected my search criteria to be simple enough to generate a decent list of datasets, but I've had to pick manually through a very non-specific list of hits.

    Can someone suggest a good workflow for this, or point me to a good walk-through? Surely this kind of basic data retrieval is common practice and should be easier than I'm finding it... right?

    Thanks in advance,
    Roli

  • #2
    I am going to hazard a guess that you can only find what is there in the first place. Sounds like sequence submitters may not be doing a good job of submitting adequate metadata.

    That said, I recently found an R-based package to search SRA metadata and posted it in one of the threads. You can search the forum for it or I can look for that thread. Perhaps that may help.

    Here is that post: http://seqanswers.com/forums/showpos...45&postcount=9
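
For anyone who would rather script the search than click through the web forms, here is a minimal sketch (not the R package mentioned above) that builds a query for NCBI's E-utilities ESearch endpoint against the SRA database. The helper names `build_sra_term` and `build_esearch_url` are made up for illustration, and the field tags `[All Fields]` and `[Strategy]` are my assumption about what the SRA search accepts, so check them in the SRA Advanced Search builder first. It will still only find what submitters actually annotated.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the NCBI E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_sra_term(keywords, marker="16S", strategy="amplicon"):
    """Combine free-text keywords with a library-strategy filter.

    Hypothetical helper: the field tags used here are assumptions and
    should be verified against the SRA Advanced Search builder.
    """
    parts = [f"{kw}[All Fields]" for kw in keywords]
    parts.append(f"{marker}[All Fields]")
    parts.append(f"{strategy}[Strategy]")
    return " AND ".join(parts)


def build_esearch_url(term, db="sra", retmax=100):
    """Return an ESearch URL asking for up to `retmax` record IDs as JSON."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


term = build_sra_term(["polar", "marine"])
url = build_esearch_url(term)
print(term)
print(url)
```

Fetching that URL (for example with `urllib.request.urlopen`) should return JSON whose `esearchresult` object holds the hit count and a list of record IDs. Expect to filter the hits by hand afterwards, since free-text keywords like "polar" match any field.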
    Last edited by GenoMax; 11-13-2015, 06:06 PM.


    • #3
      @GenoMax's answer is certainly more useful than mine.

      I think it's a pipe dream to expect that you will find all the datasets neatly organized on NCBI. You might be better off just using Google to find the articles published by the studies, and then locating the datasets from there.

      Inevitably, the raw datasets will be hard to locate and incomplete. The file formats will differ from one study to another. The library preparation protocols and data processing steps will also vary between studies, and will be poorly documented.

      I submit data to NCBI myself, albeit related to human health, and I can tell you that it's a mess. Even today, there is no consensus on what data to submit, or in what format. You can imagine how it was for datasets collected a few years ago, at the dawn of next-generation sequencing.

      Researchers are mainly interested in getting their paper published. Since most journals now require that the dataset be uploaded to NCBI, researchers will do so. However, providing the data to the public in a neat and organized manner is not a major preoccupation, and even for a conscientious researcher, it's not always clear in what format the data should be uploaded or with what accompanying information.
      Last edited by blancha; 11-13-2015, 07:39 PM.