  • Publicly available NGS data?

    Hello,

    I am looking for NGS data, preferably RNA-Seq, with a large number of samples. But all the datasets that are available (or that I have been able to find) have very few samples, e.g. 3 or 4.

    Does anyone know of a publicly available NGS dataset with many samples from two different conditions, e.g. tumor vs. normal? Or any other two-category data?

    Any help is appreciated.

  • #2
    Yes

    I have seen many papers like that: different conditions, different populations of the same species, and so on.
    Here I have attached a link to a paper on sequencing a whitefly. Sorry, it is not human, but it is the same idea: one population is pesticide resistant and one is susceptible.

    Background: The whitefly Trialeurodes vaporariorum is an economically important crop pest in temperate regions that has developed resistance to most classes of insecticides. However, the molecular mechanisms underlying resistance have not been characterised and, to date, progress has been hampered by a lack of nucleotide sequence data for this species. Here, we use pyrosequencing on the Roche 454-FLX platform to produce a substantial and annotated EST dataset. This 'unigene set' will form a critical reference point for quantitation of over-expressed messages via digital transcriptomics.

    Results: Pyrosequencing produced around a million sequencing reads that assembled into 54,748 contigs, with an average length of 965 bp, representing a dramatic expansion of existing cDNA sequences available for T. vaporariorum (only 43 entries in GenBank at the time of this publication). BLAST searching of non-redundant databases returned 20,333 significant matches and those gene families potentially encoding gene products involved in insecticide resistance were manually curated and annotated. These include, enzymes potentially involved in the detoxification of xenobiotics and those encoding the targets of the major chemical classes of insecticides. A total of 57 P450s, 17 GSTs and 27 CCEs were identified along with 30 contigs encoding the target proteins of six different insecticide classes.

    Conclusion: Here, we have developed new transcriptomic resources for T. vaporariorum. These include a substantial and annotated EST dataset that will serve the community studying this important crop pest and will elucidate further the molecular mechanisms underlying insecticide resistance.


    Hope this helps!


    • #3
      Thank you Daisy-Fu

      I haven't read the whole paper yet, but the “Methods” section says “More than 2,000 adults of each strain were collected in two separate 2 ml Eppendorf tubes and flash frozen in liquid nitrogen.”

      Considering that “A single full plate run” was done, do you know whether this means that sequence data are available for 2,000 insects per condition, or whether some samples were selected from the entire pool?


      • #4
        I am still looking for publicly available NGS data with a reasonably large number of samples. Samples from cancer and tumor tissues would be ideal. With barcoding it is possible to run many samples in one flow cell, but I still cannot find data that provide a separate sequence dataset for each sample.

        I appreciate any help/hints/comments on this.

        Daisy-Fu,

        About the whitefly paper: I read it, and it generates two sequence read datasets, one for the insecticide-susceptible standard strain (TV1) and another for the resistant strain from Turkey (TV6). More than 2,000 adults of each strain are pooled to generate enough material for sequencing. A great deal of work went into it and it is very impressive, but from the point of view of my project, the final experiment produces only two datasets, one per strain. Thank you for letting me know about the paper, but I am afraid I cannot use it in my project.


        • #5
          Hey,

          Try the NCBI SRA under:


          SRA stands for Sequence Read Archive (formerly the Short Read Archive), and it is not limited to Solexa/ABI sequencing. You can simply search for tissues or experiments and will receive files in SRA format. They also provide a tool, the SRA Toolkit, to convert the SRA files you download into FASTQ, FASTA, SFF and similar formats. Hope that helps,


          best

          Philip
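
          For anyone who has not used the SRA Toolkit yet, here is a minimal sketch of the conversion step mentioned above. It only builds the `fastq-dump` command line and does not run it. The function name `build_fastq_dump_command`, the output directory `fastq` and the accession `SRR000001` are placeholders chosen for illustration, not something from this thread, and the exact options you need may differ with your toolkit version and library layout.

```python
from typing import List


def build_fastq_dump_command(accession: str, outdir: str = "fastq") -> List[str]:
    """Return the argument list for converting one SRA run to FASTQ.

    `accession` is an SRA run ID; `outdir` is a hypothetical output folder.
    """
    run_id = accession.strip()
    if not run_id:
        raise ValueError("accession must be a non-empty run ID")
    return [
        "fastq-dump",      # converter shipped with the SRA Toolkit
        "--split-files",   # write paired reads to separate files
        "--gzip",          # compress the FASTQ output
        "--outdir", outdir,
        run_id,
    ]


if __name__ == "__main__":
    # Placeholder accession; substitute the run IDs of your own samples.
    print(" ".join(build_fastq_dump_command("SRR000001")))
```

          With the toolkit installed, the returned list could be passed to `subprocess.run(cmd, check=True)`, once per run ID, to get one set of FASTQ files per sample.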


          • #6
            Hey Philip,

            Thank you for your help and reply!

            I am using SRA. I have already found a couple of datasets and am working on them. The only issue is that the number of samples is very small, usually fewer than 10 in both classes. I am working on a dataset by Dr. T. Wu, which is one of the best ones since it has tumor and adjacent normal tissues. However, it still does not have many samples: 3 for each condition.

            I hoped that by posting here I might hear back from someone who knows of a study with many samples.

            The Cancer Genome Atlas (TCGA) has many samples (based on the records that are publicly visible), but one needs special permission to access that data.


            • #7
              Hey,

              I guess you will hardly find free datasets containing more than three runs for a specific tissue. Such enormous projects usually don't make their data freely available. Nevertheless, I wish you the best of luck in finding some!


              If I find anything in the near future I will contact you, if that would be appreciated!


              • #8
                Thanks, Philip. I would surely appreciate any help, hints or suggestions.


                • #9
                  You may be interested in the ENCODE project, which provides RNA-seq data for plenty of conditions (cell lines, cell compartments, ...). The data can be browsed in the UCSC Table Browser (the expression group offers several RNA-seq tracks). To download, look for the FTP site. You can also try modENCODE.

                  Not sure about the number of replicates, though.
                  Please keep us posted.


                  • #10
                    Originally posted by tldgID
                    I am working on a dataset by Dr. T. Wu which is one of the best ones, since it has tumor and adjacent normal tissues.
                    BTW, which one is that? Could you post the ID or ref please?


                    • #11
                      Hi Steven,

                      Thanks! I'll check it out.

                      Here is the link to the prostate cancer (and normal) samples:
