  • RNA-seq analysis raw data format

    Hello,

    I am relatively new to sequencing. Can anybody guide me through the basics and tell me what the raw data format for RNA-seq analysis is? Is it BED (.bed) files or FASTQ files? And how can I find out which datasets in GEO or ArrayExpress contain RNA-seq data?

  • #2
    In general, it's FASTQ format.


    • #3
      For GEO, you can find RNA-seq experiments by filtering on "series type":

      Gene Expression Omnibus (GEO) is a database repository of high throughput gene expression data and hybridization arrays, chips, microarrays.


      On ArrayExpress, select "RNA assay" and "high-throughput sequencing" from two of the menus.
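The GEO "series type" filter mentioned above can also be expressed as a search term. Here is a minimal sketch that only builds a search URL for the NCBI E-utilities `esearch` endpoint against the GEO DataSets database (`db=gds`). The function name `geo_rnaseq_search_url` and its parameters are made up for illustration, and the field tags (`[DataSet Type]`, `[Entry Type]`, `[Organism]`) are written as I understand the GEO query syntax, so check them against the GEO documentation before relying on this.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def geo_rnaseq_search_url(organism=None, retmax=20):
    """Build (but do not send) a GEO DataSets query for RNA-seq series.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Restrict to series (GSE) typed as high-throughput sequencing expression profiling.
    term = (
        '"expression profiling by high throughput sequencing"[DataSet Type]'
        " AND gse[Entry Type]"
    )
    if organism:
        term += f' AND "{organism}"[Organism]'
    params = {"db": "gds", "term": term, "retmax": retmax}
    return f"{EUTILS_ESEARCH}?{urlencode(params)}"


if __name__ == "__main__":
    # Paste the printed URL into a browser, or fetch it with any HTTP client.
    print(geo_rnaseq_search_url("Homo sapiens", retmax=5))
```

The response is XML listing matching record IDs. Note that this series type also covers other sequencing-based expression assays, so the hits still need a manual look.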


      • #4
        Thanks a lot for your reply. I will try searching.


        • #5
          I am actually having a problem. I opened this page from GEO:
          NCBI's Gene Expression Omnibus (GEO) is a public archive and resource for gene expression data.

          This is an RNA-seq analysis for sure. If I scroll down to the files attached to this page, I get .txt files and also files labeled as belonging to an SRA study. How should I get the data from this page? Can the data be in .txt format as well? I also want to know where we can use the data from the SRA study.


          • #6
            For an introduction to RNA-seq, see: http://seqanswers.com/wiki/How-to/RNASeq_analysis

            Also, FASTQ files are sometimes given a .txt ending, because Windows users otherwise won't recognize them as text files. So these may be FASTQ files despite the .txt ending. (I haven't looked into these files; they may be something else as well.)
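One way to check whether a .txt file is really FASTQ is to look at its first few records. The sketch below assumes the common four-line FASTQ layout (header starting with `@`, sequence, separator starting with `+`, quality string of the same length as the sequence). It will reject the older wrapped multi-line variant. The function name `looks_like_fastq` is made up for illustration.

```python
import gzip


def looks_like_fastq(path, records=10):
    """Return True if the first few records follow the four-line FASTQ layout."""
    # Transparently handle gzip-compressed files, judged by file extension only.
    opener = gzip.open if str(path).endswith(".gz") else open
    seen = 0
    with opener(path, "rt") as handle:
        while seen < records:
            header = handle.readline()
            if not header:
                break  # end of file
            seq = handle.readline().rstrip("\r\n")
            plus = handle.readline()
            qual = handle.readline().rstrip("\r\n")
            if not (header.startswith("@") and plus.startswith("+")):
                return False
            # Sequence and quality strings must be non-empty and equally long.
            if not seq or len(seq) != len(qual):
                return False
            seen += 1
    return seen > 0
```

A tab-delimited table of counts or coordinates will fail on the first record, which is usually enough to tell processed data apart from raw reads.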


            • #7
              If you click on one of the samples (e.g. going to here) and look in the "Data processing" section, they mention the file type and where to find the actual specification for it. The SRA files can be converted to FASTQ format with the SRA Toolkit. I should note that whether you actually want to redo the alignment yourself (i.e. downloading the SRA files, converting them to FASTQ, and aligning with TopHat or whatever) or directly use the prealigned files depends a bit on what your goals are.

              BTW, if you need the reads aligned to hg19 instead of hg18, you can google for the very useful "liftOver" tool.
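The two command-line steps mentioned above could look roughly like this. This is only a sketch: it assumes the SRA Toolkit's `fasterq-dump` and UCSC's `liftOver` are installed and on your PATH, the accession `SRR0000000` and all file names are placeholders, and the helper names are made up. Note that `liftOver` converts coordinate intervals (e.g. BED files) using a chain file; it does not realign reads.

```python
import shutil
import subprocess


def sra_to_fastq_cmd(accession, outdir="fastq"):
    """Command to convert an SRA run to FASTQ, one file per read mate."""
    return ["fasterq-dump", "--split-files", "--outdir", outdir, accession]


def liftover_cmd(in_bed, chain, out_bed, unmapped_bed):
    """Command to lift BED intervals between assemblies using a chain file."""
    # Argument order: input, chain file, output, file for unmapped intervals.
    return ["liftOver", in_bed, chain, out_bed, unmapped_bed]


def run(cmd):
    """Run a command, failing early if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder accession and file names; replace with your own.
    run(sra_to_fastq_cmd("SRR0000000"))
    run(liftover_cmd("hg18.bed", "hg18ToHg19.over.chain.gz",
                     "hg19.bed", "unmapped.bed"))
```

Building the commands as lists (rather than shell strings) avoids quoting problems with unusual file names.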


              • #8
                Thanks a ton for your replies, dpryan and peter. Your links are really helpful. One more question now arises: is there a way to get prealigned files? Until now I presumed that we get only raw data from GEO and AE, and that we have to align it ourselves before processing further. Can we get SAM and BAM files as well?


                • #9
                  Unfortunately I don't think there's a single answer to that question that applies to all datasets. I've used a number of datasets that provided prealigned BED or similar files, but that's certainly not the case with all of them.
