  • SRA to fastq conversion with fastq-dump loses sequences

    Hello,

    I converted an SRA archive (ftp://ftp-trace.ncbi.nlm.nih.gov/sra...953/SRR073769/) to fastq with the fastq-dump program (sratoolkit-2.1.6). The resulting fastq file had ~160,000 fewer sequences than expected (2% of the total number of spots). Why does this occur?

    Thank you,

    Paul

  • #2
    I've also experienced this problem. Did you find a solution?

    Thank you,

    Elizabeth



    • #3
      How do you know the expected number of sequences?



      • #4
        I am seeing the same number of sequences in the file I downloaded as reported on the SRA page.

        Code:
        ../sratoolkit.2.1.16-centos_linux64/bin/fastq-dump.2.1.18 SRR073769.sra 
        Written 8175900 spots for SRR073769.sra
        Written 8175900 spots total
        Code:
        $ grep -c "^@SRR073769" SRR073769.fastq
        8175900
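
As a cross-check on the count above, here is a minimal sketch that counts records by line number instead of by header pattern. It assumes the usual four-line FASTQ layout (header, sequence, "+", quality) with unwrapped sequence and quality lines. The function name `count_fastq_records` and the two demo reads are made up for illustration.

```shell
# Count FASTQ records as (number of lines) / 4.
# Assumption: four lines per record, no wrapped sequence or quality lines.
count_fastq_records() {
    awk 'END {
        if (NR % 4 != 0) {
            print "line count is not a multiple of 4" > "/dev/stderr"
            exit 1
        }
        printf "%d\n", NR / 4
    }' "$1"
}

# Demo with two made-up reads. The first quality string starts with "@",
# which is legal in FASTQ and is why a bare search for "@" at the start
# of a line can overcount.
demo=$(mktemp)
printf '@read1\nACGT\n+\n@III\n@read2\nGGCC\n+\nIIII\n' > "$demo"
count_fastq_records "$demo"   # prints 2 (three lines start with "@")
rm -f "$demo"
```

For a file such as SRR073769.fastq, the result can be compared against the "Written ... spots" line that fastq-dump prints.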



        • #5
          Oh, I see.

          I'm still a little confused about something:

          This file, SRR035116.sra, for example, is 3.9Gb
          When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

          Usually when I convert sra to fastq, my files get a lot bigger. Help?

          Thank you!!



          • #6
            The .sra file is ~383 Mb and the .fastq file is 1.6 Gb (on my filesystem). If your .sra file is truly that large then something must be wrong.

            Use the aspera client that SRA provides to download the .sra file.


            Originally posted by eeh_021

            I'm still a little confused about something:

            This file, SRR035116.sra, for example, is 3.9Gb
            When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

            Usually when I convert sra to fastq, my files get a lot bigger. Help?

            Thank you!!



            • #7
              383Mb?

              If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...



              • #8
                Originally posted by eeh_021
                383Mb?

                If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...
                Perhaps it is more straightforward to fetch it from Europe or Japan.

                Compressed files (.fastq.gz or .fastq.bz2) are just easier to use than those .sra files.


                Sébastien Boisvert



                • #9
                  We are talking about two different data sets.

                  My response was for the dataset (SRR073769) that was in pcantalupo's original post.

                  The dataset you are referring to below is indeed 3.9 Gb.


                  Originally posted by eeh_021
                  383Mb?

                  If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...



                  • #10
                    Originally posted by eeh_021
                    Oh, I see.

                    I'm still a little confused about something:

                    This file, SRR035116.sra, for example, is 3.9Gb
                    When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

                    Usually when I convert sra to fastq, my files get a lot bigger. Help?

                    Thank you!!
                    I don't think that's a problem if the fastq file gets bigger because the sra file is in binary anyway, which is more compact.



                    • #11
                      How do I convert an SRA file to FASTQ?



                      • #12
                        Originally posted by alireda82
                        How do I convert an SRA file to FASTQ?
                        Use SRA toolkit: http://eutils.ncbi.nih.gov/Traces/sr...lkit_doc&f=std
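
For reference, a minimal sketch of the conversion step, using the same plain `fastq-dump <file>.sra` invocation shown earlier in this thread. It assumes the SRA Toolkit's `fastq-dump` is already installed and on your PATH. The wrapper name `sra_to_fastq` is hypothetical and only adds a check that the tool can be found.

```shell
# Convert one .sra file to FASTQ in the current directory.
# Assumption: fastq-dump (from the SRA Toolkit) is on PATH.
# sra_to_fastq is a hypothetical helper name, not part of the toolkit.
sra_to_fastq() {
    sra_file=$1
    if ! command -v fastq-dump >/dev/null 2>&1; then
        echo "fastq-dump not found on PATH" >&2
        return 127
    fi
    # Plain invocation, as used earlier in the thread.
    fastq-dump "$sra_file"
}

# Usage (file name taken from the original post):
#   sra_to_fastq SRR073769.sra
```

In the run posted above, this kind of invocation reported the number of spots written and produced SRR073769.fastq.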



                        • #13
                          Hi everyone, I'm new here!
                          Can someone tell me if it's possible to convert a WIG file to FASTQ? Thanks in advance.



                          • #14
                            No, WIG files do not contain the sequences, just the coverage.
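
To illustrate, here is a small sketch of what a fixedStep WIG file holds: a declaration line followed by one numeric value per position, with no bases or qualities anywhere. The chromosome name, coordinates, and values are invented for the example, and `sum_wig_values` is a hypothetical helper that handles only single-column (fixedStep-style) data lines.

```shell
# Sum the numeric data lines of a fixedStep-style WIG file.
# Declaration lines have several fields, so NF == 1 skips them.
sum_wig_values() {
    awk 'NF == 1 && $1 ~ /^-?[0-9.]+$/ { total += $1 } END { print total + 0 }' "$1"
}

# Made-up example: coverage of 3, 5 and 4 at positions 101-103 of chr1.
wig=$(mktemp)
cat > "$wig" <<'EOF'
fixedStep chrom=chr1 start=101 step=1
3
5
4
EOF
sum_wig_values "$wig"   # prints 12
rm -f "$wig"
```

Everything in the file is either a position declaration or a number, so there is nothing from which reads could be rebuilt.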
