  • SRA to fastq conversion with fastq-dump loses sequences

    Hello,

    I converted an SRA archive (ftp://ftp-trace.ncbi.nlm.nih.gov/sra...953/SRR073769/) to fastq with the fastq-dump program (sratoolkit-2.1.6). The resulting fastq file had ~160,000 fewer sequences (2% of the total number of spots) than expected. Why does this occur?

    Thank you,

    Paul

  • #2
    I've also experienced this problem. Did you find a solution?

    Thank you,

    Elizabeth


    • #3
      How do you know the expected number of sequences?


      • #4
        In the file I downloaded, I am seeing the same number of sequences as reported on the SRA page.

        Code:
        ../sratoolkit.2.1.16-centos_linux64/bin/fastq-dump.2.1.18 SRR073769.sra 
        Written 8175900 spots for SRR073769.sra
        Written 8175900 spots total
        Code:
        $ more SRR073769.fastq | grep "@SRR073769" | wc -l
        8175900
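
A possible cross-check on the count above, sketched in Python. It assumes the output is plain four-line FASTQ (no wrapped sequence lines) and counts records by line count instead of by header text, so a quality line that happens to start with "@" is not counted as a record. The function names and the tiny demo records are made up for illustration.

```python
import os
import tempfile


def count_fastq_records(path):
    """Count records in a 4-line-per-record FASTQ file by counting lines."""
    with open(path, "rb") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4


def count_lines_starting_with(path, prefix):
    """Naive count of lines beginning with a prefix (similar in spirit to grep)."""
    with open(path, "r") as handle:
        return sum(1 for line in handle if line.startswith(prefix))


# Made-up two-record file; the second quality line starts with "@".
demo = (
    "@SRR000000.1 1 length=4\nACGT\n+\nIIII\n"
    "@SRR000000.2 2 length=4\nTTGA\n+\n@III\n"
)
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write(demo)

print("records by line count:", count_fastq_records(tmp.name))
print("lines starting with '@':", count_lines_starting_with(tmp.name, "@"))
os.remove(tmp.name)
```

Matching on the full accession prefix, as in the grep command above, avoids most of this ambiguity; the line-count approach is just an independent check against the "Written ... spots" figure.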


        • #5
          Oh, I see.

          I'm still a little confused about something:

          This file, SRR035116.sra, for example, is 3.9Gb
          When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

          Usually when I convert sra to fastq, my files get a lot bigger. Help?

          Thank you!!


          • #6
            The .sra file is ~383 MB and the .fastq file is 1.6 GB (on my filesystem). If your .sra file is truly that large, then something must be wrong.

            Use the Aspera client that SRA provides to download the .sra file.


            Originally posted by eeh_021

            I'm still a little confused about something:

            This file, SRR035116.sra, for example, is 3.9Gb
            When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

            Usually when I convert sra to fastq, my files get a lot bigger. Help?

            Thank you!!


            • #7
              383Mb?

              If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...


              • #8
                Originally posted by eeh_021
                383Mb?

                If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...

                Perhaps it is more straightforward to fetch it from Europe or Japan.

                Compressed files (.fastq.gz or .fastq.bz2) are just easier to use than those .sra files.


                Sébastien Boisvert


                • #9
                  We are talking about two different datasets.

                  My response was for the dataset (SRR073769) in pcantalupo's original post.

                  The dataset you are referring to below is indeed 3.9 Gb.


                  Originally posted by eeh_021
                  383Mb?

                  If you go there, it is supposed to be 3.9Gb, and that's about how big it is when I download it...


                  • #10
                    Originally posted by eeh_021
                    Oh, I see.

                    I'm still a little confused about something:

                    This file, SRR035116.sra, for example, is 3.9Gb
                    When I convert it to fastq, however, it is only 2.2Gb. I checked the number of spots, and it's, surprisingly, the same.

                    Usually when I convert sra to fastq, my files get a lot bigger. Help?

                    Thank you!!

                    I don't think it's a problem if the fastq file gets bigger, because the sra file is binary anyway, which is more compact.
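
As a rough sanity check on file sizes, here is a back-of-the-envelope sketch for the uncompressed size of a FASTQ file. It assumes single fixed-length reads, one-byte newlines, and headers of roughly constant length; the function name, its parameters, and the numbers in the demo are illustrative and not taken from either dataset in this thread. Some fastq-dump versions repeat the header on the "+" line, which the optional flag accounts for.

```python
def estimate_fastq_bytes(n_spots, read_length, header_length,
                         repeat_header_on_plus=False):
    """Estimate uncompressed FASTQ size in bytes.

    header_length is the length of the "@..." header line without its newline.
    """
    header_line = header_length + 1            # header + newline
    sequence_line = read_length + 1            # bases + newline
    # "+" line: either just "+\n" or the header repeated with "+" in place of "@"
    plus_line = header_length + 1 if repeat_header_on_plus else 2
    quality_line = read_length + 1             # one quality char per base + newline
    per_record = header_line + sequence_line + plus_line + quality_line
    return n_spots * per_record


# Hypothetical numbers, for illustration only.
size = estimate_fastq_bytes(n_spots=1_000_000, read_length=36, header_length=30)
print(f"{size / 1e6:.0f} MB")
```

If the actual fastq file is far smaller than an estimate like this, it may be worth checking whether every read of each spot was written out.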


                    • #11
                      How do I convert an SRA file to FASTQ?


                      • #12
                        Originally posted by alireda82
                        how to convert SRA file to FASTQ?

                        Use SRA toolkit: http://eutils.ncbi.nih.gov/Traces/sr...lkit_doc&f=std
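
A minimal sketch of driving the conversion from Python, assuming the toolkit's fastq-dump executable is on your PATH (earlier in this thread it was called by its full path instead). The helper names are hypothetical. The --split-files option, which writes each read of a spot to its own file, may not be available in every toolkit version, so check `fastq-dump --help` first.

```python
import shlex
import subprocess


def build_fastq_dump_command(sra_path, executable="fastq-dump", split_files=False):
    """Build the argument list for a fastq-dump call (does not run it)."""
    cmd = [executable]
    if split_files:
        # Assumption: this option exists in your installed toolkit version.
        cmd.append("--split-files")
    cmd.append(sra_path)
    return cmd


def convert(sra_path, **kwargs):
    """Run fastq-dump and return its standard output (e.g. 'Written N spots')."""
    cmd = build_fastq_dump_command(sra_path, **kwargs)
    result = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return result.stdout


# Show the command that would be run, without executing it.
print(shlex.join(build_fastq_dump_command("SRR073769.sra")))
```

Calling `convert("SRR073769.sra")` would then run the tool and return its report of how many spots were written.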


                        • #13
                          Hi everyone, I'm new here!
                          Can someone tell me if it's possible to convert a WIG file to FASTQ? Thanks in advance.


                          • #14
                            No, the Wig files do not contain the sequences, just the coverage.
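
To illustrate, here is a small sketch that parses a made-up WIG snippet. It assumes the standard fixedStep and variableStep declarations and ignores the optional span parameter; the function name and the sample data are invented for this example. Every record it yields is a chromosome, a position, and a numeric value, with no bases or quality scores to build a FASTQ from.

```python
def parse_wig(lines):
    """Yield (chrom, position, value) tuples from WIG-formatted lines."""
    mode = chrom = None
    pos = step = 0
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue  # skip blank lines, track/browser lines, and comments
        if line.startswith(("fixedStep", "variableStep")):
            fields = line.split()
            mode = fields[0]
            params = dict(field.split("=", 1) for field in fields[1:])
            chrom = params["chrom"]
            if mode == "fixedStep":
                pos = int(params["start"])
                step = int(params.get("step", 1))
            continue
        if mode == "fixedStep":
            yield chrom, pos, float(line)      # one value per line
            pos += step
        elif mode == "variableStep":
            position, value = line.split()     # "position value" pairs
            yield chrom, int(position), float(value)


wig_text = """track type=wiggle_0 name="demo"
fixedStep chrom=chr1 start=100 step=1
3
5
variableStep chrom=chr2
10 1.5
"""
for record in parse_wig(wig_text.splitlines()):
    print(record)
```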
