  • MurielGB
    Member
    • Oct 2013
    • 51

    Converting SRA to FASTQ with fastq-dump: problem with read length

    Hello,
    I have Illumina paired-end reads of length 76 bp.
    The problem is that when I use fastq-dump to get two files with the paired reads separated, it splits the reads into 101 bp and 51 bp rather than 76 + 76.
    I tried the options --split-files, --split-spot and --split-3, and always get the same result.
    I also tried different fastq-dump versions: 1, 2, 2.3.4 and 2.3.5.2.
    Do you have any idea how I can get the 76 + 76 split?
    Thanks!
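
One quick way to confirm what fastq-dump actually wrote to each file is to tally the read lengths. The following is a minimal Python sketch, assuming plain (uncompressed) FASTQ with exactly four lines per record; the function name `read_length_counts` is illustrative and not part of the SRA Toolkit.

```python
from collections import Counter


def read_length_counts(handle):
    """Count how many reads of each length appear in a FASTQ stream.

    Assumes strict 4-line records: header, sequence, '+', quality.
    """
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # the second line of every record is the sequence
            counts[len(line.rstrip("\r\n"))] += 1
    return counts
```

Calling it on an open file, for example `read_length_counts(open("reads_1.fastq"))` with a hypothetical file name, returns a `Counter` mapping each read length to the number of reads of that length. A file containing only 101 bp reads would give a single key, 101.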
  • GenoMax
    Senior Member
    • Feb 2008
    • 7142

    #2
    It is possible that the dataset you are looking at has asymmetric reads. Have you looked at the record in SRA to see if that is the case?

    • MurielGB
      Member
      • Oct 2013
      • 51

      #3
      I don't know if this is possible.
      When I convert SRA to FASTQ without any option, I get a FASTQ file with 152 bp reads.

      Here is the page where I downloaded the SRA file: http://www.ncbi.nlm.nih.gov/sra/?term=SRR1174239
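
Since the unsplit dump gives 152 bp per spot, the two candidate layouts (76 + 76 and 101 + 51) can be compared by cutting the concatenated reads manually at a chosen offset. This is only a sketch: the thread never confirms which layout is correct, the helper name `split_record` and the `/1`, `/2` suffixes are assumptions for illustration, and it does not replace getting the record fixed at SRA.

```python
def split_record(header, seq, qual, offset=76):
    """Split one concatenated FASTQ record into two mates at `offset`.

    Returns two (header, sequence, quality) tuples. The default offset
    of 76 assumes a 2 x 76 layout, which is an unverified assumption.
    """
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    if not 0 < offset < len(seq):
        raise ValueError("offset must fall inside the read")
    mate1 = (header + "/1", seq[:offset], qual[:offset])
    mate2 = (header + "/2", seq[offset:], qual[offset:])
    return mate1, mate2
```

With a 152 bp read, `offset=76` yields two 76 bp mates, while `offset=101` reproduces the 101 + 51 split that fastq-dump reports. Running FastQC on both versions would show which split gives sensible per-position profiles.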

      • GenoMax
        Senior Member
        • Feb 2008
        • 7142

        #4
        Based on the record it looks like a standard 2 x 101 bp PE dataset.

        Update: The information in SRA appears to be incorrect, since the dataset dumps as 152 bp reads (which would be 2 x 76).
        Last edited by GenoMax; 10-07-2014, 07:20 AM.

        • MurielGB
          Member
          • Oct 2013
          • 51

          #5
          Yes, I agree, but then why do I get 152 bp reads when using fastq-dump?

          • GenoMax
            Senior Member
            • Feb 2008
            • 7142

            #6
            This appears to be an asymmetric dataset (101 x 51), as originally suspected. See attached screencap.
            Attached Files

            • MurielGB
              Member
              • Oct 2013
              • 51

              #7
              OK, but when I run FastQC on the FASTQ file with the 101 bp reads, I get the attached graph, which to me is typical of reads that have been split at the wrong position.
              Attached Files

              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                It does appear that something is fishy.

                You may want to email SRA support and ask them to look into this data set. You could also alert the submitter independently.

                • MurielGB
                  Member
                  • Oct 2013
                  • 51

                  #9
                  OK, thanks a lot!

                  • aaronh
                    Member
                    • Sep 2008
                    • 46

                    #10
                    I ran into this issue with another data set. The problem was that SRA had misparsed the FASTQ files. A few emails between the help desk and the original depositor resulted in SRA reformatting the files.
