  • BWA - samse

    Hi
    I am a new user of BWA and downloaded version 0.5.7.
    My aim is to align Illumina short reads to the human genome.
    At first I had problems with a bwa samse segmentation fault, as reported by others on this site, so I used the script that comes with MAQ, fq_all2std.pl, to convert my sequences.
    Currently my FASTQ file looks like this:
    (...)
    @15
    NGCANGGCCAGAATGTTTACTCCTTTGGCTCCGTG
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    @16
    NTAGNGCAAAACCATCAATACAAGACTATAGCTGC
    +
    &,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!
    @17
    NCCANCGTCTTGTCTCCGCATACAAGTGGGTCCAT
    +
    &/6/&/512866647/025266450585)4676%%!
    @18
    NTTCNCCAGACAGGACAGAAAGGACAGCAGGTGTC
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    (...)
    I hope this is the format BWA requires...

    My first tests were on a small subset of my reads. They ran quickly, without problems.
    Now that I am running my complete set, i.e. one file of 5Gb and another of 10Gb, I'm not sure it is working. It has been running since Tuesday, and the only line in both .sam files is "[bwa_read_seq] 0.0% bases are trimmed."
    Please let me know whether this is normal and, if not, what kind of problem I might have. How long should a complete alignment take (human reads against the human genome, with default options)?
    Thanks a lot for your help
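
Whether a file is in the four-line FASTQ layout can be checked mechanically before starting a long run. The sketch below is only an illustration, not part of BWA or MAQ: the function name `check_fastq` is hypothetical, and it assumes unwrapped records (one sequence line and one quality line per read). Counting characters in the excerpt as posted, the quality lines of reads @16 and @17 appear to be one character longer than their sequences (36 versus 35), which is the kind of mismatch this check reports, although that could also be a copy-and-paste artifact.

```python
import io


def check_fastq(handle, max_records=None):
    """Return (record_number, message) pairs for malformed 4-line FASTQ records.

    Assumes unwrapped FASTQ: header, sequence, '+' line, quality, one line each.
    """
    problems = []
    record = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        record += 1
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        if not header.startswith("@"):
            problems.append((record, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((record, "third line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append(
                (record, f"sequence length {len(seq)} != quality length {len(qual)}")
            )
        if max_records is not None and record >= max_records:
            break
    return problems


# Small in-memory demo; for a real file, pass open("reads.fq") instead.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nACGT\n+\nIIIII\n")
for number, message in check_fastq(demo):
    print(f"record {number}: {message}")
```

Running it with `max_records` set to a few thousand gives a quick first check without reading a multi-gigabyte file in full.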

  • #2
    If I'm not mistaken, your quality scores appear to be really low. I usually convert my Illumina reads to Sanger FASTQ using the 'sol2sanger' feature in the Maq package. You might want to try it out and see how the quality scores compare.
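
To compare quality scores before and after a conversion, it helps to see which ASCII offset a file seems to use. The following is a rough heuristic sketch, not a feature of Maq or BWA; the helper names (`quality_range`, `guess_offset`, `decode`) are made up for illustration. It relies on the fact that Sanger FASTQ encodes scores as Phred+33 (starting at '!'), while the older Solexa/Illumina encodings use an offset of 64 and do not go below ';' (ASCII 59).

```python
def quality_range(quality_lines):
    """Return (lowest, highest) ASCII code seen in the given quality strings."""
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    return min(codes), max(codes)


def guess_offset(lowest_code):
    """Heuristic: characters below ';' (59) can only occur with offset 33.

    If nothing below 59 is seen, offset 64 is likely but not guaranteed,
    because a Phred+33 file with only high scores would look the same.
    """
    return 33 if lowest_code < 59 else 64


def decode(quality_line, offset):
    """Convert a quality string into a list of integer scores."""
    return [ord(ch) - offset for ch in quality_line.strip()]


# Quality strings copied from reads @16 and @17 in the first post.
lines = [
    "&,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!",
    "&/6/&/512866647/025266450585)4676%%!",
]
lowest, highest = quality_range(lines)
offset = guess_offset(lowest)
print("ASCII range:", lowest, highest, "-> assumed offset", offset)
print("scores for first line:", decode(lines[0], offset))
```

On the posted lines the characters '!', '%' and '&' fall below 59, so the heuristic reads them as Phred+33, where '%' is a score of 4 and '!' a score of 0.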

    Also, are you using the -t option of 'bwa aln' in order to take advantage of multiple CPUs?

    I believe that I was able to align ~7GB of Illumina reads (76bp SE) to the whole human genome in 3-4 hours on an 8-core workstation running Ubuntu Linux.
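
For reference, the single-end workflow discussed here is two steps: `bwa aln` (where `-t` sets the number of threads) followed by `bwa samse`. The sketch below only assembles and runs those two commands from Python. The file names and the function names are placeholders, and it assumes that `bwa` is on the PATH and that the reference has already been indexed with `bwa index`.

```python
import subprocess


def bwa_single_end_steps(reference, fastq, prefix, threads=8):
    """Return [(argv, output_path), ...] for bwa aln followed by bwa samse.

    Both commands write to standard output, so each step is paired with
    the file its output should be redirected to.
    """
    sai = f"{prefix}.sai"
    sam = f"{prefix}.sam"
    return [
        (["bwa", "aln", "-t", str(threads), reference, fastq], sai),
        (["bwa", "samse", reference, sai, fastq], sam),
    ]


def run_steps(steps):
    """Run each step in order, stopping at the first non-zero exit status."""
    for argv, output_path in steps:
        with open(output_path, "wb") as out:
            subprocess.run(argv, stdout=out, check=True)


# Placeholder paths; replace with real files before calling run_steps().
steps = bwa_single_end_steps("hg.fa", "reads.fq", "reads", threads=8)
for argv, output_path in steps:
    print(" ".join(argv), ">", output_path)
```

Because `check=True` raises on a non-zero exit status, a crash such as a segmentation fault in either step stops the run immediately instead of leaving a job that looks as if it is still working.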



    • #3
      Thanks for your answer, and sorry for the delay in getting back to you.
      Yes, the quality of these reads is not the best in my set; they were just a few examples.
      In the end, the problem was related to the FASTQ file.
      It is certainly quicker now, and I have results.
      Have a good day, and thanks again.



      • #4
        same problem

        Hi guys,

        I have exactly the same problem!!
        Giverny, I would be very grateful if you could describe what the problem with your FASTQ file was!

        best ro



        • #5
          Hi, how did you fix the problem with the FASTQ format? I am using MAQ's sol2sanger program and still get a segmentation fault. Please explain. Thanks.



          • #6
            Hi, did anybody figure out how to solve this problem?
            I have the same problem, but I couldn't find anything wrong with my FASTQ files.
            Thanks



            • #7
              I had BWA segmentation fault issues with bwa aln. It turned out my reference fasta file was somehow damaged (I used cat to combine the equine chromosome files into one). Once I received a working genome file, it worked without issues.
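
One way that combining FASTA files with cat can go wrong is when a file lacks a final newline, so the next header ends up fused onto the previous sequence line. Whether that is what happened in this case is not known. The sketch below is a hypothetical sanity check (the name `check_fasta` is made up, and it is not part of BWA) that looks for that pattern along with a few other obvious defects in a nucleotide FASTA file.

```python
import io

# IUPAC nucleotide codes in both cases; anything else is reported.
ALLOWED = set("ACGTUNRYKMSWBDHVacgtunrykmswbdhv")


def check_fasta(handle):
    """Return (line_number, message) pairs for suspicious lines in a FASTA file."""
    problems = []
    names = set()
    seen_header = False
    for lineno, raw in enumerate(handle, start=1):
        line = raw.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            seen_header = True
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if not name:
                problems.append((lineno, "empty sequence name"))
            elif name in names:
                problems.append((lineno, f"duplicate sequence name {name!r}"))
            names.add(name)
        elif ">" in line:
            problems.append(
                (lineno, "'>' inside a sequence line (missing newline between files?)")
            )
        elif not seen_header:
            problems.append((lineno, "sequence data before the first header"))
        elif not set(line) <= ALLOWED:
            problems.append((lineno, "unexpected characters in sequence line"))
    return problems


# In-memory demo of two files joined without a newline in between.
joined = io.StringIO(">chr1\nACGT>chr2\nGGNN\n")
for lineno, message in check_fasta(joined):
    print(f"line {lineno}: {message}")
```

A clean result does not prove the reference is fine, but a reported line is a cheap hint to look at before re-running the index and alignment.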
