  • BWA - samse

    Hi
    I am a new user of BWA. I downloaded version 0.5.7.
    My aim is to align Illumina short reads to the human genome.
    At first I had problems with bwa samse giving a segmentation fault, as reported by others on this site. So I used the program that comes with MAQ to convert my sequences: fq_all2std.pl
    Currently my FASTQ file looks like this:
    (...)
    @15
    NGCANGGCCAGAATGTTTACTCCTTTGGCTCCGTG
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    @16
    NTAGNGCAAAACCATCAATACAAGACTATAGCTGC
    +
    &,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!
    @17
    NCCANCGTCTTGTCTCCGCATACAAGTGGGTCCAT
    +
    &/6/&/512866647/025266450585)4676%%!
    @18
    NTTCNCCAGACAGGACAGAAAGGACAGCAGGTGTC
    +
    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
    (...)
    I hope this is the format BWA requires...

    My first tests were on a small subset of my reads. They ran quickly, without problems.
    Now that I am running my complete set, i.e. one file of 5Gb and another of 10Gb, I'm not sure it is working. It has been running since Tuesday, and the only line in both .sam files is "[bwa_read_seq] 0.0% bases are trimmed."
    Please let me know whether this is normal. If not, what kind of problem might I have? How long should a complete alignment take (human reads against the human genome, with default options)?
    Thanks a lot for your help
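One way to check whether a file like the one above is well-formed is to walk through it four lines at a time. The following is only a minimal sketch, assuming plain four-line records (no wrapped sequences); `check_fastq` is a hypothetical helper, not part of BWA or MAQ. It reports records whose header or separator line is wrong, or whose sequence and quality strings differ in length. In the excerpt as pasted, the quality lines appear to be one character longer than the sequences, which is the kind of mismatch this check flags.

```python
import io


def check_fastq(handle):
    """Return (record_count, problems) for a four-line-per-record FASTQ stream."""
    problems = []
    count = 0
    while True:
        header = handle.readline()
        if not header:
            break  # end of file
        if not header.strip():
            continue  # tolerate blank lines between records
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        count += 1
        if not header.startswith("@"):
            problems.append((count, "header does not start with '@'"))
        if not plus.startswith("+"):
            problems.append((count, "third line does not start with '+'"))
        if len(seq) != len(qual):
            problems.append((count, "sequence and quality lengths differ"))
    return count, problems


# Record @16 copied from the post above, checked from memory rather than disk.
sample = io.StringIO(
    "@16\n"
    "NTAGNGCAAAACCATCAATACAAGACTATAGCTGC\n"
    "+\n"
    "&,;,&,;;;;98;9;;;;;9;;888;;;9;99;;;!\n"
)
print(check_fastq(sample))
```

For a real file, pass `open("reads.fq")` (a placeholder name) instead of the in-memory sample.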

  • #2
    If I'm not mistaken, your quality scores appear to be really low. I usually convert my Illumina reads to Sanger FASTQ using the 'sol2sanger' feature in the Maq package. You might want to try it out and see how the quality scores compare.
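Before converting, it can help to check which quality encoding a file already uses. The sketch below is a rough heuristic, not what sol2sanger does internally: Sanger FASTQ stores qualities as ASCII code minus 33, while Solexa and early Illumina files use an offset of 64, with characters no lower than ';' (ASCII 59). `quality_range` and `guess_offset` are hypothetical helper names. For reference, '%' is ASCII 37, which reads as Q4 under the +33 offset.

```python
def quality_range(quality_lines):
    """Return the lowest and highest ASCII codes seen in the quality strings."""
    codes = [ord(ch) for line in quality_lines for ch in line.strip()]
    return min(codes), max(codes)


def guess_offset(lowest, highest):
    """Guess the ASCII offset (33 or 64); return None when the range is ambiguous."""
    if lowest < 59:
        return 33  # characters below ';' cannot occur with a +64 offset
    if highest > 74:
        return 64  # above 'J' is unusual for +33 data from this era
    return None    # the range fits both encodings


# Quality strings shaped like the ones in the first post.
lines = ["%%%%%%%%%%!", "&,;,&,;;;;98;9"]
lowest, highest = quality_range(lines)
print(lowest, highest, guess_offset(lowest, highest))
```

A result of 33 suggests the file is already in Sanger encoding, in which case converting it again would shift the scores a second time.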

    Also, are you using the -t option of 'bwa aln' in order to take advantage of multiple CPUs?

    I believe that I was able to align ~7GB of Illumina reads (76bp SE) to the whole human genome in 3-4 hours on an 8-core workstation running Ubuntu Linux.
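The two single-end steps mentioned in this thread (`bwa aln` with `-t`, then `bwa samse`) can be scripted roughly as follows. This is a sketch under assumptions: the file names are placeholders, the reference is assumed to have been indexed already with `bwa index`, and both commands are assumed to write their result to standard output, as in the BWA 0.5.x usage messages. `bwa_single_end_steps`, `as_shell` and `run_steps` are hypothetical helpers.

```python
import shlex
import subprocess


def bwa_single_end_steps(reference, fastq, prefix, threads=1):
    """Build (argv, output_file) pairs for 'bwa aln' followed by 'bwa samse'."""
    sai = prefix + ".sai"
    sam = prefix + ".sam"
    aln = ["bwa", "aln", "-t", str(threads), reference, fastq]
    samse = ["bwa", "samse", reference, sai, fastq]
    return [(aln, sai), (samse, sam)]


def as_shell(steps):
    """Render the steps as shell command lines, for logging or review."""
    return [
        " ".join(shlex.quote(arg) for arg in argv) + " > " + shlex.quote(out)
        for argv, out in steps
    ]


def run_steps(steps):
    """Run each step in order, redirecting stdout to its output file."""
    for argv, out in steps:
        with open(out, "wb") as fh:
            subprocess.run(argv, stdout=fh, check=True)


# Placeholder file names; 8 threads as on the 8-core workstation described above.
steps = bwa_single_end_steps("hg.fa", "reads.fq", "reads", threads=8)
for line in as_shell(steps):
    print(line)
```

Only `bwa aln` takes the thread count here; `run_steps` is left uncalled so the example does not require BWA to be installed.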

    • #3
      Thanks for your answer, and sorry for the delay in getting back to you.
      Yes, the quality of these reads is not the best in the set... they were just a few examples.
      In the end the problem was with the FASTQ file.
      It is certainly quicker now... and I have results.
      Have a good day, and thanks again.

      • #4
        same problem

        Hi guys,

        I have exactly the same problem!!
        Giverny, I would be very grateful if you could describe what the problem with your FASTQ file was!

        best ro

        • #5
          Hi, how did you fix the FASTQ format problem? I am using Maq's sol2sanger program and still get a segmentation fault. Please explain. Thanks.

          • #6
            Hi, did anybody figure out how to solve this problem?
            I have the same problem, but I couldn't find anything wrong with my FASTQ files.
            Thanks

            • #7
              I had BWA segmentation fault issues with bwa aln. It turned out my reference fasta file was somehow damaged (I used cat to combine the equine chromosome files into one). Once I received a working genome file, it worked without issues.
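A combined reference can be sanity-checked before indexing. The sketch below assumes one common way that `cat` damages a multi-FASTA file: an input without a trailing newline, so the next header ends up fused onto a sequence line. Whether that is what happened in this case is not known. `check_fasta` is a hypothetical helper, and it also flags empty records, duplicate names and non-IUPAC characters.

```python
import io

IUPAC = set("ACGTUNRYKMSWBDHVacgtunrykmswbdhv")


def check_fasta(handle):
    """Return a list of (line_number, message) problems found in a FASTA stream."""
    problems = []
    seen = set()
    name = None   # name of the record currently being read
    length = 0    # bases collected so far for that record
    lineno = 0
    for lineno, raw in enumerate(handle, start=1):
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None and length == 0:
                problems.append((lineno, "record %r has no sequence" % name))
            fields = line[1:].split()
            name = fields[0] if fields else ""
            if name in seen:
                problems.append((lineno, "duplicate record name %r" % name))
            seen.add(name)
            length = 0
        elif name is None:
            problems.append((lineno, "sequence data before the first header"))
        elif ">" in line:
            problems.append((lineno, "header fused onto a sequence line"))
        elif set(line) - IUPAC:
            problems.append((lineno, "unexpected characters in sequence"))
        else:
            length += len(line)
    if name is not None and length == 0:
        problems.append((lineno, "record %r has no sequence" % name))
    return problems


# Two tiny chromosomes joined without a newline between them.
fused = io.StringIO(">chr1\nACGT>chr2\nGGCC\n")
print(check_fasta(fused))
```

An empty list means none of these particular problems were found; it does not prove the file is otherwise intact.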
