  • PE SOLiD reads alignment by bwa

    Dear users,
    I have PE reads from SOLiD to align to the human genome.
    I have these files:

    - solid_data_F3.csfasta
    - solid_data_F3_QV.qual
    - solid_data_F5-P2.csfasta
    - solid_data_F5-P2_QV.qual

    I want to convert these files to fastq using bwa0.5.7/solid2fastq.pl.
    The script runs for F3 only; with F5-P2 it fails (it says "Fail to open solid_data_F5-P2_F3.csfasta").

    So, if I use:
    > solid2fastq.pl solid_data_ solid_data_total
    I get only one fastq file for F3 and F5-P2. Does it include all the paired-end reads?

    The fastq is in colorspace, but the colors are represented as ACTG.
    So, to index the genome and run the bwa alignment, do I have to use the -c option?

    Thanks a lot,
    ME

  • #2
    It looks like the script doesn't support the paired-end protocol. Report it to the BWA mailing list or to the author (username: lh3).


    • #3
      If you want to use the script with the PE data, make this change at line 98 of the script (the old line is commented out and the new one follows it):

      #if (/^>(\d+)_(\d+)_(\d+)_[FR]3/) {
      if (/^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)/) {

      Also rename the F5-P2 files to R3:

      solid_data_F5-P2.csfasta -> solid_data_R3.csfasta
      solid_data_F5-P2_QV.qual -> solid_data_R3_QV.qual

      Also, bfast has a solid2fastq (in the git repo) that now supports bwa output and
      handles PE data. You can use that too.
      -drd
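
A minimal sketch of why the suffix in that pattern is better written as an alternation group than inside square brackets. It uses Python's `re` module, whose syntax matches Perl's for this pattern. The read names and the `panel_xy` helper are made up for illustration and are not part of solid2fastq.pl.

```python
import re

# Inside [...] the text is a character class: it matches ONE character from
# the set (F, 3, |, R, 2, and the range 5 through P), not a whole suffix.
CHAR_CLASS = re.compile(r"^>(\d+)_(\d+)_(\d+)_[F3|R3|F5-P2]")

# A non-capturing group with alternation matches exactly one of the three
# suffixes and leaves the three numeric captures unchanged.
GROUP = re.compile(r"^>(\d+)_(\d+)_(\d+)_(?:F3|R3|F5-P2)")


def panel_xy(header, pattern):
    """Return the three numeric fields of a read name, or None if no match."""
    m = pattern.match(header)
    return m.groups() if m else None


if __name__ == "__main__":
    # Hypothetical read names, for illustration only.
    for name in (">1_2_3_F3", ">1_2_3_F5-P2", ">1_2_3_F5-BC"):
        print(name, panel_xy(name, CHAR_CLASS), panel_xy(name, GROUP))
```

Both patterns accept the F3 and F5-P2 names, so the bracketed form appears to work on this data. Only the grouped form rejects a name with a different tag, such as the last one.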


      • #4
        Thanks very much for your help, Drio!
        I'll try it and let you know if the program runs!


        • #5
          You will lose a lot of information by converting the color space files to fasta. You would be better off aligning the SOLiD reads to a color space reference.

          John


          • #6
            There is information lost because of the dinucleotide 'color' encoding but the alignments are performed in CS (http://seqanswers.com/forums/showthread.php?t=5245). BWA will do a good job aligning those reads.
            -drd
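
On the earlier question about colors showing up as ACTG in the fastq: this looks like the usual "double encoding", where each color digit is simply relabeled with a letter so that base-space tools can parse the read. The sketch below assumes the mapping 0->A, 1->C, 2->G, 3->T and '.'->N. The function name is hypothetical, and any handling of the primer base that a real converter does is not modeled here.

```python
# Assumed relabeling of SOLiD color calls; '.' marks a missing call.
DOUBLE_ENCODE = str.maketrans("0123.", "ACGTN")


def double_encode(colors):
    """Relabel a string of color calls (digits 0-3 and '.') with letters.

    The letters are still colors, not nucleotides: each one describes a
    transition between two adjacent bases, so the result cannot be read
    as a base-space sequence.
    """
    return colors.translate(DOUBLE_ENCODE)


if __name__ == "__main__":
    print(double_encode("0123."))
```

Because the letters still stand for colors, the reference has to be indexed in color space as well, so that reads and reference are compared in the same encoding.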


            • #7
              We use a modified BWA in our NextGENe software that adds a couple of steps to the BWA alignment, making the alignment much more robust. Additionally, we use a fully annotated color space reference, so no information is lost. If you would like to try it, we can supply a trial.
              John


              • #8
                Cool, any plans to integrate that into the main bwa repo?
                -drd


                • #9
                  Thanks, Elena and drio!

                  This was useful. I am trying to run the SOLiD PE barcoded analysis.
                  I have just submitted it to run.
                  I hope this works.
