
  • indels using single end short reads!

    We had a sample with known 4bp deletion, but no tool would help me detect that...

    any suggestions?

    SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
    --
    bioinfosm

  • #2
    Originally posted by bioinfosm
    We had a sample with known 4bp deletion, but no tool would help me detect that...

    any suggestions?

    SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
    SOAP may do it. It seems that when you compile it, you set the largest gap size that you are later allowed to ask for on the command line.

    "3) Maximum gap size
    -DMAXGAP=3
    Maximum size of a gap allowed in a read, then "-g" option during running should not exceed this definition."

    On the home page, they show 3 as an example, but 4 might work. I don't know how much it will slow down SOAP to allow it to try large gaps.

    I know it finds plenty of 2 bp insertions when I use -g 2.
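The relationship between the compile-time limit and the run-time "-g" value can be sketched as a small check before launching SOAP. This is only an illustration: the helper name `build_soap_command` and the `compiled_maxgap` parameter are hypothetical, and the flags (-a, -d, -o, -s, -g) are simply the ones quoted in post #10 of this thread, not a full description of SOAP's interface.

```python
def build_soap_command(reads, reference, output, seed=10, gap=4, compiled_maxgap=4):
    """Return a SOAP argument list, refusing a -g value above the compiled MAXGAP.

    compiled_maxgap is whatever value was passed as -DMAXGAP at build time;
    this helper cannot detect it, so the caller has to supply it (assumption).
    """
    if gap > compiled_maxgap:
        raise ValueError(
            f"-g {gap} exceeds the compiled -DMAXGAP={compiled_maxgap}; "
            "rebuild SOAP with a larger MAXGAP or lower -g"
        )
    return [
        "soap",
        "-a", reads,        # query reads
        "-d", reference,    # reference sequence
        "-o", output,       # output prefix
        "-s", str(seed),    # seed size, as used in post #10
        "-g", str(gap),     # largest gap allowed in a read
    ]


if __name__ == "__main__":
    print(" ".join(build_soap_command("input", "reference", "prefix")))
```

The list can be handed to `subprocess.run` once SOAP is on the PATH; a binary built with -DMAXGAP=3 would need `compiled_maxgap=3`, in which case the default `gap=4` is rejected before anything runs.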



    • #3
      Indels of 4 bases are on the border of what I would consider "sane" when aligning or assembling short sequences. For example, a 36-mer aligned against the same sequence with a 4-base deletion gives a score ratio (= score/expected_score) of barely above 70%.
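To see why a 4-base gap costs so much in a 36-mer, here is a rough worked example. It assumes a simple linear scoring scheme (a fixed reward per matching base, a fixed penalty per gapped base); the function name and the penalty values are illustrative assumptions and are not MIRA's actual scoring parameters.

```python
def score_ratio(read_len, gap_len, match=1.0, gap_penalty=1.5):
    """Score ratio = score / expected_score under a toy linear scheme.

    expected_score is the score of the read aligned to itself (all matches).
    The gapped alignment keeps read_len - gap_len matching bases and pays
    gap_penalty for each gapped base. Both weights are assumptions.
    """
    matched = read_len - gap_len
    score = matched * match - gap_len * gap_penalty
    expected = read_len * match
    return score / expected


if __name__ == "__main__":
    # Compare a 36-mer carrying gaps of increasing size.
    for gap in (1, 2, 4):
        print(gap, round(score_ratio(36, gap), 3))
```

With these assumed weights, a 4-base gap in a 36-mer gives 26/36, roughly 0.72, which is in the same range as the figure quoted above; a 1- or 2-base gap stays well above it.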

      I normally allow only 1 or 2 errors in Solexa mapping assemblies, but I quickly hacked together a change that will allow you to find indels or base changes with up to 4 bases in a Solexa mapping assembly. Grab http://www.chevreux.org/tmp/mira_2.9...x86_64.tar.bz2
      and run the Solexa demo. Have a look at the results in gap4 and decide for yourself whether this would fit your needs.

      Warning: Work in progress. Works for me, but not necessarily for you

      Regards,
      B.



      • #4
        myrialign

        Maybe MyriAlign would be of use to you?
        Savannah is a central point for development, distribution and maintenance of free software, both GNU and non-GNU.



        • #5
          SOAP worked nicely on the data... Thanks to the person who shared his script to use soap results and generate indel calls

          I was able to see the 4bp known deletion in the sample

          Torst - are you the author of Myrialign? I will check it out as well
          --
          bioinfosm



          • #6
            Depending on your coverage, you can try assembling the reads, then simply blasting the contigs against the genome. I know of a few groups trying to do this, but I haven't heard of success, so I'm curious if you try this how far you get.

            -mark



            • #7
              Aligning with Indels

              I've just finished a new aligner that will do indels up to 7bp. I don't have a web site for downloading it, but if you'd like to try it, email novoalign @ gmail.com and I'll send you a copy. It's also at least as fast as the best of the other aligners.



              • #8
                Originally posted by bioinfosm
                SOAP worked nicely on the data... Thanks to the person who shared his script to use soap results and generate indel calls

                I was able to see the 4bp known deletion in the sample
                Would said person be willing to share the scripts for using soap results? thanks in advance.



                • #9
                  Novoalign and novopaired will do gapped alignments and are a fair bit faster than SOAP.
                  I've just released V1.03. This update improves quality scores for novopaired and also fixes an illegal instruction fault reported by one user.
                  You can download it at www.novocraft.com
                  I've also changed the license terms so it's free for any non-profit, even if you don't publish in open journals.
                  Colin



                  • #10
                    Originally posted by ECO
                    Would said person be willing to share the scripts for using soap results? thanks in advance.
                    Sorry but I never noticed your message in the new posts!

                    Sure, I would be happy to share. I ran SOAP, then used a Perl parsing script to extract the indel calls from the results.

                    soap -a input -d reference -o prefix -s 10 -g 4

                    The parser is modified from Liu's script (BGI). PM me and I will mail it to you, but I would rather not post it here.
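The parser itself is not posted in this thread, so the following is not that script. It is only a minimal sketch of the general idea of turning gapped alignments into indel calls: count how many reads support the same gap at the same position and keep the well-supported ones. The input tuples `(chrom, pos, kind, length)` and the `min_reads` threshold are hypothetical and do not reflect SOAP's actual output format, which would need its own parsing step.

```python
from collections import Counter


def call_indels(gapped_hits, min_reads=3):
    """Tally gapped alignments and return indels seen in at least min_reads reads.

    gapped_hits: iterable of (chrom, pos, kind, length) tuples, one per gapped
    read, where kind is "ins" or "del". This tuple layout is an assumption made
    for illustration only.
    """
    support = Counter(gapped_hits)
    return sorted(
        (hit, n_reads) for hit, n_reads in support.items() if n_reads >= min_reads
    )


if __name__ == "__main__":
    hits = [("chr1", 1000, "del", 4)] * 5 + [("chr1", 2000, "ins", 1)]
    for (chrom, pos, kind, length), n_reads in call_indels(hits):
        print(chrom, pos, kind, length, n_reads)
```

Requiring several independent reads at the same position is one simple way to separate a real 4 bp deletion from the scattered 'novel' 1 or 2 base indels mentioned in the first post; the right threshold depends on coverage.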

                    sm
                    --
                    bioinfosm



                    • #11
                      Originally posted by bioinfosm
                      We had a sample with known 4bp deletion, but no tool would help me detect that...

                      any suggestions?

                      SSAHA supposedly does gapped alignment, but it gave me some 'novel' 1 or 2 base indels... not the one we know
                      Hi!

                      Glad to read that you managed the task. Is it from a mammalian genome? If so, would you be willing to share your data set with us (an NDA can of course be arranged)?
                      We would love to test our mapping on that challenge.

                      Klaus

