  • How to choose aligners?

    Dear all:

    For certain reasons, I need to align my short reads under the following conditions:

    refseq: ATCCGATTGCCTCCAAATGCCCTAAATCGTA
    my_sq: ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A

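    (1) For the first 18 nt from the 5' end (shown in red in the original post):
    a. use them as the seed
    b. allow 3 mismatches in the seed (the first three "-")
    (2) Allow 6 mismatches in total (the remaining mismatches are the last three "-", shown in blue in the original post)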
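
The two conditions above can be written down as a small check. This is a minimal sketch, assuming an ungapped, equal-length comparison in which "-" simply counts as a mismatching character; the function names `mismatch_positions` and `passes_filter` are hypothetical and do not come from any aligner.

```python
def mismatch_positions(ref, read):
    """Return the 0-based positions where read differs from ref (ungapped)."""
    if len(ref) != len(read):
        raise ValueError("ungapped comparison needs equal-length sequences")
    return [i for i, (r, q) in enumerate(zip(ref, read)) if r != q]


def passes_filter(ref, read, seed_len=18, max_seed_mm=3, max_total_mm=6):
    """True if the read has at most max_seed_mm mismatches in the first
    seed_len bases and at most max_total_mm mismatches overall."""
    positions = mismatch_positions(ref, read)
    seed_mm = sum(1 for p in positions if p < seed_len)
    return seed_mm <= max_seed_mm and len(positions) <= max_total_mm


refseq = "ATCCGATTGCCTCCAAATGCCCTAAATCGTA"
my_sq = "ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A"  # "-" marks a mismatch, not a gap

print(mismatch_positions(refseq, my_sq))  # [4, 7, 13, 22, 26, 29]
print(passes_filter(refseq, my_sq))       # True
```

For the example read, three mismatches fall inside the 18-nt seed and three outside it, so it sits exactly at both limits.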

    I have looked at several aligners, but I cannot find a suitable one.
    For example, if I use bowtie with -n and allow 3 mismatches in the seed, I cannot find a parameter that allows 6 mismatches in total.

    Do you have any suggestions? Thanks a lot!!

    Best,
    Yi
    Yi John Huang (PhD student)
    886-3-2118800 ext. 3731
    Graduate Institute of Biomedical Science, Chang Gung University

  • #2
    Give Mosaik a try - adjust -mm and -gop/-gap.

    • #3
      Originally posted by hajime
      Dear all:

      For certain reasons, I need to align my short reads under the following conditions:

      refseq: ATCCGATTGCCTCCAAATGCCCTAAATCGTA
      my_sq: ATCC-AT-GCCTC-AAATGCCC-AAA-CG-A

      Are these deletions or mismatches? If they are deletions, you can't use bowtie: it handles substitution mismatches only, at least in version 1.

      BWA may do what you want, but generally NGS aligners don't handle reads with many mismatches / indels all that well.

      • #4
        @Kga1978: Thanks for your suggestion. I'll try it.

        @Tonybolger: the "-" indicates mismatches only (not small indels or other variant types). Sorry for the confusion. By the way, I'm not sure how BWA can do what I want. Could you give me more detail about that? Thanks a lot.

        • #5
          Originally posted by hajime
          @Kga1978: Thanks for your suggestion. I'll try it.

          @Tonybolger: the "-" indicates mismatches only (not small indels or other variant types). Sorry for the confusion.

          OK, but normally '-' is used as a gap filler when there's a deletion in a sequence.

          Originally posted by hajime
          By the way, I'm not sure how BWA can do what I want. Could you give me more detail about that? Thanks a lot.

          A quick look at the BWA manual suggests that the combination "-n 6 -l 18 -k 3" will probably do what I think you want (you might also need or want to disable gaps).
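
As a rough illustration of that suggestion, the sketch below only assembles a `bwa aln` command line and prints it. The options `-n` (maximum edit distance), `-l` (seed length), `-k` (maximum edit distance in the seed) and `-o` (maximum number of gap opens) are taken from the BWA manual; using `-o 0` to disable gaps is an assumption about what "disable gaps" should mean here, and the helper name `bwa_aln_command` and the file names `ref.fa` and `reads.fq` are placeholders.

```python
def bwa_aln_command(ref_fasta, reads_fastq, max_edits=6, seed_len=18,
                    max_seed_edits=3, allow_gaps=False):
    """Build (but do not run) a 'bwa aln' argument list."""
    cmd = ["bwa", "aln",
           "-n", str(max_edits),        # maximum edit distance over the read
           "-l", str(seed_len),         # take the first seed_len bases as seed
           "-k", str(max_seed_edits)]   # maximum edit distance within the seed
    if not allow_gaps:
        cmd += ["-o", "0"]              # assumption: zero gap opens = no gaps
    cmd += [ref_fasta, reads_fastq]
    return cmd


cmd = bwa_aln_command("ref.fa", "reads.fq")
print(" ".join(cmd))
# bwa aln -n 6 -l 18 -k 3 -o 0 ref.fa reads.fq
```

`bwa aln` writes its result to standard output, so in practice the command would be redirected to a `.sai` file and followed by `bwa samse`, after the reference has been indexed with `bwa index`. With gaps disabled, every edit counted by `-n` should be a substitution, but it is worth confirming that on test reads.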

          • #6
            Originally posted by tonybolger
            OK, but normally '-' is used as a gap filler when there's a deletion in a sequence.

            A quick look at the BWA manual suggests that the combination "-n 6 -l 18 -k 3" will probably do what I think you want (you might also need or want to disable gaps).

            Thanks for your kind and quick reply.

            Actually, I did read the BWA manual before writing my last post.
            However, I think I misunderstood the meaning of the "-n" parameter.

            According to the description on the BWA website:
            -------------------------
            -n NUM Maximum edit distance if the value is INT, or the fraction of missing alignments given 2% uniform base error rate if FLOAT. In the latter case, the maximum edit distance is automatically chosen for different read lengths.
            -------------------------

            Based on your reply, I'd like to know whether you consider "-n 6" with no gaps allowed to be the same as allowing 6 mismatches in the read.

            Thanks again!

            • #7
              Originally posted by hajime
              Based on your reply, I'd like to know whether you consider "-n 6" with no gaps allowed to be the same as allowing 6 mismatches in the read.

              I guess that it is, but I would suggest you test it to make sure it does what you want.

              • #8
                I'm not certain if BFAST would work with such a small seed (you would need to modify the indexes), but it's very tolerant of mismatches.

                • #9
                  Because I'm still waiting for the sequencing to finish, I cannot test any of the suggestions on real data right now. I'll try to generate some fake reads or download other people's data for further testing.
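
One way to make such fake reads is to copy a reference window and substitute bases at known positions, so that the expected outcome of each read is known in advance. The following is a minimal sketch, assuming the 31-nt example from the first post as the reference, a fixed random seed, and a constant placeholder quality string; the names `simulate_read` and `fastq_record` are hypothetical.

```python
import random

BASES = "ACGT"


def simulate_read(ref, seed_len, n_seed_mm, n_rest_mm, rng):
    """Copy ref and substitute bases at random positions: n_seed_mm inside
    the first seed_len bases and n_rest_mm in the remainder."""
    positions = rng.sample(range(seed_len), n_seed_mm)
    positions += rng.sample(range(seed_len, len(ref)), n_rest_mm)
    read = list(ref)
    for p in positions:
        # Always pick a base different from the reference base.
        read[p] = rng.choice([b for b in BASES if b != ref[p]])
    return "".join(read), sorted(positions)


def fastq_record(name, seq):
    """Format one FASTQ record with a constant placeholder quality string."""
    return "@{}\n{}\n+\n{}\n".format(name, seq, "I" * len(seq))


rng = random.Random(0)  # fixed seed so the test set is reproducible
refseq = "ATCCGATTGCCTCCAAATGCCCTAAATCGTA"

# (seed mismatches, other mismatches): within limits, too many in the seed,
# and too many in total for the 3-in-seed / 6-in-total rule.
cases = [(3, 3), (4, 2), (3, 4)]
simulated = []
for n_seed, n_rest in cases:
    read, positions = simulate_read(refseq, 18, n_seed, n_rest, rng)
    name = "sim_seed{}_rest{}".format(n_seed, n_rest)
    simulated.append((name, read, positions))

print("".join(fastq_record(name, read) for name, read, _ in simulated), end="")
```

Only the first case should align under the intended settings. If the other two also align, or the first one does not, the aligner parameters are not doing what was expected.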

                  Thanks for all the suggestions. Thank you guys!!

                  • #10
                    Take a look at http://seqanswers.com/forums/showthread.php?t=15200
                    Marco
