  • Originally posted by apostrophe View Post
    ...does Bowtie support FASTA nucleic acid codes that code for two bases, such as Y = T or C for the genome? Thanks in advance.
    Bowtie will index and align against references containing non-A/C/G/T characters, but alignments overlapping non-A/C/G/T characters in the reference are invalid and won't be reported.

    Out of curiosity, what's the behavior you would like? E.g. if a C in a read were to align against a Y in the genome, would you like that to be considered a match, incurring no penalty against the alignment?

    Thanks,
    Ben

    Comment


    • I was hoping to use Bowtie to align a large number of reads against a genome that has SNPs encoded in the format described above. If that isn't possible, I suppose I'll have to find some other method of alignment.

      Thanks for your quick reply!

      Comment


      • Hi Ben,

        Thanks for your support.

        I am trying to compare the eland and Bowtie results. Many reads are not being mapped by Bowtie, whereas eland reports them as unique tags without any mismatch. An example is as follows:

        Code:
        >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG  U0  1   0  0  chr8.fa 37178235  R DD
        Whereas the Bowtie result for the above read is as follows:
        Code:
        ./bowtie -a -m 10 -n 2 --strata --best -p 15 ../Genome/hg18/hg18 -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
        No results
        I built the reference genome index with default parameters.
        Code:
        ./bowtie-build <reference_in> <index_basename>
        Why is Bowtie not reporting the mapping?
        Please let me know whether any of the parameters need to be changed.

        I would also like to know how Bowtie handles "N"s in the query reads.

        Thanks.

        Comment


        • Hi seq_GA,

          Originally posted by seq_GA View Post
          I am trying to compare the eland and Bowtie results. Many reads are not being mapped by Bowtie, whereas eland reports them as unique tags without any mismatch. An example is as follows:

          Code:
          >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG  U0  1   0  0  chr8.fa 37178235  R DD
          Whereas the Bowtie result for the above read is as follows:
          Code:
          ./bowtie -a -m 10 -n 2 --strata --best -p 15 ../Genome/hg18/hg18 -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
          No results
          Can you confirm that it ought to align by looking at the reference? I don't have the hg18 index lying around, but in the h_sapiens_asm index, your example aligns uniquely with 3 mismatches:

          Code:
          ./bowtie -a -v 3 /fs/szasmg/langmead/ebwts/h_sapiens_asm -c AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG
          0	-	gi|51511724|ref|NC_000008.9|NC_000008	37178227	CAAAAAAAAAAAATTGTGCTGAACATAAACAGACT	IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII	0	31:G>A,33:C>A,34:T>C
          Reported 1 alignments to 1 output stream(s)
          Also, if you want the output to look like Eland, you should use -v 2 instead of -n 2. -n 2 activates a Maq-like alignment policy.
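The last column of the output line above lists the mismatches. A minimal Python sketch for reading it follows, assuming the field layout is `offset:reference-base>read-base` with 0-based offsets from the 5' end of the read and tab-separated columns. The names `Mismatch` and `parse_mismatches` are made up for illustration and are not part of Bowtie.

```python
from typing import List, NamedTuple


class Mismatch(NamedTuple):
    offset: int     # assumed: 0-based offset from the 5' end of the read
    ref_base: str   # base in the reference
    read_base: str  # base in the read


def parse_mismatches(field: str) -> List[Mismatch]:
    """Parse a field such as '31:G>A,33:C>A,34:T>C'; an empty field means no mismatches."""
    mismatches = []
    for item in field.split(","):
        item = item.strip()
        if not item:
            continue
        offset, bases = item.split(":")
        ref_base, read_base = bases.split(">")
        mismatches.append(Mismatch(int(offset), ref_base, read_base))
    return mismatches


# The alignment line from the -v 3 run above, with columns separated by tabs.
line = (
    "0\t-\tgi|51511724|ref|NC_000008.9|NC_000008\t37178227\t"
    "CAAAAAAAAAAAATTGTGCTGAACATAAACAGACT\t"
    "IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII\t0\t31:G>A,33:C>A,34:T>C"
)
mms = parse_mismatches(line.split("\t")[7])
print(len(mms), [m.offset for m in mms])
```

Three mismatches is more than a limit of 2 would allow across the whole read, which fits the read only showing up once the limit is raised to 3.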

          I would also like to know how Bowtie handles "N"s in the query reads.
          An N in a read always counts as a mismatch in the alignment.
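To make the rule concrete, here is a small Python sketch of mismatch counting in which an N in the read is never a match. It is an illustration of the rule as stated, not Bowtie's code; `count_mismatches` is a hypothetical helper and the reference window is assumed to contain only A/C/G/T.

```python
def count_mismatches(read: str, ref: str) -> int:
    """Count mismatches between a read and an equally long reference window.

    An N in the read is always scored as a mismatch, whatever the reference base is.
    """
    if len(read) != len(ref):
        raise ValueError("read and reference window must have the same length")
    return sum(
        1
        for read_base, ref_base in zip(read.upper(), ref.upper())
        if read_base == "N" or read_base != ref_base
    )


print(count_mismatches("ACGT", "ACGT"))  # no Ns, identical
print(count_mismatches("ANGT", "ACGT"))  # one N in the read
```

Under this rule, a read with several Ns uses up its mismatch budget quickly, even if every called base agrees with the reference.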

          Thanks,
          Ben

          Comment


          • >read1 AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG U0 1 0 0 chr8.fa 37178235 R DD
            I'm not sure, but didn't the earlier eland only count mismatches over the first 32 bases? If so, mismatches in the final bases of the read would still allow a U0.
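One way to check this idea is to count how many of the reported mismatches fall inside a given prefix of the read. The Python sketch below uses the three offsets from the -v 3 output earlier in the thread (31, 33, 34), assuming they are 0-based from the 5' end. The prefix lengths are only candidates to compare, and `mismatches_in_prefix` is a hypothetical helper.

```python
def mismatches_in_prefix(offsets, prefix_len):
    """Count mismatch offsets that fall within the first prefix_len bases of the read."""
    return sum(1 for offset in offsets if offset < prefix_len)


offsets = [31, 33, 34]  # taken from the -v 3 alignment shown earlier
for prefix_len in (28, 32, 35):
    print(prefix_len, mismatches_in_prefix(offsets, prefix_len))
# 28 0
# 32 1
# 35 3
```

Under these assumptions, a tool that only scores the first 28 bases would see no mismatches for this read, while scoring all 35 bases gives three.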

            Comment


            • Hi Ben,

              Thanks for your prompt response.

              With -v 3, Bowtie also reports one mapping location for me.

              I want to use a seed length of 28 (the default) with 2 mismatches, so I used -n 2, since I am comparing eland_28 and Bowtie results.

              But why is Bowtie still not reporting the alignment?
              Last edited by seq_GA; 07-14-2009, 11:47 PM.

              Comment


              • Hi Ben,
                I did a quick comparison of -v 2 and -n 2.

                The reads are 35 bp long, and I used -3 6 to trim the 3' ends, so that my mappable reads would be 28 bases long and I could compare against the eland_28 results.

                Code:
                 bowtie -a -m 10 -v 2 --strata --best --solexa-quals -p 15 -3 6 ../../Genome/hg18/hg18 ../s_1_sequence.txt out_aln.txt

                The number of uniquely mapped tags is higher with -v 2 than with -n 2.

                Can you please explain why there are more mappings with -v 2?

                Thanks.

                Comment


                • Originally posted by seq_GA View Post
                  With -v 3, Bowtie also reports one mapping location for me.

                  I want to use a seed length of 28 (the default) with 2 mismatches, so I used -n 2, since I am comparing eland_28 and Bowtie results.

                  But why is Bowtie still not reporting the alignment?
                  Probably because the -e limit is disqualifying that alignment. If you'd like Bowtie to report alignments like that, try setting a higher -e than the default (70). -e is described in the Maq-like Policy section of the manual.
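As a rough illustration of how the -e limit could rule out the example read, the Python sketch below sums the quality values at the mismatched positions and compares the total with the default of 70. It assumes Phred+33 quality characters (so 'I' is 40) and Maq-style rounding to the nearest 10 with a cap of 30; both are assumptions about Bowtie's behaviour rather than something stated in this thread, and all function names are hypothetical.

```python
def phred33(qual_char: str) -> int:
    """Decode one quality character, assuming Phred+33 encoding."""
    return ord(qual_char) - 33


def maq_round(quality: int) -> int:
    """Round to the nearest 10 and cap at 30 (assumed Maq-like rounding)."""
    return min(30, ((quality + 5) // 10) * 10)


def quality_sum(quals: str, mismatch_offsets, round_quals: bool = True) -> int:
    """Sum the quality values at the mismatched read positions."""
    total = 0
    for offset in mismatch_offsets:
        quality = phred33(quals[offset])
        total += maq_round(quality) if round_quals else quality
    return total


quals = "I" * 35          # qualities shown in the -v 3 output line
offsets = [31, 33, 34]    # mismatch offsets from the same line
e_limit = 70              # default -e, as stated above

total = quality_sum(quals, offsets)
print(total, "exceeds limit" if total > e_limit else "within limit")
```

With three mismatches at uniformly high quality, the sum is above 70 whether or not the rounding is applied, which is in line with the alignment being dropped in -n mode.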

                  Ben

                  Comment


                  • Originally posted by seq_GA View Post
                    Can you please explain why there are more mappings with -v 2?
                    Probably the -e limit again. See my previous post.

                    Ben

                    Comment


                    • Originally posted by Ben Langmead View Post
                      Bowtie will index and align against references containing non-A/C/G/T characters, but alignments overlapping non-A/C/G/T characters in the reference are invalid and won't be reported.

                      Out of curiosity, what's the behavior you would like? E.g. if a C in a read were to align against a Y in the genome, would you like that to be considered a match, incurring no penalty against the alignment?

                      Thanks,
                      Ben
                      The reason our group would like this functionality is that we are investigating DNA methylation analysis via Illumina bisulfite sequencing. In this case, C nucleotides in the normal genome will be either C or T nucleotides in the bisulfite-converted genome.

                      So our preferred behavior would be to penalise neither the C nor the T if the reference contained a Y at this position.
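The requested behavior can be sketched in Python as follows. This describes the matching rule being asked for, not what Bowtie does (as noted above, Bowtie does not report alignments overlapping non-A/C/G/T reference characters). The table holds the standard IUPAC nucleotide codes; `base_matches` and `count_mismatches_iupac` are hypothetical names.

```python
# Standard IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def base_matches(read_base: str, ref_code: str) -> bool:
    """True if the read base is one of the bases allowed by the reference code."""
    return read_base.upper() in IUPAC.get(ref_code.upper(), "")


def count_mismatches_iupac(read: str, ref: str) -> int:
    """Count positions where the read base is not allowed by the reference code."""
    if len(read) != len(ref):
        raise ValueError("read and reference window must have the same length")
    return sum(1 for r, c in zip(read, ref) if not base_matches(r, c))


print(base_matches("C", "Y"), base_matches("T", "Y"), base_matches("A", "Y"))
print(count_mismatches_iupac("ACTT", "AYYG"))
```

In this sketch an N in the read still matches nothing, so it stays a mismatch even against an ambiguous reference position.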

                      Anyway I find bowtie very useful, thanks for all your work!

                      Comment


                      • Hi Chuck,

                        Originally posted by chuck View Post
                        I tried bowtie remade with extraflags but it just did the same thing. Would there be a log file somewhere or something in the map file? I can't seem to find any additional output.
                        If you have a moment, could you try your run again using the latest version of Bowtie (0.10.1, released on Monday)?

                        Thanks,
                        Ben

                        Comment


                        • Originally posted by Ben Langmead View Post
                          Probably the -e limit again. See my previous post.

                          Ben
                          Hi Ben,

                          I am trying to get as many mappings as eland reports, and have been experimenting with Bowtie's parameters.
                          As you suggested earlier, I tried raising -e as high as 2000 to bring the number of mappings up to eland's, but Bowtie still misses a lot of mappings compared to eland.

                          The -v option gives results comparable to eland (I tested with read length 28, which is also the seed length), but with the increasing number of Ns at the 3' end, it would be better to use the -n option and allow any number of mismatches beyond the seed length.

                          So, do you have any suggestions for increasing Bowtie's mapping rate with the -n option?

                          Thanks.

                          Comment


                          • Originally posted by seq_GA View Post
                            So, do you have any suggestions for increasing Bowtie's mapping rate with the -n option?
                            The main options used to adjust the sensitivity of mapping in Maq-like alignment mode are -n, -l, -e, --maxbts/-y. If there is a particular alignment you think Bowtie should be finding but isn't, please let me know and I can take a look.
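For experimenting with these options, a small Python helper that assembles the command line can make it easier to vary one setting at a time. This is only a sketch: `build_bowtie_cmd` is a hypothetical helper, the flag names are the ones listed above, the defaults of 2, 28 and 70 are the values mentioned in this thread, and the -e value of 200 in the usage line is arbitrary.

```python
import shlex


def build_bowtie_cmd(index, read, seed_mms=2, seed_len=28,
                     max_qual_sum=70, max_backtracks=None):
    """Assemble a bowtie command line for one read given with -c.

    seed_mms -> -n, seed_len -> -l, max_qual_sum -> -e, max_backtracks -> --maxbts.
    """
    cmd = ["bowtie", "-n", str(seed_mms), "-l", str(seed_len), "-e", str(max_qual_sum)]
    if max_backtracks is not None:
        cmd += ["--maxbts", str(max_backtracks)]
    cmd += [index, "-c", read]
    return cmd


# Index path and read are the ones used earlier in the thread.
cmd = build_bowtie_cmd(
    "../Genome/hg18/hg18",
    "AGTCTGTTTATGTTCAGCACAATTTTTTTTTTTTG",
    max_qual_sum=200,
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run` as is; printing it first makes it easy to confirm which settings a given run used.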

                            Thanks,
                            Ben

                            Comment


                            • Hi Ben,

                              I've been teaching and not working on the data lately. I will give it a try soon.

                              I have a question for you about assembly quality evaluation, in two contexts.

                              1) to simply evaluate the quality of the assembly of the short reads against the reference sequences, beyond simple coverage
                              2) when there are actual differences between the sequenced genome and the reference genome, in finding indels and whatnot

                              I am looking at AMOS, which seems to be one of the few tools that provide some kind of quality score for the assembly. Are you aware of others?

                              I am trying to quickly narrow my analysis down to those de novo contigs with good assembly scores. I proposed a simple metric in a manuscript and the reviewer suggested I use other 'standard' measures but gave no pointers as to which ones I should be using. Things are changing so fast it is hard to keep track of the 'standard'...

                              Thanks,
                              Chuck

                              Comment


                              • Ben,

                                I used the latest version 0.10.1 and it still hangs. It seems to complete the job (or almost, I haven't verified that fact yet) and stops writing to the output file but then it never closes.

                                Do you want me to run the debug version or try the extra flags again?

                                Chuck

                                Comment
