  • @jweger1988: I second @HESmith's suggestion.

    Create a new thread with any errors or problems you have encountered with annotation programs.

    Much as we love the BBMap suite, there will always be tasks that need a different program because the functionality is not available in BBMap.

    • @HESmith, you weren't being a jerk at all. It's a fair point.

      I've created a new thread. http://seqanswers.com/forums/showthr...975#post207975. Thanks for any help you can provide.

      • Hi Brian,

        I've been using kmercountexact, and it's been very useful for getting the kmers with their counts.

        I'm wondering if any of your tools can give me a list of all of the kmers present at a given length, regardless of whether they are unique. Basically, a list that would also include the redundant kmers, without counts.

        Thanks in advance for your help.

        • For K=3 and the input file:

          Code:
          >
          AAAAA
          You would want an output file:

          Code:
          >
          AAA
          >
          AAA
          >
          AAA
          Is that correct? I don't have anything that will do that; sorry. What did you want to use it for?
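
The K=3 example above can be sketched in a few lines of Python. This is a minimal illustration, not a BBTools feature: the function names `list_kmers`, `read_fasta`, and `kmers_as_fasta` are hypothetical, and the sketch assumes plain FASTA input with no special handling of degenerate bases or reverse complements.

```python
def list_kmers(sequence, k):
    """Return every k-mer in the sequence, in order, keeping duplicates."""
    if k <= 0:
        raise ValueError("k must be positive")
    # A sequence shorter than k yields an empty list.
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def kmers_as_fasta(lines, k):
    """Return FASTA lines with one record per k-mer occurrence (no counts)."""
    out = []
    for _header, seq in read_fasta(lines):
        for kmer in list_kmers(seq, k):
            out.append(">")   # empty header, as in the example above
            out.append(kmer)
    return out


if __name__ == "__main__":
    print("\n".join(kmers_as_fasta([">", "AAAAA"], 3)))
```

For the input above this produces three `AAA` records, matching the expected output file. Each record is treated independently, so k-mers never span two sequences.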

          • Thanks for the reply. That is correct.

            I have a virus into which I introduced some degenerate nucleotides to track bottlenecks.

            I suppose I could just reformat to the area of the read I'm interested in and then just convert to fasta and use that.

            • Hi - I love that bbmap and its tools can directly make bam files. I noticed that it's using samtools with 8 threads. Is there a way to increase the number of threads?

              Thanks

              • Oh, yep, for some reason I capped it at 8 threads. I wonder why? I'll eliminate that cap in the next release, which will probably be sometime today.

                • Originally posted by Brian Bushnell
                  Oh, yep, for some reason I capped it at 8 threads. I wonder why? I'll eliminate that cap in the next release, which will probably be sometime today.

                  How about tying the number to the number of threads specified for BBMap? That way we know that many threads are available.
                  Last edited by GenoMax; 08-02-2017, 10:00 AM.

                  • Originally posted by GenoMax
                    How about tying the number to the number of threads specified for BBMap? That way we know that many threads are available.

                    It is tied to the number of threads defined for BBMap. For some reason I had capped it at a maximum of 8 even if the main process was allowed to use more, probably to conserve memory. I've increased the cap to 64.

                    • Originally posted by Brian Bushnell
                      It is tied to the number of threads defined for BBMap, just for some reason I capped it at a max of 8 even if the main process was allowed to use more; probably to conserve memory. I've increased it to a max of 64.

                      Thanks, that helps a lot!

                      • bbmap fast macro?

                        Hi Brian,
                        I have a lot of reference sequences I'm mapping to (~11 million) and want to eke out as much speed as possible.

                        I'm mostly looking for close matches; e.g., I set minid to 0.97. Will setting fast still find matches like that? Any other thoughts on what I can set to get more speed?

                        Thanks a bunch!

                        • Originally posted by darthsequencer
                          Hi Brian,
                          I have a lot of reference sequences I'm mapping to (~11 million)

                          How long are the query sequences?

                          • To maximize speed when you are not looking for low-identity matches, "fast" (plus your identity threshold) is generally adequate. You can also speed it up by reducing "maxindel" (fast sets it to 80). Quality-trimming and adapter-trimming generally increase alignment speed.

                            With a large reference you may be able to increase speed with "k=14" instead of the default "k=13". This increases the time to load the reference and the memory usage, but increases mapping speed, so whether the whole process becomes faster or slower depends on how long it takes to load the reference compared to how much data you have to map. Turning off mate rescue (rescue=f) or reducing rescuedist (fast defaults to rescuedist=800) can also increase the speed slightly. Note that all of these options reduce sensitivity (aside from trimming, which increases it), but at 97% identity you only need very low sensitivity anyway.
                            Last edited by Brian Bushnell; 08-09-2017, 11:01 AM.
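
As a rough illustration of how the options above combine, here is a Python sketch that only assembles a bbmap.sh argument list; it does not run anything. The flags fast, minid, maxindel, k, rescue, and rescuedist are the ones named in the posts above, while the helper name `build_bbmap_command`, the file names, and the value maxindel=20 are hypothetical placeholders chosen for the example, not recommended settings.

```python
def build_bbmap_command(ref, reads, out, minid=0.97, fast=True,
                        k=None, maxindel=None, rescue=True, rescuedist=None):
    """Assemble (but do not run) a bbmap.sh command as a list of arguments."""
    cmd = ["bbmap.sh", f"ref={ref}", f"in={reads}", f"out={out}",
           f"minid={minid}"]
    if fast:
        cmd.append("fast=t")               # speed-oriented macro
    if k is not None:
        cmd.append(f"k={k}")               # e.g. 14 instead of the default 13
    if maxindel is not None:
        cmd.append(f"maxindel={maxindel}")  # lower values map faster
    if not rescue:
        cmd.append("rescue=f")             # turn mate rescue off entirely
    elif rescuedist is not None:
        cmd.append(f"rescuedist={rescuedist}")
    return cmd


if __name__ == "__main__":
    # File names and the maxindel value are placeholders for illustration.
    cmd = build_bbmap_command("refs.fa", "reads.fq", "mapped.sam",
                              k=14, maxindel=20, rescue=False)
    print(" ".join(cmd))
```

Whether k=14 pays off still depends on the trade-off described above: the index takes longer to load and uses more memory, so it mainly helps when there is a lot of data to map against the same reference.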

                            • Originally posted by GenoMax
                              How long are the query sequences?

                              They range from 50 bp single-end to 2 x 250 bp.

                              • Originally posted by Brian Bushnell
                                To maximize speed when you are not looking for low-identity matches, "fast" (plus your identity threshold) is generally adequate. You can also speed it up by reducing "maxindel" (fast sets it to 80). Quality-trimming and adapter-trimming generally increase alignment speed.

                                With a large reference you may be able to increase speed with "k=14" instead of the default "k=13". This increases the time to load the reference and the memory usage, but increases mapping speed, so whether the whole process becomes faster or slower depends on how long it takes to load the reference compared to how much data you have to map. Turning off mate rescue (rescue=f) or reducing rescuedist (fast defaults to rescuedist=800) can also increase the speed slightly. Note that all of these options reduce sensitivity (aside from trimming, which increases it), but at 97% identity you only need very low sensitivity anyway.

                                Thanks, that's helpful. On the subject of loading references: is there a way to use wildcards with the input and output of bbwrap?
