
  • Thank you for the quick advice. I had attempted to merge many samples together at the front end of the pipeline so that I could do all the QC and error correction at once. My problem was fixed when I did QC and error correction on each sample individually and then merged for a co-assembly.

    Thanks again.



    • Hi all,

      I was wondering why the default for spantiles is set to false. If a read, for instance, has coordinates (1000,1000) and dupedist is set to 2500 (see sketch attached), there's a possible overlap with 3 other tiles. So even on an instrument other than a NextSeq, a HiSeq 4000 for instance, where there are no tile-edge duplicates, there's still a possibility that optical duplicates end up on neighboring tiles (or even further). Can anyone shed some light on this?

      Thanks in advance!

      Attachment: The dot represents the "original read", the circle represents the distance of 2500 around the "original read". Rectangles represent tiles.
      Last edited by DCZ; 05-23-2019, 07:27 AM.
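The geometry in the question above can be checked with a small sketch. It is only an illustration: the tile dimensions, the assumption that coordinates start at (0,0) in one tile corner, and the function name `neighbor_tiles_in_reach` are all hypothetical and are not taken from Clumpify or from Illumina's documentation. It simply tests which of the eight surrounding tiles a circle of radius dupedist can touch.

```python
import math
from typing import Set, Tuple


def neighbor_tiles_in_reach(x: float, y: float, radius: float,
                            tile_w: float, tile_h: float) -> Set[Tuple[int, int]]:
    """Return the (dx, dy) offsets of neighboring tiles that a circle of
    the given radius around (x, y) can touch.

    Assumption (hypothetical): the current tile spans [0, tile_w] x [0, tile_h]
    and neighbors are laid out on a regular grid around it.
    """
    reached = set()
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue  # skip the tile the read is on
            # Bounds of the neighboring tile in the current tile's coordinates.
            x0, x1 = dx * tile_w, (dx + 1) * tile_w
            y0, y1 = dy * tile_h, (dy + 1) * tile_h
            # Closest point of that rectangle to the read.
            nearest_x = min(max(x, x0), x1)
            nearest_y = min(max(y, y0), y1)
            if math.hypot(x - nearest_x, y - nearest_y) <= radius:
                reached.add((dx, dy))
    return reached


# Values from the post: read at (1000,1000), dupedist 2500.
# The 20000 x 20000 tile size is a made-up placeholder.
print(neighbor_tiles_in_reach(1000, 1000, 2500, 20000, 20000))
```

Under these assumptions the circle touches three neighbors (the one to the side, the one above or below, and the diagonal one), which matches the sketch described in the post.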



      • Illumina's software pre-processing takes care of clusters that may be showing mixed signals, etc., so they may never pass that step. spantiles=t is mainly for NextSeq, where the clusters are huge (relatively) and as a result there is a chance they will cross tiles. I believe this was based on empirical observations Brian made when he was developing Clumpify.



        • Thanks for your reply. I'm still confused though. Just like there can be empty wells on the same tile, there can also be empty wells on neighboring tiles (correct me if I'm wrong). I suppose these wells would not show a mixed signal but would just get filled with a duplicate, in the same way that optical duplicates form on the same tile.



          • Hi, I've been using Clumpify for some time now. Thanks!
            Seem to have encountered a strange and unexpected result.
            pigz -dc test.fna.gz | grep "^>" | wc -l #4149
            ~/bbmap/clumpify.sh in=test.fna.gz out=test_dd.fna.gz dedupe subs=0
            #Version 38.51
            #Read Estimate: 352386
            ...
            #Reads In: 2
            #Clumps Formed: 2
            #Duplicates Found: 0
            #Reads Out: 2
            ...
            pigz -dc test_dd.fna.gz | grep "^>" | wc -l #2

            Any idea what might have happened?



            • Looks like everything went fine after I 'unwrapped' the input fasta.
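For anyone hitting the same thing, here is a minimal sketch of one way to "unwrap" a FASTA file so that each record has its sequence on a single line. The function name `unwrap_fasta` is my own and this is not the method the poster used (they did not say); it is just plain Python over the lines of the file.

```python
from typing import Iterable, Iterator


def unwrap_fasta(lines: Iterable[str]) -> Iterator[str]:
    """Yield header and sequence lines with each sequence joined onto one line.

    Blank lines are dropped; any text before the first '>' header is ignored.
    """
    header = None
    seq_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                # Emit the previous record before starting a new one.
                yield header
                yield "".join(seq_parts)
            header = line
            seq_parts = []
        elif header is not None:
            seq_parts.append(line.strip())
    if header is not None:
        # Emit the final record.
        yield header
        yield "".join(seq_parts)


wrapped = [">read1\n", "ACGT\n", "ACGT\n", ">read2\n", "GG\n", "TT\n"]
for out_line in unwrap_fasta(wrapped):
    print(out_line)
```

For a gzipped file like the one in the post, the same function can be fed from `gzip.open("test.fna.gz", "rt")` and the output written to a new file before running clumpify.sh on it.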



              • Is there any method available to run Clumpify directly from within another program, such as a library that could be imported? I saw that the main Clumpify program is written in Java; however, I am not a Java programmer. I'm not sure what other options there might be if I want my own custom program, which outputs fastq data, to pass its output directly to Clumpify, especially considering the handling of paired-end files.
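One option that needs no Java is to write the reads to files and call clumpify.sh as an external process. The sketch below assumes clumpify.sh is on the PATH and that paired files are passed with `in2=` and `out2=`, following the usual BBTools key=value convention; please check the script's own usage text before relying on that. The helper names `build_clumpify_command` and `run_clumpify` are hypothetical, and `dedupe` and `subs=0` are simply the options used earlier in this thread.

```python
import subprocess
from typing import List, Optional, Sequence


def build_clumpify_command(in1: str, out1: str,
                           in2: Optional[str] = None,
                           out2: Optional[str] = None,
                           extra: Sequence[str] = (),
                           script: str = "clumpify.sh") -> List[str]:
    """Assemble the argument list for a clumpify.sh call.

    Assumption: in2=/out2= are accepted for the second file of a pair.
    """
    cmd = [script, f"in={in1}", f"out={out1}"]
    if in2 is not None:
        cmd.append(f"in2={in2}")
    if out2 is not None:
        cmd.append(f"out2={out2}")
    cmd.extend(extra)
    return cmd


def run_clumpify(*args, **kwargs) -> subprocess.CompletedProcess:
    """Run clumpify.sh and raise CalledProcessError if it exits non-zero."""
    cmd = build_clumpify_command(*args, **kwargs)
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Show the command that would be run for a paired-end data set.
print(" ".join(build_clumpify_command(
    "reads_1.fq.gz", "clumped_1.fq.gz",
    in2="reads_2.fq.gz", out2="clumped_2.fq.gz",
    extra=["dedupe", "subs=0"])))
```

Going through files rather than a pipe avoids having to know whether Clumpify can read a stream, which I have not verified.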
