  • bioinf
    Member
    • Nov 2010
    • 25

    #16
    Can someone please explain why we need the HashMap, which stores the id of the first read in which each k-mer was encountered? Is it not sufficient to walk the graph and write down the k-mers to rebuild the original sequence? What else is this HashMap used for?
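The "walk the graph and write down the k-mers" idea in this question can be sketched as follows. This is only a minimal illustration, assuming a single unbranched path in which consecutive k-mers overlap by k-1 bases; the helper names `kmers` and `spell_path` are made up for this example and are not taken from Velvet.

```python
def kmers(seq, k):
    """Return the k-mers of seq in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def spell_path(path):
    """Rebuild a sequence from consecutive k-mers that overlap by k-1 bases."""
    seq = path[0]
    for prev, cur in zip(path, path[1:]):
        if prev[1:] != cur[:-1]:
            raise ValueError(f"{prev} and {cur} do not overlap by k-1 bases")
        # Each step along the path contributes exactly one new base.
        seq += cur[-1]
    return seq


original = "ATGGCGTGCA"          # toy sequence, assumed for illustration
path = kmers(original, 4)        # the walk through the graph, in order
rebuilt = spell_path(path)
```

The walk itself needs no read information once the order of the k-mers is known; the open question in the thread is how that order and the links between reads are established in the first place.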

    Comment

    • Zigster
      Jeremy Leipzig
      • May 2009
      • 117

      #17
      I don't know where you got the word "HashMap" from - I think that is Java. Any association between reads and their kmers is for the purposes of paired-end resolution and read usage statistics.

      Are you going to put this presentation up somewhere?
      --
      Jeremy Leipzig
      Bioinformatics Programmer
      --
      My blog
      Twitter

      Comment

      • bioinf
        Member
        • Nov 2010
        • 25

        #18
        I'm no biologist, I'm a programmer. A hash map is not tied to any specific language (Java, C++, etc.); it is a data structure that gives O(1) constant-time access to an element on average. The article describes keeping the first occurrence of each k-mer in the hash map. What I don't get is why we would need this information for a traceback. I can assemble the sequence by just following the arcs and writing down the k-mers. Why would I need information about the reads represented by those k-mers after the graph has already been constructed? Does it mean that the hash map is needed only for the construction itself? (Question to all who might know.)
        Zigster wrote: "Are you going to put this presentation up somewhere?"
        Of course. This is my seminar presentation at the university.
        Zigster wrote: "Any association between reads and their kmers is for the purposes of paired-end resolution and read usage statistics."
        It can't be used for usage statistics, since the hash map contains only the first read in which a given k-mer is found. Several reads may share the same k-mer, but we only have the location of one such read at our disposal.

        Intuitively, I think it is done to link up all the reads that share a k-mer. The read set is processed one read at a time, and each k-mer is added to the hash map together with the id of the first read where it was found. Any subsequent attempts by other reads to store the same k-mer are rejected. Afterwards, when all the information is stored, we walk through all the reads again. Each time a k-mer of some read is retrieved, it is looked up in the hash map, where we find the id of the read in which it first occurred, so we can link these reads. The same is done for every following k-mer, which gives a one-to-many correspondence. That is what I assume from the paper, since it is not stated clearly there, but I can't present my assumptions on the slides.
        Last edited by bioinf; 01-06-2011, 10:57 AM.
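The two-pass scheme guessed at in the post above can be written down as a small sketch. It illustrates the poster's interpretation only and is not Velvet's actual implementation: the function names are hypothetical, reverse complements are ignored, and the reads are toy data.

```python
def kmers(seq, k):
    """Return the k-mers of seq in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def index_first_occurrence(reads, k):
    """Map each k-mer to (read id, offset) of the first read containing it."""
    first_seen = {}
    for read_id, read in enumerate(reads):
        for offset, kmer in enumerate(kmers(read, k)):
            # setdefault keeps the earlier entry, so later reads are "denied".
            first_seen.setdefault(kmer, (read_id, offset))
    return first_seen


def link_reads(reads, k):
    """Second pass: link each read to the earlier reads that first held its k-mers."""
    first_seen = index_first_occurrence(reads, k)
    links = {read_id: set() for read_id in range(len(reads))}
    for read_id, read in enumerate(reads):
        for kmer in kmers(read, k):
            owner, _ = first_seen[kmer]
            if owner != read_id:
                links[read_id].add(owner)
    return links


reads = ["ATGGCGT", "GGCGTGC", "CGTGCAA"]   # toy reads, assumed for illustration
index = index_first_occurrence(reads, 4)
links = link_reads(reads, 4)
```

With these toy reads, the second read links back to the first and the third links back to the second, so a single stored read id per k-mer is enough to chain overlapping reads together without keeping every occurrence.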

        Comment

        • bioinf
          Member
          • Nov 2010
          • 25

          #19
          Going back to the biological details: could you please explain how repeats in the DNA lead to gaps between contigs? Yes, they are overlapped although they shouldn't be, but how does that lead to "gaps"? Since Velvet cuts all tips longer than 2k, whenever a repeat followed by a long stretch of sequence overlaps a k-mer that was found earlier, such a "tip" will be discarded.
          Last edited by bioinf; 01-08-2011, 11:31 AM.
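For readers unsure what a "tip" is here: in a de Bruijn graph it is a short dead-end branch hanging off the main path, typically produced by a sequencing error near the end of a read. Note that the Velvet paper describes removing tips that are shorter than 2k (together with a coverage-based criterion), not longer. The sketch below is a rough illustration, not Velvet's code; `build_graph`, `find_tips` and the `max_len` cutoff (counted in k-mers) are hypothetical names chosen for this example, and the coverage criterion is not modelled.

```python
from collections import defaultdict


def kmers(seq, k):
    """Return the k-mers of seq in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_graph(reads, k):
    """Return successor and predecessor sets for every k-mer seen in the reads."""
    succ, pred = defaultdict(set), defaultdict(set)
    for read in reads:
        path = kmers(read, k)
        for a, b in zip(path, path[1:]):
            succ[a].add(b)
            pred[b].add(a)
    return succ, pred


def find_tips(succ, pred, max_len):
    """Return dead-end chains of at most max_len k-mers that hang off a branch point."""
    tips = []
    dead_ends = [node for node in list(pred) if not succ.get(node)]
    for node in sorted(dead_ends):
        chain = [node]
        # Walk backwards while the chain is short and strictly linear.
        while len(chain) <= max_len and len(pred.get(chain[-1], ())) == 1:
            parent = next(iter(pred[chain[-1]]))
            if len(succ[parent]) > 1:
                # The parent also continues elsewhere, so this chain is a tip.
                tips.append(chain[::-1])
                break
            chain.append(parent)
    return tips


# Toy reads: the third one carries an error in its last base (G read as A).
reads = ["ATGGCGTG", "GGCGTGCAAT", "TGGCGTA"]
succ, pred = build_graph(reads, 4)
tips = find_tips(succ, pred, max_len=2)
```

In this toy case only the erroneous k-mer `CGTA` is reported. The genuine end of the sequence is also a dead end, but its chain back to the branch point is longer than the cutoff, so it is kept.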

          Comment

          • parit
            Junior Member
            • Jan 2011
            • 2

            #20
            @bioinf: I am not sure I fully get your question, but here are my two cents. If there is a repeat, then either a node will be reported with a coverage higher than the expected coverage, or there will be a loop. In the latter case the assembler, while making contigs, doesn't know the frequency of the repeat and hence cannot connect the contigs to the right and left of the repeat, so it reports them as 2 different contigs with a gap in between...
            As far as the tips are concerned, I couldn't connect "tips" with "repeats", as I thought tips occur when there is a sequencing error at the end of a read. That has nothing to do with repeats.
            Please do correct me if I am wrong, as I am also trying to understand the logic of Velvet.
            Can you also post your presentation or email it to me?

            - Parit
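The effect of a repeat described above can be seen on a toy example. The sketch below is only an illustration under simplifying assumptions: the "genome" string is invented, the graph is built directly from it rather than from reads, reverse complements are ignored, and `kmer_graph` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return the k-mers of seq in order of occurrence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def kmer_graph(seq, k):
    """Return k-mer multiplicities and the successor set of each k-mer."""
    path = kmers(seq, k)
    counts = Counter(path)
    successors = defaultdict(set)
    for a, b in zip(path, path[1:]):
        successors[a].add(b)
    return counts, successors


# The 7-base stretch GATTACA occurs twice, with different flanking sequence.
genome = "ACC" + "GATTACA" + "TGG" + "GATTACA" + "CTT"
counts, successors = kmer_graph(genome, 4)

# k-mers inside the repeat are seen twice, i.e. at higher than expected coverage.
repeated = sorted(kmer for kmer, n in counts.items() if n > 1)

# The last k-mer of the repeat has two possible continuations.
exits = successors["TACA"]
```

The k-mers inside the repeat collapse into shared nodes with doubled multiplicity, and the path through the graph returns to them, forming a loop. At the end of the repeat there are two exits, and nothing in the graph alone says which one to take first, so a cautious assembler stops the contig there.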

            Comment

            • Zigster
              Jeremy Leipzig
              • May 2009
              • 117

              #21
              yes please post it

              Comment

              • boetsie
                Senior Member
                • Feb 2010
                • 245

                #22
                For repeats, you can have a look at his dissertation


                See Chapter 4. Hope this makes it more clear.

                Boetsie

                Comment

                • Zigster
                  Jeremy Leipzig
                  • May 2009
                  • 117

                  #23
                  Is this presentation available?

                  Comment

                  • parit
                    Junior Member
                    • Jan 2011
                    • 2

                    #24
                    The dude seems to have vanished :O Hope the presentation went fine.

                    Comment

                    • Jenzo
                      Member
                      • Mar 2011
                      • 31

                      #25
                      Hey guys,
                      was anyone able to compile Velvet 1.1.04, released yesterday by D. Zerbino?

                      Code:
                      src/readSet.c:34: fatal error: zlib.h: File or directory not found
                      compilation terminated.
                      Hope someone has an idea, thanks a lot!

                      Edit: Problem is solved, thanks a lot!
                      Last edited by Jenzo; 05-20-2011, 12:40 AM. Reason: Problem solved

                      Comment

                      • nilshomer
                        Nils Homer
                        • Nov 2008
                        • 1283

                        #26
                        Originally posted by Jenzo:
                        Hey guys,
                        was anyone able to compile Velvet 1.1.04, released yesterday by D. Zerbino?

                        Code:
                        src/readSet.c:34: fatal error: zlib.h: File or directory not found
                        compilation terminated.
                        Hope someone has an idea, thanks a lot!

                        Edit: Problem is solved, thanks a lot!
                        So what was the solution?

                        Comment

                        • Thorondor
                          Member
                          • Feb 2011
                          • 69

                          #27
                          You can copy the *.o files in third-party/zlib-1.2.3 from an older Velvet version. I am pretty sure they have not changed.

                          Comment

                          • dp05yk
                            Member
                            • Dec 2010
                            • 66

                            #28
                            Originally posted by nilshomer:
                            So what was the solution?
                            I'm going to hazard a guess that they had to either install zlib or modify the makefile to link up correctly.

                            Comment

                            • Jenzo
                              Member
                              • Mar 2011
                              • 31

                              #29
                              Daniel Zerbino wrote today:
                              Dear all,

                              my sincere apologies for the compilation bug which was lying in the
                              recently updated code. I have just updated the repositories. Thanks to
                              Sylvain Forêt for quickly correcting it.
                              [...]
                              Regards,

                              Daniel

                              Comment

                              • Thorondor
                                Member
                                • Feb 2011
                                • 69

                                #30
                                Yup Jenzo, I also got this email, but the Oases compilation bug "src/readSet.c:34: fatal error: zlib.h: File or directory not found compilation terminated." is still there. ;-)

                                Comment
