
  • #61
    Originally posted by usad View Post
    Did you do random trimming, or did you trim them down to the region with the highest information gain (which is what we do)?

    I think it had large genomes in mind. It is really good in RAM consumption and quite ok in thread usage and thus speed. Maybe CLC4 brings some scaffold capabilities?

    Cheers,
    björn
    Hi Björn,
    I totally agree with you that CLC is good at CPU and RAM control. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

    BTW, could you please tell me what the region with the highest information gain is? Right now I just trim the reads randomly, but I am thinking of using a sliding window to find a region with fewer homopolymers (for example, one where every homopolymer run is shorter than 4).
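The sliding-window idea above could be sketched roughly like this in Python. This is only a minimal illustration, not anything CLC does: the function names `longest_homopolymer` and `best_window` and the `max_run` parameter are made up for this example, and the window length is whatever pseudo-read length you want to trim down to.

```python
from itertools import groupby


def longest_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base in seq (0 if empty)."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def best_window(read: str, window: int, max_run: int = 3):
    """Return (start, subsequence) for the first window of the given length
    whose longest homopolymer run is <= max_run, or None if none qualifies.

    max_run=3 corresponds to "homopolymer length < 4" (an assumed threshold).
    """
    for start in range(len(read) - window + 1):
        sub = read[start:start + window]
        if longest_homopolymer(sub) <= max_run:
            return start, sub
    return None
```

Returning the first qualifying window keeps the sketch short; scoring every window and picking the one with the shortest runs would be an obvious variation.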



    • #62
      Hi

      Just the most unambiguous region. So if your whole 454 read aligns to one region and one region only, search for a short region in the read that would also allow placing it at this position only, and not on another contig, and use this as a pseudo-Illumina read.
      But it probably doesn't help too much; it will just give you a few more links. (Maybe simulate first what you can expect, based on your N50, average length or length distribution, and the linker length, quality and number.)
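A rough sketch of the "most unambiguous region" search described above, assuming exact forward-strand matching against the assembled contigs. A real pipeline would more likely use an aligner and also check the reverse complement. The names `count_hits` and `unique_windows` are hypothetical and exist only for this illustration.

```python
def count_hits(kmer: str, contigs: dict[str, str]) -> int:
    """Count exact occurrences of kmer across all contigs, overlaps included."""
    total = 0
    for seq in contigs.values():
        pos = seq.find(kmer)
        while pos != -1:
            total += 1
            pos = seq.find(kmer, pos + 1)
    return total


def unique_windows(read: str, contigs: dict[str, str], k: int):
    """Yield (start, kmer) for every length-k window of the read that occurs
    exactly once in the whole assembly (forward strand only, an assumption)."""
    for start in range(len(read) - k + 1):
        kmer = read[start:start + k]
        if count_hits(kmer, contigs) == 1:
            yield start, kmer
```

Any window this yields could serve as the pseudo-Illumina read, since it can be placed at one position only. The linear scan is fine for a toy example but would need an index for a real assembly.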

      Cheers,
      Björn
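For the "simulate first" suggestion, here is a very simplified Monte Carlo sketch. It assumes the contigs tile the genome end to end with no gaps, that pairs are sampled uniformly, and that every pair can be placed; linker quality is ignored entirely. The function name `fraction_linking_pairs` and its parameters are invented for this example.

```python
import bisect
import random


def fraction_linking_pairs(contig_lengths, insert_size, n_pairs=100_000, seed=0):
    """Estimate the fraction of read pairs whose two ends fall on different
    contigs, assuming contigs are laid end to end without gaps."""
    # Cumulative end coordinate of each contig on the concatenated genome.
    ends = []
    total = 0
    for length in contig_lengths:
        total += length
        ends.append(total)
    if insert_size >= total:
        raise ValueError("insert size must be shorter than the genome")

    rng = random.Random(seed)
    linking = 0
    for _ in range(n_pairs):
        left = rng.randrange(total - insert_size)  # coordinate of first end
        right = left + insert_size                 # coordinate of second end
        # bisect_right maps a coordinate to the index of the contig holding it.
        if bisect.bisect_right(ends, left) != bisect.bisect_right(ends, right):
            linking += 1
    return linking / n_pairs
```

Feeding in your real contig length distribution and linker length gives a quick upper bound on how many pairs could link contigs at all, before spending time on trimming.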



      • #63
        Originally posted by shaohua.fan View Post
        Hi Björn,
        I totally agree with you that CLC is good at CPU and RAM control. But I am just wondering why it doesn't support scaffolding, which is important for genome assembly.

        BTW, could you please tell me what the region with the highest information gain is? Right now I just trim the reads randomly, but I am thinking of using a sliding window to find a region with fewer homopolymers (for example, one where every homopolymer run is shorter than 4).
        Just so everyone who is using the Genomics Workbench knows: there have been numerous updates to the functionality, many of which are available as plug-in downloads. For the de novo assembler there are two significant changes. 1 - The assembler now supports scaffolding of paired-end reads. 2 - You can now change the k-mer value as one of your parameters.

        Happy Sequencing :-)
        Naomi



        • #64
          I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.



          • #65
            Originally posted by lmilne View Post
            I am currently assembling about 215 gigabases of sequence data with clc_novo_assemble. Should I expect clc_novo_assemble to print its progress while it is running? Its last output of progress (80%) was two days ago.
            What does CLC support tell you? They are usually very helpful and responsive.



            • #66
              CLC bio Support

              Please contact [email protected] whenever you have questions about the behavior of the CLC bio assemblers. You may also be interested in trying the new version of the assembler, if you have not already. The new algorithm uses the paired-end information for scaffolding.
