  • velvet N50

    Hi, I am working on a Velvet de novo assembly of Illumina reads. I first trimmed the raw Illumina reads by quality, but after running Velvet on a subset of the reads the N50 is always low, around 20 to 29. I tried various parameters and nothing improved the N50. Any suggestions?
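For readers following the numbers in this thread: N50 is the contig length at which the contigs of that length or longer hold at least half of the total assembled bases. A minimal sketch of that calculation, assuming you already have a plain list of contig lengths (the function name `n50` and the example lengths are illustrative, not taken from Velvet):

```python
def n50(lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running total past half is the N50.
        if running >= half:
            return length
    return 0


# Made-up contig lengths, only to show the call.
example_lengths = [100, 80, 50, 30, 20, 20]
print(n50(example_lengths))
```

An N50 below the read length, as reported here, means most of the assembled bases sit in contigs shorter than a single read.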

  • #2
    Important parameters for Velvet: k-mer length, exp_cov and cov_cutoff. You could play with these three to get a better N50.
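One way to "play with" these three parameters is to generate one velveth/velvetg command pair per k-mer length and compare the resulting N50 values. The sketch below only builds the command lines and does not run anything. It assumes single-end reads in FASTQ format, and the output directory names (`asm_k21`, ...) and the file name `reads.fastq` are hypothetical.

```python
def velvet_commands(reads, k_values, exp_cov="auto", cov_cutoff="auto"):
    """Build (velveth, velvetg) argument lists for each k-mer length."""
    commands = []
    for k in k_values:
        if k % 2 == 0:
            # Velvet works with odd k-mer lengths.
            raise ValueError(f"k-mer length must be odd, got {k}")
        out_dir = f"asm_k{k}"  # hypothetical directory naming scheme
        velveth = ["velveth", out_dir, str(k), "-short", "-fastq", reads]
        velvetg = ["velvetg", out_dir,
                   "-exp_cov", str(exp_cov),
                   "-cov_cutoff", str(cov_cutoff)]
        commands.append((velveth, velvetg))
    return commands


for velveth, velvetg in velvet_commands("reads.fastq", [21, 23, 25]):
    print(" ".join(velveth))
    print(" ".join(velvetg))
```

Each pair could then be passed to `subprocess.run` if Velvet is installed. Using `auto` for both coverage settings is just a starting point, and explicit values may work better for a given data set.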



    • #3
      What is your read length, and what is the estimated genome length? If the sequencing coverage is very high (>=200) and uneven, the N50 from Velvet is very low because of sequencing errors and SNPs, even if you use a subset of the reads and tune the important parameters (k-mer length, exp_cov and cov_cutoff).
      Last edited by gridbird; 02-09-2011, 11:25 AM.



      • #4
        As gridbird already stated, we need more information to help you. ;-)

        If you have high coverage, you can choose a high k-mer length and use cov_cutoff to remove low-coverage contigs, which are normally small.



        • #5
          We have this problem too: we had an excellent run for our samples (chloroplast genome), but assembly with Velvet gave an N50 lower than the read length (36). We played with all the Velvet parameters, but the maximum N50 was 29. What could the problem be?



          • #6
            How many reads do you have? How high is your estimated coverage? Which k-mer lengths did you try? How many contigs do you get? Do you get any long contigs? What coverage is stated in the ids of the contigs?

            I really can't say where the problem might be from the amount of information you have given.

            Anyway, the best way to address Velvet problems is over the mailing list:


            Zerbino is also very active on this mailing list.



            • #7
              17 million single-end reads from one lane. Coverage near 200 (but there may be contamination from the nuclear genome). Read length 36. k-mer lengths from 23 to 31. Number of contigs from 300 to 2500. N50 from 12 to 21. We tried with 1 million and with 100 thousand reads, but the result was only a little better (N50 54). Maximum contig length is near 100 nucleotides. What are the "ids"?
              (454 data from the same material gave a chloroplast genome map.)
              Last edited by vtosha; 02-25-2011, 06:07 AM.



              • #8
                I mean the tag (id) of the contigs. They look like >NODE_length_xxxxx_cov_xxxxx.xxxxxx, so you can check what coverage Velvet assigns to the contigs.
                Did you also set the parameter -unused_reads yes, to check how many reads Velvet does not use? Do you do any quality trimming before using Velvet?

                A coverage near 200 should give you better results, so something seems to be wrong. :-/ Did you try another assembler?
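To check the coverage values without reading the headers by eye, the ids can be parsed with a short script. This is a rough sketch, assuming headers of the form shown above; the regular expression also accepts a node number after `NODE_`, which Velvet normally writes. The function names are illustrative. As far as I know the length in the header is counted in k-mers, so the helper converts it to bases with `length + k - 1`; treat that as an assumption to verify against the Velvet manual.

```python
import re

# Matches ">NODE_length_120_cov_998.5" and ">NODE_7_length_120_cov_998.5".
HEADER = re.compile(r"^>NODE_(?:(\d+)_)?length_(\d+)_cov_([0-9.]+)")


def parse_header(line):
    """Return node id, length and coverage from a contig header, or None."""
    match = HEADER.match(line.strip())
    if match is None:
        return None
    node, length, cov = match.groups()
    return {"node": node, "length": int(length), "cov": float(cov)}


def length_in_bases(length_in_kmers, k):
    """Convert a k-mer based length to bases (assumed Velvet convention)."""
    return length_in_kmers + k - 1


info = parse_header(">NODE_7_length_120_cov_998.5")
print(info, length_in_bases(info["length"], 31))
```

Running `parse_header` over every line of the contigs file that starts with `>` gives a quick table of length against coverage.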



                • #9
                  The coverage in the contig ids is near 1000.
                  We didn't set the -unused_reads parameter ourselves, but Velvet reports how many reads it uses: between 100 thousand and 1 million out of 17 million reads. When we assemble 1 million or 100 thousand reads, it uses 82 thousand out of 1 million and 1000 out of 100 thousand. No, we didn't trim the reads.
                  We tried Edena with no good results (longest contig 134 nucleotides, with no BLAST hit to anything).
                  There are no BLAST hits to adapters or primers.
                  Could the problem be too much PCR amplification?



                  • #10
                    Well, when Velvet uses only 1 million out of 17 million reads, there seems to be a quality issue with the reads. And when your contigs have a coverage around 1000, it looks like they come from repetitive sequences.



                    • #11
                      I think sequencing errors are causing this problem. Velvet does not give a good N50 when high coverage is combined with sequencing errors. Did you check the Velvet paper? For error-free reads, Velvet always gets a good N50 no matter how high the coverage is. But for real reads, the N50 drops as coverage increases because of sequencing errors. You can randomly subsample the reads to several coverage levels, such as 10, 50, 100, 150 and 200, assemble each subset with Velvet and pick the one that gives a good N50.
                      Did you try an error-correction program, such as SHREC or Quake, to correct sequencing errors before assembly? You can also use SolexaQA to trim low-quality reads before assembly.
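The random subsampling suggested above can be sketched as follows: keep each read with probability `target_cov / current_cov`. This assumes single-end reads that have already been grouped into records (for FASTQ, four lines per read) and that you have a rough estimate of the current coverage. The function name and the fixed seed are illustrative choices.

```python
import random


def subsample_reads(records, current_cov, target_cov, seed=0):
    """Keep each record with probability target_cov / current_cov."""
    if current_cov <= 0:
        raise ValueError("current_cov must be positive")
    keep_fraction = min(1.0, target_cov / current_cov)
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    return [rec for rec in records if rng.random() < keep_fraction]


# Stand-in for real reads: 10000 dummy record ids at an assumed 200x coverage.
reads = list(range(10000))
for target in (10, 50, 100, 150, 200):
    subset = subsample_reads(reads, current_cov=200, target_cov=target)
    print(target, len(subset))
```

Each subset would then be written back to FASTQ and assembled separately. For paired-end data the two mates of a pair would need to be kept or dropped together, which this sketch does not handle.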



                      • #12
                        velvet

                        Hi guys,
                        I have been working on a yeast strain. I have raw paired-end Illumina reads. Is there any way to find out the read length and coverage from the sequence data itself? I am trying to use Velvet to assemble the genome. However, I am new to Velvet and have some questions. What is the optimization criterion for the k-mer length? Secondly, how can the contigs obtained from velvetg be used to generate the full genome sequence of the organism? And how can genes then be predicted from that sequence?



                        • #13
                          velvet N50

                          You can get the read length by looking at your FASTQ files;
                          FastQC will also report the read length.

                          There is a script called velvetk that will calculate kmer coverage from your fastq files before you run velvet.

                          See


                          You may also find Velvet Optimiser and Velvet Advisor useful.
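As a back-of-the-envelope version of the read length and coverage questions above: read lengths can be taken from every second line of each four-line FASTQ record, nucleotide coverage is `reads * read_length / genome_size`, and the Velvet manual relates k-mer coverage to nucleotide coverage as `Ck = C * (L - k + 1) / L`. The sketch below is not the velvetk script. The genome size is something you have to estimate yourself, and the numbers in the example are made up.

```python
def fastq_read_lengths(lines):
    """Return sequence lengths, assuming strict 4-line FASTQ records."""
    return [len(line.strip()) for i, line in enumerate(lines) if i % 4 == 1]


def coverage(n_reads, read_len, genome_size):
    """Nucleotide coverage C = N * L / G."""
    return n_reads * read_len / genome_size


def kmer_coverage(cov, read_len, k):
    """k-mer coverage Ck = C * (L - k + 1) / L."""
    return cov * (read_len - k + 1) / read_len


# Tiny in-memory FASTQ example; a real file would be read line by line.
fastq_lines = ["@r1", "ACGTACGT", "+", "IIIIIIII",
               "@r2", "ACGTAC", "+", "IIIIII"]
print(fastq_read_lengths(fastq_lines))

c = coverage(n_reads=1000, read_len=36, genome_size=1800)  # made-up numbers
print(c, kmer_coverage(c, read_len=36, k=31))
```

The second formula shows why a k-mer length close to the read length can leave very little k-mer coverage even when the nucleotide coverage looks comfortable.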


