  • Assess an Illumina metagenome assembly

    I've just finished my first attempt at assembling a metagenome sample from a marine environment using Velvet, VelvetOptimiser and MetaVelvet. Running the assembly with the k-mer size recommended by VelvetOptimiser (31), I got very short contigs of about 100-400 bp. Increasing the k-mer size to 37 produced longer contigs, from 100 bp up to 3 kb. When I started analyzing the coverage and read counts for several contigs, I noticed that some short contigs (e.g. 90 bp) had about 10K reads and a coverage of 20X, while some of the longer contigs (1 kb or 2 kb) had fewer reads (about 200) and lower coverage. Are there steps I can take now to filter out contigs that might not be useful in the annotation step, or that could be chimeric? Before I try a different assembler, I'd like to understand how these results can be analyzed further before proceeding to annotation.

  • #2
    A 90 bp contig?
    Just filter out every contig that is (depending on your taste) either shorter than 2 × the read length or shorter than the expected average gene size (or simply shorter than 1000 bp).
    Contigs below either cutoff are unlikely to provide much information.
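
    For what it's worth, the length filter suggested above could be sketched roughly like this in plain Python. This is only an illustration: the function names (`read_fasta`, `filter_contigs`, `min_length_from_reads`) and the 100 bp read length in the example are assumptions made up for this sketch, not part of Velvet or MetaVelvet, and the cutoff is whichever of the two rules you prefer.

```python
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, str]  # (contig name, sequence)


def read_fasta(lines: Iterable[str]) -> Iterator[Record]:
    """Yield (name, sequence) pairs from the lines of a FASTA file."""
    header = None
    chunks: List[str] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def min_length_from_reads(read_length: int) -> int:
    """Cutoff from the '2 x read length' rule of thumb."""
    return 2 * read_length


def filter_contigs(records: Iterable[Record], min_length: int) -> List[Record]:
    """Keep only contigs whose sequence is at least min_length bases long."""
    return [(name, seq) for name, seq in records if len(seq) >= min_length]


if __name__ == "__main__":
    # Tiny in-memory example; with real data, pass an open file handle
    # for the assembler's contig FASTA instead of this list.
    example = [">short_contig", "ACGT" * 20, ">long_contig", "ACGT" * 300]
    cutoff = min_length_from_reads(100)  # assumed 100 bp reads
    for name, seq in filter_contigs(read_fasta(example), cutoff):
        print(name, len(seq))
```

    To use the fixed 1000 bp cutoff instead, pass `min_length=1000` to `filter_contigs` directly.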
