  • Metagenomics analysis steps

    Dear all,

    I am currently working on shotgun metagenomics data. I should say that I am very new to this kind of work.

    I have 100 bp paired-end (PE) reads from an Illumina platform.

    My aim is to perform functional annotation and to find bacteria that are differentially abundant between two groups.
    The steps I am following are listed below, with some questions attached to each step.
    1) Trimming with Trimmomatic, although I read in a review article that LUCY is good for trimming metagenomics data.
    Q: Can anyone suggest a metagenomics-specific trimming program other than Trimmomatic/PRINSEQ? In general, what average quality value is considered acceptable?
    2) Metagenome assembly
    a) De novo assemblers: MetaVelvet, Meta-IDBA, Ray, SOAP, Celera, EULER (to compare them all)
    Q: How can I check which assembler gives better contigs? Is there a tool that evaluates the output of an assembler?
    b) Reference-based assembly: MIRA, AMOS, Genometa
    Q: I have no idea how this works (on contigs, reads, or BLAST output?). Do we need to BLAST all the reads against all the bacterial references?
    Or do these tools have their own reference database? Or should it be done with Bowtie against the reference? Or does someone have a better suggestion?
    3) Binning:
    a) Sequence-similarity-based binning: MEGAN, IMG/M, MG-RAST
    Q: What is the difference between this and reference-based assembly?
    b) Composition-based binning: GroopM, CONCOCT, PhyloPythia
    c) Tools that use both methods: PhymmBL and MetaCluster
    Q: Which tool is better for binning host-associated data?
    4) Gene prediction tools: MetaGeneMark, Glimmer-MG, mORFind
    Q: Which tool is better?

    5) Q: How can we do functional annotation?

    Cheers,
    Last edited by naman; 10-01-2015, 04:04 AM.

  • #2
    Originally posted by naman
    1) Trimming with Trimmomatic, although I read in a review article that LUCY is good for trimming metagenomics data.
    Q: Can anyone suggest a metagenomics-specific trimming program other than Trimmomatic/PRINSEQ? In general, what average quality value is considered acceptable?
    Trimming programs are not specifically better for metagenomics or RNA-seq or anything else, the way assemblers are. I recommend BBDuk, which is fast and performs well.
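To make the "quality value" part of the question concrete, here is a minimal Python sketch of how Phred scores are decoded from a FASTQ quality string and how a naive right-end trim works. It is not how BBDuk or Trimmomatic are implemented; the function names, the Phred+33 offset, and the `min_q=20` threshold are illustrative assumptions, not recommendations from this thread.

```python
from typing import List, Tuple


def phred_scores(quality: str, offset: int = 33) -> List[int]:
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(char) - offset for char in quality]


def mean_quality(quality: str, offset: int = 33) -> float:
    """Average Phred score of a read; 0.0 for an empty quality string."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores) if scores else 0.0


def trim_right(seq: str, quality: str, min_q: int = 20,
               offset: int = 33) -> Tuple[str, str]:
    """Drop trailing bases whose Phred score is below min_q.

    min_q=20 is only an illustrative threshold (assumption).
    """
    scores = phred_scores(quality, offset)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_q:
        end -= 1
    return seq[:end], quality[:end]


# Hypothetical read: four high-quality bases followed by two poor ones.
trimmed_seq, trimmed_qual = trim_right("ACGTAC", "IIII##")
```

In this encoding `I` decodes to Q40 and `#` to Q2, so the two trailing bases are removed. Real trimmers add adapter removal, sliding windows, and paired-end handling on top of this idea.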

    2) Metagenome assembly
    a) De novo assemblers: MetaVelvet, Meta-IDBA, Ray, SOAP, Celera, EULER (to compare them all)
    Q: How can I check which assembler gives better contigs? Is there a tool that evaluates the output of an assembler?
    I suggest MEGAHIT, which we currently use for all production metagenomic assemblies. You can use QUAST to evaluate the output, though since you don't know the correct answer, it's of limited use. To quickly get contiguity statistics, the BBMap package includes a tool called stats.sh, which was specifically designed to scale to huge metagenomes of hundreds of gigabases (assembled). Usage: stats.sh in=contigs.fa
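For readers unfamiliar with what such assembly statistics measure, the following is a minimal Python sketch that computes contig count, total length, longest contig, and N50 from FASTA text. It is not the implementation of stats.sh or QUAST; the function names and the toy FASTA input are hypothetical.

```python
from typing import Dict, Iterable, List


def contig_lengths(fasta_lines: Iterable[str]) -> List[int]:
    """Return the length of every sequence in FASTA-formatted lines."""
    lengths: List[int] = []
    current = None  # length of the record being read, None before the first header
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths: List[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def summarize(lengths: List[int]) -> Dict[str, int]:
    """Collect a few basic contiguity statistics."""
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "longest": max(lengths, default=0),
        "n50": n50(lengths),
    }


# Toy input (hypothetical): two contigs of 10 bp and 8 bp.
toy_fasta = ">c1\nACGTACGTAC\n>c2\nACGT\nACGT\n"
toy_summary = summarize(contig_lengths(toy_fasta.splitlines()))
```

To use it on a real assembly, pass an open file handle, for example `contig_lengths(open("contigs.fa"))`. Numbers like N50 describe contiguity only and say nothing about correctness, which is why comparing assemblers without a known answer remains difficult.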

    b) Reference-based assembly: MIRA, AMOS, Genometa
    Q: I have no idea how this works (on contigs, reads, or BLAST output?). Do we need to BLAST all the reads against all the bacterial references? Or do these tools have their own reference database? Or should it be done with Bowtie against the reference? Or does someone have a better suggestion?
    There's typically no point in a reference-guided assembly of a metagenome...


    • #3
      Hi,
      Thanks for the reply!
      At the moment I am trying reference-based assembly with MIRA. I read the manual, and I believe that providing the strain parameter will align the reads to the references (please correct me if I am wrong).
      But I have metagenomic data from mouse feces, which can contain multiple strains. So do I need to make my own input strain file, or how does it work?
      I know that reference-based assembly may not be the best choice for metagenomic data, but I still want to compare it with the de novo one.


      • #4
        Hi again,
        I have a question about k-mer optimization for metagenomic assemblers.
        How is the k-mer size optimized? Since this is not a single genome, how do I decide which k-mer size to use?
        Is there any tool or community standard for optimizing the k-mer size specifically for metagenomics? Currently I am running Ray and MetaVelvet with k=21 to k=63, and I do not understand how to optimize the k-mer size for my metagenomic assembly, which is from mouse feces.
        Thanks, and looking forward to a reply!


        • #5
          Unfortunately, it will come down to trial and error for metagenomes. Each sample type will have a different optimal k-mer size, depending on its species diversity.

          You can use a program like KmerGenie, but even the recommended optimal k-mer size may be off. It would be best to look through the histograms it produces and pick a few k values that span a range.
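To illustrate what a k-mer abundance histogram is, here is a minimal Python sketch that counts canonical k-mers in a handful of reads and tabulates how many distinct k-mers occur at each abundance. It is not KmerGenie's algorithm (which samples and fits a model to real data sets); the function names and the toy reads are hypothetical, and an in-memory counter like this only suits small subsamples.

```python
from collections import Counter
from typing import Dict, Iterable

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_counts(reads: Iterable[str], k: int) -> Counter:
    """Count canonical k-mers, skipping any k-mer with a non-ACGT base (e.g. N)."""
    counts: Counter = Counter()
    for read in reads:
        read = read.upper()
        for start in range(len(read) - k + 1):
            kmer = read[start:start + k]
            if all(base in "ACGT" for base in kmer):
                counts[canonical(kmer)] += 1
    return counts


def abundance_histogram(reads: Iterable[str], k: int) -> Dict[int, int]:
    """Map abundance -> number of distinct k-mers seen that many times."""
    histogram = Counter(kmer_counts(reads, k).values())
    return dict(sorted(histogram.items()))


# Toy reads (hypothetical); real input would be a subsample of the FASTQ files.
toy_reads = ["ACGTA", "ACGTA"]
histograms = {k: abundance_histogram(toy_reads, k) for k in (3, 4)}
```

Running this for several k values on the same subsample gives one histogram per k, which can then be compared side by side before choosing a few k values to try in the assembler.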
