  • HiSeq 50bp metagenomic assembly

    Hi all. We've been using 454 data previously and are still relatively new to NGS analysis. Our current dataset is in the form of HiSeq fastq files, with each sample having 4 reads of about 40 million ~40bp sequences per read. These are environmental samples of complex/diverse communities. What software can I use to detect and remove the junk sequences, and assemble the genomes left in the remainder? I appreciate any advice. Thanks in advance.

  • #2
    What insert size are the libraries? Are these paired ends? Why did you stop with such short reads? What sort of hardware do you have available?

    When you say "4 reads", do you mean "4 runs" -- you'll confuse less if you restrict the term 'read' to a single string of sequence information off the instrument.

    Most assemblers should be able to do something with this, but it is a pretty small dataset for a complex community (40M reads of 40 bp is only 1.6 Gb of data, or 3.2 Gb if this is paired-end). For example, Velvet is very popular; I'm a heavy user of Ray (particularly handy if you have access to a cluster).
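To make the arithmetic above explicit, here is a minimal sketch of the yield calculation. The function name `yield_gb` and its parameters are made up for illustration; the only inputs are the read count and read length quoted in the post.

```python
def yield_gb(n_reads, read_len_bp, paired=False):
    """Return total sequenced bases in gigabases (1 Gb = 1e9 bases).

    Hypothetical helper: n_reads is the number of reads (or read pairs
    if paired=True), read_len_bp is the length of each read in bases.
    """
    reads_per_fragment = 2 if paired else 1
    total_bases = n_reads * read_len_bp * reads_per_fragment
    return total_bases / 1e9


# Figures from the post: 40 million reads of 40 bp each.
single_end = yield_gb(40_000_000, 40)
paired_end = yield_gb(40_000_000, 40, paired=True)

print(f"single-end: {single_end:.1f} Gb")
print(f"paired-end: {paired_end:.1f} Gb")
```

Plugging in a different read length or number of runs is just a matter of changing the arguments.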


    • #3
      Thanks for the reply, krobinson. These are single-end reads, kept short (50 bp) to allow increased 'depth' while keeping costs down.

      I have a few hardware options for analysis: 1) a single PC (Ubuntu, Bio-Linux) with 8 cores (Intel i7, 3.2 GHz) and 24 GB RAM; 2) a few small remote clusters (8-24 cores); 3) some large remote clusters that I have not used before (500-3500 cores, 1-2 GB per core).

      Thanks for the correction: each sample was processed and run 4 times, with ~40 million reads per run, and the reads are all 50 bp. I think the library insert size would be 300-500 bp; I know it's the TruSeq DNA Library protocol, and from my quick search on TruSeq DNA prep kits, that's the insert size range.

      I'll take a look at Ray and Velvet (or is it MetaVelvet?). I've also come across SOAPdenovo. Any tips on what to watch out for when using Ray, or the others? Thanks for the help.
