  • einarr
    Junior Member
    • Nov 2010
    • 3

    Bacterial genomes with Illumina - what is the best option for good assembly?

    Hi,

    I've got about 100 strains of E. coli that should be sequenced and assembled. What is the best option for getting semi-complete genomes? My current thoughts are to use >=100 bp single reads plus a 5 kb mate-pair library.

    As E. coli strains can differ quite a bit, and we want to look at the differences between these strains both at the genome (organizational) level and at a more local level, I would guess that de novo assembly would be preferable if it can be done. Does anyone have experience with assembling something like this? How does the quality of the genome compare to what you get with 454?

    Cheers,
    --
    Einar Ryeng
  • MadsAlbertsen
    Member
    • Aug 2010
    • 26

    #2
    I would do PE 2x100 and a Matepair of 5Kb.

    In my experience it is at least as good as, if not better than, 454 (I have only done one 454 genome, though).

    rgds
    Mads Albertsen


    • einarr
      Junior Member
      • Nov 2010
      • 3

      #3
      Originally posted by MadsAlbertsen:
      I would do PE 2x100 and a Matepair of 5Kb.
      Thanks for the input. I was on the same track (paired end) for a moment. Guess I should go back there then.

      Do you have any thoughts on coverage as well? I'm thinking that 100x would do for the PE library, but I don't really know what to go for on the MP library.

      Originally posted by MadsAlbertsen:
      In my experience it is at least as good as, if not better than, 454 (I have only done one 454 genome, though).
      Sounds good.

      Thanks,
      --
      Einar Ryeng


      • pmiguel
        Senior Member
        • Aug 2008
        • 2328

        #4
        We did 3 Salmonella strains using v3 HiSeq chemistry. These were resequencing projects, so assembly was not required, but we did one anyway. ABySS-PE v1.3.0, with scaffolding turned on, assembled these 2x101 reads. N50 contig lengths (counting only contigs > 1 kb) were 248-283 kb. There were 60 or fewer scaffolds (again, counting only scaffolds longer than 1 kb) in each assembly. Oh, this was with kmer set at 80.

        Actually, we had some difficulty controlling the number of reads, so we were probably out beyond 200x. So the high kmer setting may have had the effect of reducing the input coverage to something the assembler would handle better.

        Anyway, PE only, and the inserts were not even that large (~250 bp).

        --
        Phillip
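
For readers who want to reproduce a statistic like the N50 quoted in the post above, here is a minimal sketch of how N50 is usually computed with a minimum-length cutoff. It is not taken from ABySS; the function name `n50` and the `min_length` parameter are hypothetical, and the 1 kb cutoff simply mirrors the "> 1 kb" filter mentioned in the post.

```python
def n50(lengths, min_length=0):
    """Return the N50 of the sequences longer than min_length (in bp).

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    # Keep only sequences strictly longer than the cutoff, largest first.
    kept = sorted((n for n in lengths if n > min_length), reverse=True)
    if not kept:
        raise ValueError("no sequences longer than min_length")

    half_total = sum(kept) / 2
    running = 0
    for length in kept:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy contig lengths (made up for illustration, not real data).
    contigs = [1000, 1500, 2000, 4000]
    print(n50(contigs, min_length=1000))
```

Because the cutoff changes which sequences are counted, N50 values are only comparable between assemblies when the same minimum length is applied to both.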


        • MadsAlbertsen
          Member
          • Aug 2010
          • 26

          #5
          We've done anything from 200x to 2000x (depending on whether we can fill the machine) and I do not see much difference in the assembly above 300x.

          We do not normally use mate pairs, as we are rarely interested in "complete" genomes. Using 2x100 PE with an insert size of approximately 300 bp and the new HiSeq chemistry, the assembly quality (number of contigs, N50) is more or less proportional to the repeat content of the genome.

          I guess 200x PE and a low-coverage mate-pair library (25-50x?) would be fine for de novo assembly.

          If you have loads of DNA (~5 ug/sample) you could keep the number of PCR cycles very low and go for even lower coverage, since there will be less GC-related coverage variation.

          However, a single HiSeq flowcell yields around 300 Gb, and if it were me I would just fill that with 96 genomes, e.g. 12 per lane = average coverage of 625x. Then you'll have plenty of room for concentration variation between your samples.

          rgds
          Mads
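
The coverage arithmetic in the post above can be checked with a short sketch. The 300 Gb flowcell yield and the 96 samples come from the post; the 5 Mb genome size is an assumption (a typical E. coli genome is roughly that size, and it is the value that reproduces the quoted 625x). The function names are hypothetical.

```python
import math


def mean_coverage(total_yield_bp, n_samples, genome_size_bp):
    """Average per-sample coverage if the yield is split evenly."""
    return total_yield_bp / n_samples / genome_size_bp


def read_pairs_needed(target_coverage, genome_size_bp, read_length_bp):
    """Read pairs required to reach a target coverage with paired-end reads."""
    bases_needed = target_coverage * genome_size_bp
    # Each pair contributes two reads of read_length_bp bases.
    return math.ceil(bases_needed / (2 * read_length_bp))


if __name__ == "__main__":
    FLOWCELL_YIELD_BP = 300_000_000_000  # ~300 Gb per flowcell (from the post)
    N_SAMPLES = 96                       # 12 samples per lane x 8 lanes
    GENOME_SIZE_BP = 5_000_000           # assumption: ~5 Mb genome

    cov = mean_coverage(FLOWCELL_YIELD_BP, N_SAMPLES, GENOME_SIZE_BP)
    print(f"average coverage: {cov:.0f}x")  # 625x under these assumptions

    pairs = read_pairs_needed(100, GENOME_SIZE_BP, 100)
    print(f"2x100 read pairs for 100x: {pairs}")
```

The same functions can be used in reverse when planning a run: pick the lowest coverage you are comfortable with for the least-represented sample, then see how many samples fit in the expected yield.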


          • stevebaeyen
            Member
            • Aug 2011
            • 18

            #6
            We just finished a 4.2 Mb bacterial genome with 91x coverage of 50 bp paired-end reads and 60x coverage of 75 bp reads from a 5 kb mate-pair library. De novo assembly of the paired-end reads gave 480 contigs (>200 bp). We then scaffolded with the mate pairs using SSPACE2 Premium with the bwa aligner and obtained 56 scaffolds at an average coverage of 150x. Next we used SOAP GapCloser to fill the gaps, then did a second round of SSPACE2/GapCloser, and obtained 15 scaffolds, with a min contig size of 0.5 Mb and a max contig size of 1.25 Mb.
