  • Low coverage genome assembly - suggestions

    Hello,

    We have sequenced the genome of a fruit crop on a HiSeq 2000. The data are Illumina paired-end FASTQ reads. The raw reads were filtered using Trimmomatic with default settings and checked with FastQC. After filtering, the total number of sequences dropped from 505956290 to 418812062, with 38% GC and a sequence length of 101. The data passed all tests, with warnings for per base sequence content and sequence duplication levels.

    My single filtered FASTQ file is 108Gbp in size, and the genome size predicted by KmerGenie and SGA preqc is around 2Gbp. The coverage is therefore below 20x. Which genome assembler works well at low coverage? How can I improve my genome assembly computationally? Please let me know your suggestions, and any pointers to journal papers that succeeded in assembling a low-coverage plant genome.
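    As a rough sanity check on the coverage figure, the expected depth is simply total sequenced bases divided by genome size. The sketch below uses the read counts and the ~2Gbp estimate quoted above, and assumes every surviving read is still the full 101 bases long. Trimming usually shortens some reads, so treat the result as an upper bound rather than a measured value.

```python
def estimate_coverage(n_reads: int, read_length: int, genome_size: int) -> float:
    """Expected sequencing depth = total sequenced bases / genome size."""
    return n_reads * read_length / genome_size


reads_before = 505_956_290    # raw reads, from the post
reads_after = 418_812_062     # reads left after filtering, from the post
read_length = 101             # assumption: no read was shortened by trimming
genome_size = 2_000_000_000   # ~2 Gbp, the k-mer based estimate from the post

retained = reads_after / reads_before
total_bases = reads_after * read_length
coverage = estimate_coverage(reads_after, read_length, genome_size)

print(f"reads retained: {retained:.1%}")
print(f"total bases:    {total_bases / 1e9:.1f} Gbp")
print(f"coverage:       {coverage:.1f}x")
```

    Under these assumptions the estimate comes out at roughly 21x, which is consistent with an actual depth of about 20x or slightly below once trimmed read lengths are taken into account.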

  • #2
    Do you have only a single short-fragment library making up this 20x coverage, or is it the sum of several libraries?

    If you want to assemble genomes with short-read technologies, it is crucial to have several libraries and library types with different insert (and maybe also read) lengths.

    Is it not possible for you to sequence more, or do you really need to make something out of these 20x?

    • #3
      Hi - I would first consider running error correction on your reads (e.g. using Musket). Are your reads paired? This would be important for improving the assembly. SOAPdenovo could be a good starting point; you could also try ABySS or Velvet, which are also relatively easy to install and run, though Velvet could be quite memory-demanding for a dataset as big as yours. It is important to optimise the k-mer size; KmerGenie should have suggested one already. However, with the coverage you have, you cannot expect a really high N50. Hope this helps.
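      For comparing assemblies from different assemblers or k-mer sizes, N50 can be computed directly from the contig lengths: sort the contigs from longest to shortest and report the length at which the running total first reaches half of the assembly size. A minimal sketch follows; the contig lengths in the example are made up for illustration and are not from any real assembly.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contig lengths given")
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable: running sum always reaches the total")


# Hypothetical contig lengths, for illustration only.
example_contigs = [100, 200, 300, 400, 500]
example_n50 = n50(example_contigs)
print(example_n50)
```

      N50 only describes contiguity, so it is worth reading it alongside the total assembly size and the number of contigs when choosing between runs.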

      • #4
        I have paired-end reads (read1.fastq, read2.fastq), which I interleaved into a single.fastq file. This single FASTQ file has 38% GC and a sequence length of 101, and its coverage is about 20x. I am not able to sequence more because of my boss's budget, so I would like to make something out of these reads for a publication. Any suggestions for producing a draft genome for publication?
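        Interleaving only works if both files list the mates in the same order, and filtering can break that if it drops one read of a pair. Below is a minimal sketch of interleaving with a sync check. It assumes plain 4-line FASTQ records and mate names that match once a trailing /1 or /2 is removed; the function names are my own and not part of any tool.

```python
from itertools import islice, zip_longest
from typing import Iterable, Iterator, List


def fastq_records(lines: Iterable[str]) -> Iterator[List[str]]:
    """Group a stream of FASTQ lines into 4-line records."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def read_id(header: str) -> str:
    """Read name without the trailing comment or the /1, /2 mate suffix."""
    name = header.split()[0]
    return name[:-2] if name.endswith(("/1", "/2")) else name


def interleave(lines1: Iterable[str], lines2: Iterable[str]) -> Iterator[str]:
    """Yield mate 1 then mate 2 for each pair, failing if the files disagree."""
    pairs = zip_longest(fastq_records(lines1), fastq_records(lines2))
    for rec1, rec2 in pairs:
        if rec1 is None or rec2 is None:
            raise ValueError("mate files contain different numbers of reads")
        if read_id(rec1[0]) != read_id(rec2[0]):
            raise ValueError(
                f"mates out of sync: {rec1[0].strip()} vs {rec2[0].strip()}"
            )
        yield from rec1
        yield from rec2


def interleave_files(path1: str, path2: str, out_path: str) -> None:
    """Stream two FASTQ files into one interleaved file."""
    with open(path1) as f1, open(path2) as f2, open(out_path, "w") as out:
        out.writelines(interleave(f1, f2))


# Tiny in-memory example with made-up reads.
mate1 = ["@read1/1\n", "ACGT\n", "+\n", "IIII\n"]
mate2 = ["@read1/2\n", "TGCA\n", "+\n", "IIII\n"]
merged = list(interleave(mate1, mate2))
```

        With the file names from this post, the call would be interleave_files("read1.fastq", "read2.fastq", "single.fastq"). If it raises an out-of-sync error, the pairing information in single.fastq is not reliable and the paired output of the filtering step should be used instead.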
