  • Minimum amount of reads for de novo plant transcriptome assembly

    I have Illumina short reads, 2x150 bp, around 12,3509 Gb of data right now.
    I'm curious whether there is any parameter or formula for calculating the minimum number of short reads a transcriptome assembler needs in order to produce a comprehensive set of transcripts?
    e.g. "you must have at least 1 Mb of Illumina short reads in order to assemble it."

    Do we also need to consider the coverage and depth of the data when determining the minimum number of short reads required for transcriptome assembly?

    Thank you!

  • #2
    Didn't you already post on this topic?

    There are too many variables involved and missing info from your questions.
    Are you saying you have 12,3509 Gb data right now?
    Why don't you just assemble it?

    " calculate the minimum short read required to assemble a transcript sequence by transcriptome assembler program in order to obtain comprehensive transcript?"

    This obviously depends on the species. Are you sequencing total RNA or poly A/T targeted?
    This makes a big difference!

    How many genes are in your species?
    What is the ploidy?
    Do you expect many paralogues?
    What depth do you want to sequence?
    Do you have a reference?
    How many individuals do you want to sequence?

    If I were de novo sequencing a species with no prior information, I would use longer reads from 454, and then fill in the gaps and depth with Illumina.

    I don't understand how you expect to calculate the number of sequences you need to get a "good representation" of the transcriptome when you have no expectations?

    There are ways to estimate genome size; this may give you some idea of how many genes to expect.

    Why do you want to examine the transcriptome? What are your specific questions?


    • #3
      OK, first: what I'm studying is the de novo transcriptome of Aristotelia chilensis. There is no reference, so I have five sequencing runs from an Illumina MiSeq, performed as follows:
      1st run: half-ripened and mature tissues of Aristotelia chilensis
      2nd run: half-ripened and mature
      3rd run: green, albino and leaf
      4th run: green, half-ripened, mature and albino
      5th run: green, half-ripened, mature and albino

      So the total amount of data from all these sequencing runs is 12,3509 Gb.
      What I'm trying to find out is the total number of reads required to generate a good assembly of a plant transcriptome!


      • #4
        It's best not to post multiple threads asking the same question.

        There is no specific number of reads that is enough. It depends on the structure and repetitiveness of the genome of your species, and many other factors you can't necessarily measure. The best thing is just to try assembling it.

        For what it's worth, I've had excellent assemblies from 4GB of paired-end 100bp Illumina reads, and I've had terrible assemblies from 400GB of similar reads from a different species with a higher ploidy genome. The number of reads tells you nothing about how good the assembly will be.
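
That said, if you just want to turn a data volume into a read count and a nominal average depth, the arithmetic is simple. Here is a rough Python sketch. The 4 Gb yield and 100 bp read length echo the example above (read here as gigabases, which is an assumption), and the 50 Mb transcriptome size is a purely hypothetical placeholder, not a figure for any real species. Keep in mind that expression levels are very uneven, so the average says little about how well lowly expressed transcripts are covered.

```python
def read_pairs(total_bases: float, read_length: int) -> int:
    """Number of read pairs in a paired-end data set with the given total yield."""
    # Each pair contributes two reads of read_length bases.
    return int(total_bases // (2 * read_length))


def nominal_depth(total_bases: float, transcriptome_bases: float) -> float:
    """Average depth if reads were spread evenly over the transcriptome.

    Real RNA-seq depth is far from even, so treat this as a crude upper-level
    summary, not a prediction of assembly quality.
    """
    return total_bases / transcriptome_bases


if __name__ == "__main__":
    total = 4e9          # assumed yield in bases (4 Gb)
    length = 100         # read length in bp
    transcriptome = 50e6 # hypothetical transcriptome size in bases

    print("read pairs:", read_pairs(total, length))
    print("nominal depth:", nominal_depth(total, transcriptome))
```

Two data sets with the same numbers here can still assemble very differently, which is why actually running the assembly is the real test.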
