
  • Steps prior and during de novo assembly (clcbio)

    Dear all,

    I have some questions concerning the steps prior to a de novo assembly. I have normalized cDNA MiSeq (paired-end) data from two marine nematode species (no reference genome is available for any marine nematode), which I want to assemble to create a transcriptome. The sequencing company has already done some processing for me:

    1. Quality trimming: We trim low quality ends (< Q20) with FastX 0.0.13 [1].
    2. Adapter trimming: The adapters are trimmed only at the end (at least 10bp
    overlap and 90% match) with cutadapt 1.2.1 [3].
    3. Quality filtering: Using FastX 0.0.13 and ShortRead 1.16.3, we remove in
    succession small reads (length < 50 bp), polyA-reads (more than 90% of the
    bases equal A), ambiguous reads (containing N), low quality reads (more than
    50% of the bases < Q25) and artifact reads (all but 3 bases in the read equal one
    base type).
    4. Making pairing consistent: Filtering reads may remove one read of a pair and
    make paired fastq files inconsistent. In this step we remove reads that belong
    to broken pairs and save them in separate fastq
    5. Removal of contaminants: Using bowtie 2.0.0-beta5, we identify reads that
    align to phixillumina and remove them.
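
    To make steps 3 and 4 concrete, here is a minimal Python sketch of the read filters and the pair bookkeeping. It is not the company's actual code (they used FastX and ShortRead); the `Read` type, the function names and the assumption that both mates share the same read name are mine, and the thresholds are simply the ones quoted in the list above.

```python
from typing import List, NamedTuple, Tuple


class Read(NamedTuple):
    name: str        # read name, assumed identical for both mates of a pair
    seq: str         # base calls
    qual: List[int]  # Phred scores, one per base


def passes_filters(read: Read, min_len: int = 50, poly_a_frac: float = 0.9,
                   low_q: int = 25, low_q_frac: float = 0.5,
                   artifact_slack: int = 3) -> bool:
    """Apply the step 3 filters in the order given in the list."""
    seq = read.seq.upper()
    n = len(seq)
    if n < min_len:                                  # small reads
        return False
    if seq.count("A") / n > poly_a_frac:             # polyA reads
        return False
    if "N" in seq:                                   # ambiguous reads
        return False
    if sum(q < low_q for q in read.qual) / n > low_q_frac:  # low quality reads
        return False
    # artifact reads: all but `artifact_slack` bases are the same base type
    if max(seq.count(b) for b in "ACGT") >= n - artifact_slack:
        return False
    return True


def split_pairs(r1_reads: List[Read], r2_reads: List[Read]
                ) -> Tuple[List[Tuple[Read, Read]], List[Read]]:
    """Step 4: keep intact pairs together, collect broken-pair reads separately."""
    kept1 = {r.name: r for r in r1_reads if passes_filters(r)}
    kept2 = {r.name: r for r in r2_reads if passes_filters(r)}
    paired = [(kept1[n], kept2[n]) for n in kept1 if n in kept2]
    orphans = [r for n, r in kept1.items() if n not in kept2]
    orphans += [r for n, r in kept2.items() if n not in kept1]
    return paired, orphans
```

    The point of the second function is that a read which passes every filter can still end up outside the paired files if its mate was discarded.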

    That is where their processing ends and mine begins. I have imported my sequences into CLCbio and trimmed off the cDNA adapters, which were needed to amplify my normalized cDNA libraries and increase the amount of cDNA.

    My questions are:
    - Prior to a de novo assembly there is the option to merge paired-end reads, which gives two data sets: one with the merged sequences and one with the pairs that could not be merged. Is it a good idea to merge the paired-end reads, or should the de novo assembly start from the original fastq files? Or should we do both, that is, merge the paired-end data and use the merged sequences together with the original data for the de novo assembly?

    - During de novo assembly there is the option of scaffolding. I'm not sure whether this option is a good one. It will indeed create longer contigs, but does it cause downstream problems during annotation? I mean: if two genes are in very close proximity (or even on opposite strands), there is a possibility that they will end up in one contig. When blasting this contig, won't you miss one of the two genes?

    - How is it possible that 10% of the reads did not map when I mapped them back to the assembled transcriptome?
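
    For anyone unsure what "merging" paired-end reads means in the first question, here is a minimal Python sketch of the idea. It is not how CLCbio implements it; `merge_pair`, its parameters and the mismatch threshold are hypothetical names and values chosen for illustration. Read 2 is reverse-complemented and joined to read 1 at the longest overlap that has few enough mismatches; pairs with no acceptable overlap stay unmerged.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_pair(r1: str, r2: str, min_overlap: int = 10,
               max_mismatch_frac: float = 0.1) -> Optional[str]:
    """Merge a read pair if the end of r1 overlaps the start of revcomp(r2).

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` bases stays within the allowed mismatch fraction.
    Simplification: the case where the fragment is shorter than the
    read length (read-through into the adapter) is not handled.
    """
    r2_rc = reverse_complement(r2)
    # Try the longest possible overlap first, then shrink.
    for olen in range(min(len(r1), len(r2_rc)), min_overlap - 1, -1):
        tail = r1[-olen:]
        head = r2_rc[:olen]
        mismatches = sum(a != b for a, b in zip(tail, head))
        if mismatches <= max_mismatch_frac * olen:
            # Keep r1 as-is and append the non-overlapping part of r2.
            return r1 + r2_rc[olen:]
    return None
```

    With a 40 bp fragment sequenced as two 25 bp reads, the 10 bp overlap is found and the original fragment is reconstructed; two reads from a fragment longer than both reads together return `None` and remain a pair.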


    Thanks in advance

  • #2
    Hi, I have exactly the same question. Did you find the answer?



    • #3
      Hi Jevcampe and rafaelbsvaladares,

      Can you tell me which library construction (e.g. 2K, 500 bp) you used for your Illumina cDNA sequencing?

      I am new to metatranscriptomics. I have done the RNA isolation and enrichment and the cDNA conversion. Now I want to sequence it on an Illumina HiSeq, and for that I need to tell the company which library I want to use for sequencing.

      Thanks in advance
