  • FastQ quality trimming/keeping the reads paired

    Hi everyone,

    My name is Max and I'm a newbie in de novo genome assembly. I have often read this forum (plus a few tons of literature), and I am trying to step up a bit now.

    I am trying to use Velvet/MetaVelvet on some paired-end FASTQ data, and I would like to quality-trim the data before feeding it to the assembler. The problem is that Velvet requires all reads to be in order and paired if the input is a FASTQ/FASTA file.
    If I filter the reads by quality, I am afraid that some reads will lose their mate and no longer be paired.

    Would converting the unaligned reads into a SAM file keep the pairing information? Velvet does work with SAM, and in that case it does not require all the reads to be paired. I have read a bit about the SAM format, but I am not sure that what I'm saying makes sense... does it?

    Thanks for all the time you spend answering and helping on this forum!
    Max

  • #2
    Originally posted by mstagliamonte
    If I filter the reads by quality, I am afraid not all the reads would be paired anymore.
    There are many posts on this forum about how to keep pairing when doing quality filtering. Most of them discuss Trimmomatic, which is a good keyword to search for.
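
    To make the idea concrete, here is a minimal Python sketch of pair-aware quality filtering. It is not Trimmomatic's algorithm, only an illustration of the bookkeeping: a pair is kept only if both mates pass, and a read whose mate failed goes to a separate "singles" list. The function names, the mean-quality criterion, the threshold of 20 and the Phred+33 encoding are all assumptions chosen for the example.

```python
from statistics import mean


def mean_phred(qual, offset=33):
    """Mean Phred score of a quality string (Phred+33 encoding is assumed)."""
    if not qual:
        return 0.0
    return mean(ord(c) - offset for c in qual)


def parse_fastq(lines):
    """Yield (name, seq, qual) tuples from an iterable of FASTQ lines.

    Assumes the simple four-lines-per-record layout.
    """
    it = iter(lines)
    for header in it:
        seq = next(it)
        next(it)  # skip the '+' separator line
        qual = next(it)
        yield header.rstrip("\n"), seq.rstrip("\n"), qual.rstrip("\n")


def filter_pairs(reads1, reads2, min_mean_q=20):
    """Filter two mate-ordered read streams while keeping them in sync.

    Returns (paired, singles): pairs where both mates passed, and reads
    that passed while their mate failed.
    """
    paired, singles = [], []
    for r1, r2 in zip(reads1, reads2):
        ok1 = mean_phred(r1[2]) >= min_mean_q
        ok2 = mean_phred(r2[2]) >= min_mean_q
        if ok1 and ok2:
            paired.append((r1, r2))
        elif ok1:
            singles.append(r1)
        elif ok2:
            singles.append(r2)
    return paired, singles
```

    The paired list can then be written out in order for the assembler, and the singles can be supplied as unpaired reads or dropped.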

    Would a conversion of the unaligned reads into a sam file keep the information regarding pairing?
    Yes, the pairing information can be kept. The SAM format has flags indicating whether 'the read is paired in sequencing' and whether 'the read is the first read in a pair'.
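
    As a rough illustration of those flags, the sketch below decodes the pairing-related bits of the SAM FLAG field. The bit values come from the SAM specification; the function and constant names are made up for this example. For unaligned paired reads the FLAG is commonly 77 for the first mate and 141 for the second.

```python
# Bit values of the SAM FLAG field (from the SAM specification).
PAIRED = 0x1           # read is paired in sequencing
UNMAPPED = 0x4         # read itself is unmapped
MATE_UNMAPPED = 0x8    # mate is unmapped
FIRST_IN_PAIR = 0x40   # read is the first read in a pair
SECOND_IN_PAIR = 0x80  # read is the second read in a pair


def describe_flag(flag):
    """Return the pairing-related bits of a SAM FLAG as a dict of booleans."""
    return {
        "paired": bool(flag & PAIRED),
        "unmapped": bool(flag & UNMAPPED),
        "mate_unmapped": bool(flag & MATE_UNMAPPED),
        "first_in_pair": bool(flag & FIRST_IN_PAIR),
        "second_in_pair": bool(flag & SECOND_IN_PAIR),
    }


# Unaligned first mate: 1 + 4 + 8 + 64 = 77
print(describe_flag(77))
# Unaligned second mate: 1 + 4 + 8 + 128 = 141
print(describe_flag(141))
```

    A read that lost its mate during filtering would simply not have the paired bit set (for example FLAG 4 for an unaligned single read).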

    • #3
      Thanks!

      Sorry for posting something that had already been asked. I'll try to do a more extensive search next time.


      Have a nice day
      Max
