
  • Extract paired end reads from sff file.

    Hi,

    I'm working on a genome assembly project using 454 paired-end reads in sff format. I have already tried Newbler, which works with sff files directly, but now I want to try other tools as well, so I need the reads in a more widely supported format such as fasta.

    I used sff_extract with the -l and -c options to remove linkers and clip the ends. To verify the resulting fasta file, I assembled it with Velvet as single-end data, but the results were very strange: N50 was only about 15-30 bp. That is surprising given that the coverage is decent (~30x) and the single-end assembly by Newbler was good. So I assume the extraction went wrong and something else has to be clipped from the sequences. Any suggestions on how to do that? Or did I do something wrong?
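    For anyone wanting to double-check an N50 figure like the one above independently of the assembler's own report, here is a minimal Python sketch. It assumes the contigs are available as plain FASTA text; the function names n50 and fasta_lengths are illustrative only and do not come from Velvet, Newbler, or sff_extract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty).

    N50 is the length L such that sequences of length >= L together
    cover at least half of the total number of bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0


def fasta_lengths(lines):
    """Yield the sequence length of each record in FASTA-formatted lines."""
    length = None  # None until the first header line is seen
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif line and length is not None:
            length += len(line)
    if length is not None:
        yield length
```

    With a real file this could be called as n50(fasta_lengths(open("contigs.fa"))), where "contigs.fa" is a placeholder for whatever the assembler wrote. Running fasta_lengths on the extracted reads themselves is also a quick way to see whether clipping left them unexpectedly short.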

    I was thinking of mapping the resulting fasta reads to the Newbler contigs to see what is wrong with them, but I have not worked with alignment tools yet. I would appreciate any suggestions on how to align reads to contigs (preferably with visualization).

    Thanks

  • #2
    You could also run a Newbler assembly with the '-tr' flag set; this will give you all trimmed reads as a separate output file. The paired-end reads will be split in two (with the linker removed, and clipped). Paired reads can be recognized by the '_left' and '_right' at the end of their names.

    Perhaps this helps?
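    If the two halves then need to be matched up again for another assembler, the '_left' / '_right' naming can be used directly. A rough Python sketch, assuming only that mates share the same name apart from that suffix (the read names in the example are made up, and pair_reads is a hypothetical helper, not part of Newbler):

```python
def pair_reads(names):
    """Group read names into mate pairs by their '_left' / '_right' suffix.

    Returns (pairs, singles): pairs is a list of (left_name, right_name)
    tuples, singles holds unpaired reads and reads whose mate is missing.
    """
    mates = {}
    singles = []
    for name in names:
        for suffix in ("_left", "_right"):
            if name.endswith(suffix):
                base = name[: -len(suffix)]
                mates.setdefault(base, {})[suffix] = name
                break
        else:
            # No pairing suffix: an ordinary single-end read.
            singles.append(name)

    pairs = []
    for found in mates.values():
        if "_left" in found and "_right" in found:
            pairs.append((found["_left"], found["_right"]))
        else:
            # One half only (the mate was probably trimmed away).
            singles.extend(found.values())
    return pairs, singles


# Hypothetical read names, for illustration only.
pairs, singles = pair_reads(["r1_left", "r2", "r1_right", "r3_left"])
```

    Reads whose mate is missing end up in singles, so they can still be given to the assembler as unpaired data.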



    • #3
      flxlex, thanks! This is exactly what I was looking for.



      • #4
        What is the command line? I have GS 2.6 here, and I have not run it from the command line yet. In the path I have a list of commands: newbler, newAssembly, gsAssembly, runProject .... Which one should I use for extracting paired-end reads from an sff file?

        Thanks,

        Justin


        Originally posted by flxlex
        You could also run a newbler assembly with the '-tr' flag set, this will give you all trimmed reads as a separate output file. The paired end reads will be split in two (with linker removed, and clipped). Paired reads can be recognized by the '_left' and '_right' at the end of their names.

        Perhaps this helps?



        • #5
          runAssembly -o some_name -tr path/to/454reads.sff
