  • Assembly (velvet) of mate-pair data from Illumina

    Hello,

    I asked a similar question previously, but wanted to re-post to more specifically address the question of assembly (with velvet) rather than alignment.

    My question is for those who have successfully assembled FASTQ files of Illumina mate-pair data. I have seen other threads in which people mention that reverse complementing the reads is a necessary prerequisite to ensure the reads are facing the correct direction. Is this true? If so, is there a tool that someone could recommend to easily reverse complement a FASTQ or sequence.txt file?

    If you have any other suggestions for dealing with mate-pair data, or tools to recommend for those tasks, I'd greatly appreciate it.

    Cheers,
    John

  • #2
    There are a number of options for reverse complementing the reads, but I prefer to use the FASTX-Toolkit as it's the fastest option. I also like Biopython, and you can look into the Biopython tutorial for examples.
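As a rough illustration of what reverse complementing a FASTQ file involves, here is a minimal sketch in plain Python (no FASTX-Toolkit or Biopython required). It assumes simple four-line FASTQ records, and the function names are made up for this example. The point that is easy to miss is that the quality string has to be reversed along with the sequence so that each score stays attached to its base.

```python
import io

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence string."""
    return seq.translate(_COMPLEMENT)[::-1]


def reverse_complement_fastq(in_handle, out_handle):
    """Reverse complement every record of a four-line-per-record FASTQ stream.

    Assumption: each record is exactly four lines (header, sequence, '+',
    qualities); multi-line FASTQ is not handled.
    """
    while True:
        header = in_handle.readline()
        if not header:
            break  # end of input
        if not header.strip():
            continue  # skip stray blank lines
        seq = in_handle.readline().rstrip("\r\n")
        plus = in_handle.readline().rstrip("\r\n")
        qual = in_handle.readline().rstrip("\r\n")
        out_handle.write(header.rstrip("\r\n") + "\n")
        out_handle.write(reverse_complement(seq) + "\n")
        out_handle.write(plus + "\n")
        # Qualities are only reversed, never complemented.
        out_handle.write(qual[::-1] + "\n")


# Small in-memory demonstration with one made-up read.
demo_in = io.StringIO("@read1\nAACGT\n+\nABCDE\n")
demo_out = io.StringIO()
reverse_complement_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, the same function can be called with two handles from `open(...)`; whether the reads actually need flipping depends on what orientation your assembler expects.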

    From my own data, the biggest headache with mate-pair data is the contamination of long inserts with short inserts, which are in the opposite read orientation. I haven't seen a good solution yet as to how to handle this. I have a somewhat related species that I can use as a reference to initially map the reads and separate the pairs that are true mate pairs from those that are short-insert contamination, but this has its own issues, such as assuming that there are no rearrangements. It also still leaves most reads unmapped.
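The reference-based sorting described above could look something like the sketch below. It is only an illustration under several assumptions: both reads of a pair map to the same reference sequence, the reads have not been reverse complemented (so true mate pairs face outward and short-insert pairs face inward), and a single distance cutoff separates the two classes. The function names and the 1000 bp default are hypothetical and would need tuning to the actual library.

```python
def pair_orientation(pos1, is_reverse1, pos2, is_reverse2):
    """Describe how two reads mapped to the same reference face each other.

    pos1/pos2 are leftmost mapping coordinates; is_reverse1/is_reverse2 are
    True when the read maps to the reverse strand.
    """
    # Order the two reads by leftmost mapped coordinate.
    (_, left_rev), (_, right_rev) = sorted(
        [(pos1, is_reverse1), (pos2, is_reverse2)]
    )
    if left_rev == right_rev:
        return "same_strand"
    # Left read on the reverse strand points away from its mate.
    return "outward" if left_rev else "inward"


def classify_pair(pos1, is_reverse1, pos2, is_reverse2, short_insert_max=1000):
    """Label a mapped pair as 'mate_pair', 'short_insert', or 'ambiguous'.

    short_insert_max is a hypothetical cutoff in bases, not a recommended value.
    """
    orientation = pair_orientation(pos1, is_reverse1, pos2, is_reverse2)
    distance = abs(pos2 - pos1)
    if orientation == "inward" and distance <= short_insert_max:
        return "short_insert"
    if orientation == "outward" and distance > short_insert_max:
        return "mate_pair"
    return "ambiguous"


# Made-up coordinates for illustration.
print(classify_pair(1000, True, 4000, False))
print(classify_pair(1000, False, 1300, True))
```

As noted in the post, this relies on the reference having no rearrangements relative to the sequenced genome, and it says nothing about pairs that fail to map.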


    • #3
      Dear natstreet,

      I'm currently wondering if this is a problem I'm having, as my Illumina mate-pair data is not assembling at all with Velvet, although SOAPdenovo seems to be working OK. Why would you get short-insert contamination in a mate-pair library?
