  • Split barcodes - embedded index only in forward read

    Hi,
    I am trying to split a FASTQ file containing multiple samples (Illumina paired-end, MiSeq run). These are not standard index reads demultiplexed on the machine; the barcodes are embedded in the read itself (forward read only, first 6 bp; the reverse read has no index).
    I tried the FASTX-Toolkit, but it has no paired-end mode.
    I managed to split read 1 successfully with FASTX, but how do I match up the corresponding read 2 reads?
    Is there a tool that works directly on PE reads?
    Thanks in advance!
    b

  • #2
    Your problem is one I imagine a lot of people have, yet there isn't a straightforward way of doing what you ask (that I am aware of).

    One way to do it is to filter your R1 and then interlace each filtered file with its matching R2. For this I use the FastqInterlacer/de-interlacer tools implemented in Galaxy. You will need to install a local instance of Galaxy and then add this functionality via its Tool Shed.

    Another way that may work is to use the RADseq program STACKS. The QC step in STACKS (process_radtags) splits paired-end reads by the barcodes in R1. Just make sure you switch off all of the other QC filters so that you keep all of your data.
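
    If neither tool is convenient, the same idea (bin each pair by the barcode at the start of R1 and carry the mate along) can be sketched in plain Python. This is only a minimal illustration, not what Galaxy or STACKS do internally. It assumes both files list the reads in the same order, that all barcodes have the same length, and that only exact barcode matches count. The names `read_fastq`, `read_id`, `assign_pairs`, and the `"unmatched"` bin are made up for this example.

```python
from itertools import islice, zip_longest


def read_fastq(handle):
    """Yield (header, sequence, plus, quality) tuples from an open FASTQ handle."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def read_id(header):
    """Return the read name without the mate suffix or the comment field."""
    name = header.split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def assign_pairs(r1_handle, r2_handle, barcodes, trim=True):
    """Yield (sample, r1_record, r2_record) for every read pair.

    barcodes maps barcode sequence -> sample name (hypothetical layout).
    Pairs whose R1 does not start with a known barcode go to "unmatched".
    """
    lengths = {len(b) for b in barcodes}
    if len(lengths) != 1:
        raise ValueError("this sketch assumes barcodes of one fixed length")
    n = lengths.pop()
    pairs = zip_longest(read_fastq(r1_handle), read_fastq(r2_handle))
    for r1, r2 in pairs:
        if r1 is None or r2 is None:
            raise ValueError("R1 and R2 contain different numbers of reads")
        if read_id(r1[0]) != read_id(r2[0]):
            raise ValueError(f"read names differ: {r1[0]} vs {r2[0]}")
        sample = barcodes.get(r1[1][:n], "unmatched")
        if trim and sample != "unmatched":
            # Remove the barcode from both the sequence and the quality string.
            r1 = (r1[0], r1[1][n:], r1[2], r1[3][n:])
        yield sample, r1, r2
```

    To write the output, open one R1 and one R2 file per sample name (plus the unmatched pair) and write each yielded record back out as four lines. Because the function is a generator, the reads are streamed rather than held in memory.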


    • #3
      Have a look at sabre (najoshi/sabre on GitHub), a barcode demultiplexing tool for FASTQ files that has a paired-end mode.