
  • What do I do with my paired end reads after removing the adapters?

    I'm not sure if I am doing this right...

    I have paired end reads of Serratia m.

    Here are the steps I took so far:

    1. I ran a FastQC report on every read file to check whether adapters are a source of contamination. I looked at "Overrepresented Sequences" to see whether an adapter was present. If every sequence listed there was labeled "No Hit", I left the file alone.

    2. I used cutadapt to cut the adapters, which left me with 2 adapter-trimmed paired-end files as well as a file of single-end reads.

    I did the same with scythe, except scythe didn't produce a single-end reads file.

    3. I noticed that the reads in the two paired-end files were no longer in matching order by header, so I wrote a script to correct this. My script takes the 2 paired-end files and outputs 2 paired-end files in matching order, plus a file containing all the single-end reads that did not have a mate.
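A minimal sketch of the kind of re-pairing script described in step 3, not the poster's actual script. It assumes uncompressed 4-line FASTQ records and headers ending in `/1` and `/2` (or with the mate number after a space); the function names `read_fastq`, `read_id`, and `repair` are hypothetical. It holds the whole second file in memory, which is usually fine for a bacterial genome but may not be for larger data sets.

```python
def read_fastq(handle):
    """Yield (header, seq, plus, qual) for each 4-line FASTQ record."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        plus = handle.readline().rstrip("\n")
        qual = handle.readline().rstrip("\n")
        yield header, seq, plus, qual


def read_id(header):
    """Return the part of the header shared by both mates.

    Assumption: headers look like '@name/1' or '@name 1:N:0:...'.
    """
    name = header.split()[0]
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def repair(r1_handle, r2_handle):
    """Return (pairs, singles).

    pairs   -- list of (r1_record, r2_record), in R1 file order
    singles -- records whose mate is missing from the other file
    """
    # Index every R2 record by its shared read ID.
    r2_by_id = {read_id(rec[0]): rec for rec in read_fastq(r2_handle)}
    pairs, singles = [], []
    for rec in read_fastq(r1_handle):
        mate = r2_by_id.pop(read_id(rec[0]), None)
        if mate is None:
            singles.append(rec)
        else:
            pairs.append((rec, mate))
    # Whatever is left in the index had no R1 partner.
    singles.extend(r2_by_id.values())
    return pairs, singles
```

Writing the first element of each pair to one file and the second to another, in the same loop, keeps the two paired-end files in matching order.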



    What do I do now?

    I'm confused about what I should do with the single-end reads obtained after using cutadapt, and also about what I should do with the single reads left over after using my script to put my FASTQ files in matching order by header.

    When I move on to the trimming stage, do I ONLY trim my paired-end reads and just ignore the single-end reads? Or do I trim my paired-end reads as well as my single-end reads?

    When I am looking for the SNPs of 1 replicate, do I map the paired-end reads as well as any other single-end reads onto the reference genome?


    Edit:

    TL;DR

    Are these the right steps to take before I start mapping my reads?

    1. Cut adapters (gives SE file)

    2. Quality Trim (gives SE file)

    3. Organize pairs (gives SE file)

    So by the end of this whole process I am left with 3 SE files and 2 processed paired-end files, giving a total of 5 files.

    Do I need to do any quality trimming on the single-end reads, or do I just take all 5 of my files and map them to the reference?
    Last edited by prs321; 01-13-2014, 11:45 AM.

  • #2
    You could try using Trimmomatic, which will do adapter trimming and quality trimming and give you 4 output files (2 for paired reads, 2 for single reads), where the 2 paired read files are in the same order.
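As a rough illustration of the 4 output files, here is a sketch that only builds a Trimmomatic paired-end command line and does not run it. The jar path, file names, output prefix, and the specific trimming thresholds are placeholder assumptions, and `trimmomatic_pe_command` is a hypothetical helper; check the Trimmomatic manual for settings that suit your data.

```python
def trimmomatic_pe_command(jar, r1, r2, prefix, adapters):
    """Build a Trimmomatic paired-end (PE) command as a list of arguments.

    All paths are placeholders supplied by the caller.
    """
    outputs = [
        f"{prefix}_1P.fastq",  # forward reads whose mate also survived
        f"{prefix}_1U.fastq",  # forward reads whose mate was dropped
        f"{prefix}_2P.fastq",  # reverse reads whose mate also survived
        f"{prefix}_2U.fastq",  # reverse reads whose mate was dropped
    ]
    steps = [
        f"ILLUMINACLIP:{adapters}:2:30:10",  # adapter clipping
        "SLIDINGWINDOW:4:15",                # quality trimming (example values)
        "MINLEN:36",                         # drop very short reads (example value)
    ]
    return ["java", "-jar", jar, "PE", r1, r2, *outputs, *steps]


cmd = trimmomatic_pe_command(
    "trimmomatic.jar", "reads_1.fastq", "reads_2.fastq", "trimmed", "adapters.fa"
)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.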

    What you do with the trimmed data afterwards depends in part on what software you choose to align/assemble the data. For example, some aligners will only use paired reads or single reads, but not a mixture of both, so you would have to run the paired reads and the single reads separately.



    • #3
      skewer

      You may also try skewer, which is dedicated to adapter trimming for paired-end reads. It can be downloaded from http://sourceforge.net/projects/skewer/.



      • #4
        Originally posted by prs321
        When I am looking for the snps of 1 replicate, do I map the paired end reads as well as any other single end reads onto the reference genome?
        Yes, you need to map reads to the reference genome and do variant calling to find SNPs.



        • #5
          Yes, you will map the paired ends. I have read some forum posts suggesting that you also map the singletons (reads whose mate did not pass QC filters) separately, but I personally have never done this. In our lab and our collaborating labs we just deal with the pairs and have been getting good results. But I guess it's up to you.

          So you will map the cut/trimmed/organized paired-end FASTQs (you will have them in two separate files, one for each direction) using the mapping software of your choice. Once the paired-end FASTQs are mapped you will have ONE SAM file that combines the mapping of the paired reads. You will continue on to the rest of your pipeline and variant calling using that one resulting SAM file. What you do with the singleton reads is up to you. Good luck!
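To make the "two FASTQ files in, one SAM file out" step concrete, here is a sketch that uses BWA-MEM purely as an example mapper; the post above does not name one, so that choice is an assumption. It also assumes `bwa` is installed and the reference has already been indexed with `bwa index`. The file names and the helpers `bwa_mem_command` and `run_to_sam` are hypothetical.

```python
import subprocess


def bwa_mem_command(reference, r1_fastq, r2_fastq=None):
    """Build a 'bwa mem' command; pass r2_fastq for paired-end mapping."""
    cmd = ["bwa", "mem", reference, r1_fastq]
    if r2_fastq is not None:
        cmd.append(r2_fastq)
    return cmd


def run_to_sam(cmd, sam_path):
    """Run the mapper and write its standard output (SAM) to sam_path."""
    with open(sam_path, "w") as out:
        subprocess.run(cmd, stdout=out, check=True)


# Paired reads: two input files, one resulting SAM file.
paired_cmd = bwa_mem_command("reference.fa", "paired_1.fastq", "paired_2.fastq")
# Singletons, if you decide to use them, are mapped in a separate run.
single_cmd = bwa_mem_command("reference.fa", "singletons.fastq")
print(" ".join(paired_cmd))
print(" ".join(single_cmd))
# run_to_sam(paired_cmd, "paired.sam")  # uncomment once the files exist
```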
