SEQanswers





01-13-2014, 11:22 AM   #1
prs321
Member
 
Location: US

Join Date: Jun 2013
Posts: 96
What do I do with my paired end reads after removing the adapters?

I'm not sure if I am doing this right...

I have paired end reads of Serratia m.

Here are the steps I took so far:

1. Ran FastQC on every read file to check whether adapters were a source of contamination. I looked at "Overrepresented Sequences" to see if an adapter was listed. If every overrepresented sequence was labeled "No Hit", I left that file alone.

2. I used cutadapt to cut adapters, which left me with 2 adapter-trimmed paired end read files as well as a file of single end reads.

I did the same with scythe, except that scythe didn't produce a single end reads file.

3. I noticed that the reads in the two paired end files were no longer in the same order by header, so I wrote a script to correct this. My script takes the 2 paired end read files and outputs 2 paired end files in matching order, plus a file containing all the single end reads that did not have a mate.
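
For illustration, a re-pairing step like the one described in step 3 could be sketched in Python as follows. This is not the original script. It assumes that mates share a read name (optionally ending in /1 or /2), that each record is exactly 4 lines, and that the second file fits in memory. The function names are hypothetical.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from an iterable of FASTQ lines."""
    it = iter(lines)
    while True:
        rec = [line.rstrip("\n") for line in islice(it, 4)]
        if not rec:
            return
        if len(rec) < 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(rec)


def pair_key(header):
    """Name shared by both mates: drop '@', any comment, and a trailing /1 or /2."""
    name = header[1:].split()[0]
    if name.endswith(("/1", "/2")):
        name = name[:-2]
    return name


def repair(r1_lines, r2_lines):
    """Return (paired_r1, paired_r2, singles); pairs follow the order of R1."""
    # Index every R2 record by read name (assumption: R2 fits in memory).
    r2_by_name = {pair_key(rec[0]): rec for rec in read_fastq(r2_lines)}
    paired_r1, paired_r2, singles = [], [], []
    for rec in read_fastq(r1_lines):
        mate = r2_by_name.pop(pair_key(rec[0]), None)
        if mate is None:
            singles.append(rec)  # R1 read whose mate is missing
        else:
            paired_r1.append(rec)
            paired_r2.append(mate)
    singles.extend(r2_by_name.values())  # R2 reads whose mate is missing
    return paired_r1, paired_r2, singles
```

Passing two open file handles as r1_lines and r2_lines works because the functions only iterate over lines. The three returned lists correspond to the two ordered paired end files and the single end file.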



What do I do now?

I'm confused about what I should do with the single end reads I got from cutadapt, and also about what I should do with the single reads my script produced when it put my fastq files in matching order by header.

When I move onto the trimming stage, do I ONLY trim my paired end reads and just ignore the single end reads? Or do I trim my paired end reads as well as my single end reads?

When I am looking for the snps of 1 replicate, do I map the paired end reads as well as any other single end reads onto the reference genome?


Edit:

TL;DR

Are these the right steps to take before I start mapping my reads?

1. Cut adapters (gives SE file)

2. Quality Trim (gives SE file)

3. Organize pairs (gives SE file)

So by the end of this whole process I am left with 3 SE files and 2 processed paired end read files, giving a total of 5 files.

Do I need to do any quality trimming to the single end reads or do I just take all 5 of my files and map them to the reference?

Last edited by prs321; 01-13-2014 at 11:45 AM.
01-13-2014, 04:23 PM   #2
mastal
Senior Member
 
Location: uk

Join Date: Mar 2009
Posts: 667

You could try using Trimmomatic, which does both adapter trimming and quality trimming and gives you 4 output files (2 for paired reads, 2 for single reads), where the 2 paired read files are in the same order.
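
As a rough sketch, a paired end Trimmomatic call that produces those 4 files could be assembled like this in Python. The jar path, file names, and output prefix are placeholders, and the step settings (ILLUMINACLIP 2:30:10, SLIDINGWINDOW 4:15, MINLEN 36) are only the commonly quoted example values from the Trimmomatic manual, not a recommendation for this dataset.

```python
def trimmomatic_pe_command(jar, r1, r2, prefix, adapters):
    """Build the argument list for a Trimmomatic paired end (PE) run.

    All paths are placeholders supplied by the caller.
    """
    outputs = [
        f"{prefix}_1P.fastq",  # forward reads whose mate also survived
        f"{prefix}_1U.fastq",  # forward reads whose mate was dropped
        f"{prefix}_2P.fastq",  # reverse reads whose mate also survived
        f"{prefix}_2U.fastq",  # reverse reads whose mate was dropped
    ]
    steps = [
        f"ILLUMINACLIP:{adapters}:2:30:10",  # adapter clipping
        "SLIDINGWINDOW:4:15",                # quality trimming
        "MINLEN:36",                         # drop reads that end up too short
    ]
    return ["java", "-jar", jar, "PE", r1, r2, *outputs, *steps]
```

The returned list can be passed to subprocess.run(cmd, check=True). The two "P" files stay in matching order, so they can go straight to an aligner as a pair.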

What you do with the trimmed data afterwards depends in part on what software you choose to align/assemble the data. For example, some aligners will only use paired reads or single reads, but not a mixture of both, so you would have to run the paired reads and the single reads separately.
01-13-2014, 09:02 PM   #3
relipmoc
Member
 
Location: Los Angeles, CA

Join Date: Jul 2011
Posts: 58
skewer

You may also try skewer, which is dedicated to adapter trimming for paired-end reads. It can be downloaded from http://sourceforge.net/projects/skewer/.
01-14-2014, 05:18 AM   #4
TiborNagy
Senior Member
 
Location: Budapest

Join Date: Mar 2010
Posts: 329

Quote:
Originally Posted by prs321
When I am looking for the snps of 1 replicate, do I map the paired end reads as well as any other single end reads onto the reference genome?
Yes, you need to map reads to the reference genome and do variant calling to find SNPs.
01-14-2014, 12:20 PM   #5
dGho
Member
 
Location: Rochester, NY

Join Date: Jan 2013
Posts: 43

Yes, you will map the paired ends. I have read some forum posts suggesting that you also map the singletons separately (reads whose mate did not pass QC filters), but I personally have never done this. In our lab and our collaborating labs we just work with the pairs and have been getting good results, but I guess it's up to you.

So you will map the cut/trimmed/organized paired end fastqs (you will have them in two separate files, one for each direction) using the mapping software of your choice. Once the paired end fastqs are mapped you will have ONE SAM file that combines the mappings of both reads of each pair. You will continue on to the rest of your pipeline and variant calling using that one resulting SAM file. What you do with the singleton reads is up to you. Good luck!
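
To make that concrete, here is a hedged sketch using bwa mem as the mapper. That choice is an assumption for illustration only, and any aligner that takes two fastq files works the same way. File names are placeholders, and the reference is assumed to have been indexed with bwa index beforehand.

```python
import subprocess


def bwa_mem_commands(reference, r1, r2, singles=None):
    """Map each output SAM name to the bwa mem argument list that produces it."""
    # Both mate files go into one call, which yields one combined SAM.
    jobs = {"paired.sam": ["bwa", "mem", reference, r1, r2]}
    if singles:
        # Optional: singletons are mapped in a separate single end run.
        jobs["singles.sam"] = ["bwa", "mem", reference, singles]
    return jobs


def run_jobs(jobs):
    """Run each command; bwa mem writes SAM to stdout, so redirect it to a file."""
    for sam_path, argv in jobs.items():
        with open(sam_path, "w") as out:
            subprocess.run(argv, stdout=out, check=True)
```

Calling bwa_mem_commands without a singles file gives just the one paired SAM described above. Passing a singles file adds a second, independent run whose output can be used or ignored.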