
**mastal** · 09-16-2013, 04:29 AM

If you use Trimmomatic to trim your paired-end data, it will write the reads
that end up unpaired after trimming to separate files.

USADELLAB.org - Trimmomatic: A flexible read trimming tool for Illumina NGS data

http://www.usadellab.org/cms/?page=trimmomatic

**zhoujiayi** · 09-16-2013, 04:40 AM

So you mean I can use Trimmomatic to trim my paired-end read 1 and read 2 files, and I will then get two trimmed files that are no longer paired, so I can run the bowtie alignment on each of them separately?

By the way, Trimmomatic can also output paired files after trimming. Is the number of reads in the paired output files the same?
Thank you.

**mastal** · 09-16-2013, 05:02 AM

Trimmomatic will give you 4 output files, in FASTQ format:
R1_paired, R1_unpaired, R2_paired, and R2_unpaired.
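
For reference, a paired-end Trimmomatic run that produces these four files could be assembled roughly as below. This is only a sketch: the jar path, the file names, and the trimming steps (`SLIDINGWINDOW:4:15`, `MINLEN:36`) are placeholder assumptions, not values from this thread. The function only builds the argument list; it does not run anything.

```python
def trimmomatic_pe_command(r1, r2, prefix, jar="trimmomatic.jar",
                           steps=("SLIDINGWINDOW:4:15", "MINLEN:36")):
    """Build (but do not run) a Trimmomatic paired-end command line.

    Trimmomatic PE expects the four outputs in this order:
    forward paired, forward unpaired, reverse paired, reverse unpaired.
    The jar path, prefix and trimming steps are illustrative assumptions.
    """
    outputs = [
        f"{prefix}_R1_paired.fastq",
        f"{prefix}_R1_unpaired.fastq",
        f"{prefix}_R2_paired.fastq",
        f"{prefix}_R2_unpaired.fastq",
    ]
    return ["java", "-jar", jar, "PE", r1, r2, *outputs, *steps]


cmd = trimmomatic_pe_command("R1.fastq", "R2.fastq", "sample")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` once the jar path and trimming steps are adjusted to your own setup.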

R1_paired and R2_paired will have the same number of reads,
in the same order, just like the untrimmed Illumina data, except that
any pair in which R1 or R2 was discarded by the trimming process is absent from both files.

Bowtie doesn't accept a mixture of paired and unpaired reads in one run, so you will
have to align R1_paired and R2_paired together as one run, and the unpaired files as a separate run.
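
The two runs might look like the sketch below, which assumes Bowtie 1 command-line syntax (`-1`/`-2` for mates, a comma-separated list for unpaired reads, `-S` for SAM output). The index name `genome_index` and the output file names are hypothetical. As above, the functions only build the argument lists.

```python
def bowtie_paired_command(index, r1_paired, r2_paired, out_sam):
    """Argument list for a paired-end Bowtie 1 run with SAM output."""
    return ["bowtie", "-S", index, "-1", r1_paired, "-2", r2_paired, out_sam]


def bowtie_unpaired_command(index, unpaired_files, out_sam):
    """Argument list for a single-end Bowtie 1 run with SAM output.

    Bowtie 1 takes unpaired input as one comma-separated list of files.
    """
    return ["bowtie", "-S", index, ",".join(unpaired_files), out_sam]


# "genome_index" and the file names are placeholders for illustration.
paired_cmd = bowtie_paired_command(
    "genome_index", "R1_paired.fastq", "R2_paired.fastq", "paired.sam")
unpaired_cmd = bowtie_unpaired_command(
    "genome_index", ["R1_unpaired.fastq", "R2_unpaired.fastq"], "unpaired.sam")

print(" ".join(paired_cmd))
print(" ".join(unpaired_cmd))
```

Passing a single file in `unpaired_files` gives one SAM file per unpaired input instead of a combined one, if you prefer to keep the R1 and R2 orphans apart.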

Hope this makes sense.
Maria

**zhoujiayi** · 09-16-2013, 05:32 AM

Thank you for your quick reply.
By the way, am I right that it is better to align the trimmed paired files when the raw data are paired-end? If so, what is the point of aligning the trimmed unpaired files when the raw data are paired-end?

**mastal** · 09-16-2013, 05:36 AM

It doesn't matter whether your data is single-end or paired-end: it is always better to do QC first, and then trim the reads if the QC shows low-quality regions or adapter sequences.

**zhoujiayi** · 09-16-2013, 05:56 AM

Sorry for my poor English. I guess I didn't make my point clear.
I know it is always better to do QC first.
For example:
I have two fastq files (R1.fastq, R2.fastq), which are paired-end data.
After trimming with Trimmomatic, I get R1_trimmed_paired.fastq, R1_trimmed_unpaired.fastq, R2_trimmed_paired.fastq and R2_trimmed_unpaired.fastq.
Then,
1. I can run bowtie with R1_trimmed_paired.fastq and R2_trimmed_paired.fastq as paired-end data to get one alignment file, say R1R2.sam.
2. Or I can run bowtie with R1_trimmed_unpaired.fastq and R2_trimmed_unpaired.fastq separately to get two alignment files, say R1.sam and R2.sam.

As I understand it, step 1 makes sense because we are processing paired-end files. But why would we do step 2? Step 2 seems to treat paired-end files as single-end files. If we can do that, why not just treat all the files as single-end and process them that way?

**dpryan** · 09-16-2013, 06:26 AM

Doing 1. makes total sense and does what you describe. Doing 2. may or may not be worthwhile (in my experience, at least, aligning an R2_trimmed_unpaired file is usually not worthwhile). The reads in the unpaired files are not the same as those in the paired files. In brief, if one read of a pair has terrible quality, is mostly adapter, or has some other problem that results in it being trimmed too short to use, then its mate is written to the appropriate unpaired file. These, then, are single-end reads, because their mates aren't useful for anything. In general, paired-end reads will give you somewhat more confident alignments (they can also more easily be used for detecting structural variation and other things, if that's your goal).
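
The routing described here can be illustrated with a toy model. It is a simplified sketch, not Trimmomatic's actual code: reads are plain strings that have already been trimmed, and a read counts as dropped when it is shorter than an assumed `min_len` of 36.

```python
def split_after_trimming(pairs, min_len=36):
    """Sort trimmed read pairs into four Trimmomatic-style outputs.

    `pairs` is a list of (name, r1_seq, r2_seq) tuples holding the
    sequences *after* trimming. Toy model; min_len is an assumed cutoff.
    """
    out = {"R1_paired": [], "R2_paired": [],
           "R1_unpaired": [], "R2_unpaired": []}
    for name, r1, r2 in pairs:
        keep1, keep2 = len(r1) >= min_len, len(r2) >= min_len
        if keep1 and keep2:
            # Both mates survive: they stay in sync in the paired outputs.
            out["R1_paired"].append((name, r1))
            out["R2_paired"].append((name, r2))
        elif keep1:
            # R2 was dropped, so R1 becomes a single-end read.
            out["R1_unpaired"].append((name, r1))
        elif keep2:
            # R1 was dropped, so R2 becomes a single-end read.
            out["R2_unpaired"].append((name, r2))
        # If neither mate survives, the whole pair is discarded.
    return out


pairs = [
    ("read1", "A" * 50, "C" * 50),  # both mates fine
    ("read2", "A" * 50, "C" * 10),  # R2 trimmed too short
    ("read3", "A" * 5, "C" * 40),   # R1 trimmed too short
    ("read4", "A" * 3, "C" * 2),    # both too short
]
result = split_after_trimming(pairs)
for label, reads in result.items():
    print(label, [name for name, _ in reads])
```

Each read ends up in at most one output, which is why the unpaired files hold different reads from the paired files rather than a second copy of them.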

**mastal** · 09-16-2013, 06:27 AM

Because there are advantages to using paired-end reads.

When you are doing alignment or assembly, it is easier to map the reads correctly if you know that R2 should map within so many bases from R1.
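
A toy illustration of that constraint: suppose R1 matches several places in the genome equally well, while R2 matches only one. Keeping only the placements where the mates lie close together resolves the ambiguity. The positions and the `max_dist` of 500 bases below are made-up values, and strand and orientation are ignored for simplicity.

```python
def consistent_placements(r1_hits, r2_hits, max_dist=500):
    """Return (r1_pos, r2_pos) combinations at most max_dist bases apart.

    r1_hits and r2_hits are candidate start positions on one chromosome.
    Toy model only: real aligners also check strand and orientation.
    """
    return [(p1, p2)
            for p1 in r1_hits
            for p2 in r2_hits
            if abs(p2 - p1) <= max_dist]


r1_hits = [1_000, 250_000, 900_000]  # R1 alone is ambiguous (3 candidates)
r2_hits = [250_300]                  # R2 maps uniquely
placements = consistent_placements(r1_hits, r2_hits)
print(placements)  # [(250000, 250300)]
```

Aligned as a single-end read, R1 would have three equally good locations; aligned as a pair, only one of them remains.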

**arcolombo698** · 11-29-2013, 05:31 PM

Memory Space issues and Unpaired Reads

Hello.

I finished trimming my data and now have both paired-end reads and unpaired reads.

I have limited disk space and want to delete the unpaired reads. To be sure I do not need the unpaired data: if I run FastQC on the trimmed paired data and it shows that the trimmed paired reads have good quality, is that enough to justify deleting the unpaired data?

thank you

**mastal** · 11-30-2013, 10:00 AM

I guess it depends how many of your reads are paired-end and how many are single-end after trimming.

I would also run FastQC on the single-end reads, to see how the quality compares with that of the trimmed paired-end reads. Then decide whether you want to delete them or not.
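
A quick way to see how many reads ended up single-end is to count the records in each output file. The sketch below assumes standard four-line FASTQ records (no wrapped sequence lines); the file names and the tiny demo data are made up for illustration.

```python
import gzip
import os
import tempfile


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming 4 lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return sum(1 for _ in handle) // 4


def unpaired_fraction(n_pairs, n_unpaired_r1, n_unpaired_r2):
    """Fraction of surviving fragments that lost their mate."""
    total = n_pairs + n_unpaired_r1 + n_unpaired_r2
    return (n_unpaired_r1 + n_unpaired_r2) / total if total else 0.0


# Tiny self-contained demo with made-up reads.
record = "@read\nACGT\n+\nIIII\n"
counts = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, n in [("R1_paired", 8), ("R1_unpaired", 1), ("R2_unpaired", 1)]:
        path = os.path.join(tmp, name + ".fastq")
        with open(path, "w") as fh:
            fh.write(record * n)
        counts[name] = count_fastq_reads(path)

fraction = unpaired_fraction(
    counts["R1_paired"], counts["R1_unpaired"], counts["R2_unpaired"])
print(counts, fraction)
```

Only the R1 paired file needs counting for the pairs, since R1_paired and R2_paired hold the same number of reads. If the fraction is small, dropping the unpaired files costs little data.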

**arcolombo698** · 11-30-2013, 01:40 PM

Deleting the Unpaired Reads

Hello. Thank you for your reply.

I may not have time to compare the trimmed paired and trimmed unpaired files for each sample. I have too many.

So if my trimmed paired data passes FastQC, it would make sense to use only the paired-end data. Comparing is not time efficient.

Especially since it was written above:

"Doing 1. makes total sense and does what you describe. Doing 2. may or may not be worthwhile (in my experience, at least, aligning an R2_trimmed_unpaired file is usually not worthwhile). "

Thank you.


Trim the paired-end data
