
**Brian Bushnell** · 05-01-2014, 12:36 PM

Originally posted by leejimmy93

Hi,

I have 100 bp paired-end Illumina sequencing data. I trimmed it separately with the CLC genomics tool and with Trimmomatic, but FastQC still shows warnings for the trimmed data, especially the data trimmed by CLC. In addition, with the original Trimmomatic parameters, only a small fraction of the raw reads survived trimming: around 30% as paired and 30% as unpaired.

I understand that FastQC is used to check how normal the sequencing data are, and one reason my data may not look normal is that it is RNA. But comparing the CLC-trimmed and Trimmomatic-trimmed data, I found that the Trimmomatic-trimmed data's per-base GC content and per-base sequence content are reported as normal by FastQC, while the CLC-trimmed data does poorly on those two criteria.

So can anyone here give me any suggestions or advice?
Does RNA data really need to pass all the criteria in FastQC?
Thanks in advance!

Apparently RNA-seq data often gets low grades from fastqc because of non-random primers; please look at this thread, starting at the linked post (#257).

That said, if the problem actually is adapters not being removed, then I suggest you use BBDuk. As the link indicates, in my testing, it does a much better job of removing adapters than cutadapt or trimmomatic.

There's also the possibility that you are using the wrong adapter sequence for trimming, so double-check that.
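One rough way to double-check the adapter is to count how many reads contain the start of a candidate adapter sequence. The sketch below assumes a plain, uncompressed FASTQ file with 4 lines per record. The function name `count_adapter_reads` and the file name are hypothetical, and `AGATCGGAAGAGC` (the leading bases shared by the common Illumina TruSeq adapters) is only an example; substitute the sequence from your own library prep.

```shell
# Count reads whose sequence line contains a candidate adapter prefix.
# Assumes plain (uncompressed) FASTQ with exactly 4 lines per record.
count_adapter_reads() {
    adapter="$1"
    fastq="$2"
    # NR % 4 == 2 selects only the sequence line of each record,
    # so quality strings and headers are never searched.
    awk -v a="$adapter" \
        'NR % 4 == 2 && index($0, a) > 0 { n++ } END { print n + 0 }' "$fastq"
}

# Example call (hypothetical file name):
# count_adapter_reads AGATCGGAAGAGC reads.fq
```

If the count is close to zero for the sequence you have been trimming with but clearly higher for another candidate, the wrong adapter is probably being used.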

P.S. Also, be sure to filter out primer-dimers and other common Illumina artifacts, which can bias base-composition ratios.

**Oyster_lab** · 06-24-2014, 06:09 PM

Hi!
Following up on this question, should I remove adapters from both strands?
I am using CLC for read trimming, and have paired end 100 bp Illumina reads.

**Brian Bushnell** · 06-24-2014, 07:03 PM

Oyster,

What can you tell me about your data? For example, the insert-size distribution and quality profile, or read length, platform, coverage, etc.

If you download the BBTools package, the insert-size distribution can be obtained like this (assuming interleaved reads):

(If you have a reference; "reference.fa" is a placeholder for your own file)
bbmap.sh in=reads.fq ref=reference.fa ihist=ihist.txt

(If you don't have a reference and the insert size is short)
bbmerge.sh in=reads.fq hist=hist.txt

With non-interleaved reads, you have to specify "in1" and "in2".
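As a small illustration of the difference, the helper below prints the input arguments in either form: one file is treated as interleaved (`in=`), two files as separate read 1 and read 2 (`in1=`, `in2=`). The function name `bbtools_inputs` and the file names are hypothetical, and the last line is only a dry run that prints the command rather than running it.

```shell
# Print the input arguments for a BBTools command:
#   one file  -> interleaved reads      (in=)
#   two files -> separate R1 / R2 files (in1= in2=)
bbtools_inputs() {
    if [ "$#" -eq 1 ]; then
        printf 'in=%s' "$1"
    elif [ "$#" -eq 2 ]; then
        printf 'in1=%s in2=%s' "$1" "$2"
    else
        echo "usage: bbtools_inputs reads.fq | R1.fq R2.fq" >&2
        return 1
    fi
}

# Dry run: show the BBMerge command for two separate files.
echo "bbmerge.sh $(bbtools_inputs reads_R1.fq reads_R2.fq) hist=hist.txt"
```

Copy the printed line into the terminal once the file names are right.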

With normal fragment libraries, read1 and read2 will have the adapter at the same place. You can tell BBDuk to remove these with the "tpe" (trim pairs evenly) and "tbo" (trim by overlap) flags, even if you don't know the adapter sequence, but it certainly is much better if you do know it. My suggested command line:

bbduk.sh in=reads.fq out=clean.fq ktrim=r k=25 mink=12 tbo tpe hdist=1 ref=truseq.fa
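To see what trimming actually did, it can help to compare the read-length distribution before and after. This is a minimal sketch, assuming plain, uncompressed 4-line FASTQ files; the function name `read_length_hist` is hypothetical and not part of BBTools.

```shell
# Print "length count" pairs for the reads in a plain 4-line FASTQ file,
# sorted by read length.
read_length_hist() {
    awk 'NR % 4 == 2 { n[length($0)]++ } END { for (l in n) print l, n[l] }' "$1" \
        | sort -n
}

# Example calls (file names as in the command above):
# read_length_hist reads.fq
# read_length_hist clean.fq
```

Untrimmed 100 bp data should show a single length; after adapter trimming, shorter lengths appear for the reads that contained adapter sequence.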

-Brian


RNA seq trimming
