SEQanswers > General

LJC 05-21-2017 02:59 PM

How do I trim multiple adapters from my RNAseq reads?
Hi all,

I am using Galaxy and I want to remove the universal adapters as well as the index adapters in each data file of my RNAseq data. I know you can specify exactly which sequence to remove in Trim Galore! by pasting in the adapter sequence to be trimmed off. However, I would like to specify multiple sequences at the same time and I can't see an option for this in Trim Galore!

I can see that you are able to trim multiple sequences in Trimmomatic by uploading a Fasta of adapters to clip. I pasted the universal adapter sequence and all of the possible index adapter sequences into Notepad, each separated by a line, saved this file as a .txt file, uploaded it to Galaxy and changed its Datatype to Fasta. I then used this file in Trimmomatic under 'Fasta of adapters to clip' but it didn't work (i.e. the adapters weren't trimmed off). I also tried putting a '>' in front of each of the sequences in Notepad, as I read online somewhere to do this. However, this also didn't work. I was wondering if anyone could tell me where I was going wrong?

Thanks so much!
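
For anyone hitting the same problem: a FASTA file needs a header line starting with '>' that carries a name, with the sequence on the line after it, so a '>' placed directly in front of the sequence on the same line is not enough. Below is a minimal Python sketch that writes such a file. The only real sequence in it is the shared adapter start 'AGATCGGAAGAGC' quoted in this thread; the index entries and all names are made-up placeholders, and this is not a statement about what Trimmomatic needs beyond valid FASTA.

```python
def adapters_to_fasta(adapters):
    """Return FASTA text: a '>name' header line followed by one sequence line."""
    lines = []
    for name, seq in adapters.items():
        lines.append(">" + name)           # header line: '>' plus a name
        lines.append(seq.strip().upper())  # sequence on its own line
    return "\n".join(lines) + "\n"


# 'AGATCGGAAGAGC' is the adapter start quoted in this thread.
# The two index entries are hypothetical placeholders, NOT real Illumina indexes.
adapters = {
    "universal_start": "AGATCGGAAGAGC",
    "placeholder_index_1": "ACGTACGTAC",
    "placeholder_index_2": "TTGGCCAATT",
}

fasta_text = adapters_to_fasta(adapters)
print(fasta_text)
```

The text could then be saved with a .fasta extension and uploaded in place of the Notepad file.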

fkrueger 05-22-2017 05:20 AM

In the vast majority of cases when people want to remove multiple different adapters, it turns out that they do not actually want to do that. If you ran standard sequencing (e.g. TruSeq, Sanger iTag etc.), all of the sequences share the first 13 bp of the standard adapter sequence 'AGATCGGAAGAGC', and only diverge after this point. This is also true for the different indexes used in the adapters. Thus, running Trim Galore in its default mode is just the right thing to do, and there is no need to specify a long list of all the different index options. Good luck! Felix
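
To illustrate why the shared 13 bp start is enough, here is a toy Python sketch of read-through trimming: it cuts a read at the first occurrence of 'AGATCGGAAGAGC', so whatever index follows is removed along with it. This is only an exact-match illustration with a hypothetical function name and a hypothetical `min_overlap` parameter; it is not how Trim Galore is implemented, and a real trimmer also tolerates sequencing errors and considers base qualities.

```python
ADAPTER_START = "AGATCGGAAGAGC"  # shared adapter start quoted in this thread


def trim_read_through(read, adapter=ADAPTER_START, min_overlap=3):
    """Cut the read where the adapter begins (exact matching only)."""
    # Case 1: the whole 13 bp adapter start occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends part-way into the adapter, so only a prefix of
    # the adapter is visible at the 3' end. Try the longest prefix first.
    for k in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # No adapter found: leave the read untouched.
    return read


# Insert followed by the adapter start and then some index-specific bases.
full = trim_read_through("ACGTTGCAAGATCGGAAGAGCACACGTCTG")
# Insert followed by only the first 5 bases of the adapter.
partial = trim_read_through("ACGTTGCAAGATC")
print(full, partial)
```

Because everything from the adapter start onwards is discarded, the bases after the 13 bp (including any index) never need to be listed.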

LJC 05-22-2017 02:38 PM

Thanks for your reply Felix.

You're right, it is standard Illumina RNAseq. When I run the default trimming, as you suggest, my FastQC Adapter plot flatlines, which is great. However, I then get overrepresented sequences that match specific index adapters, so I'm not entirely sure what to do. That is why I thought I should specify all of Illumina's index adapters to trim. I did actually manage to work out how to do this with a fasta file, and I also included the 'universal' sequence in the fasta file. After trimming, my FastQC plots showed that there were no more overrepresented sequences, but my Adapter plot then rose at the end, showing that there is still some 'Universal adapter' contamination. I can't seem to figure out how to get rid of the 'universal' adapter contamination and the overrepresented sequences that match index adapters, all at the same time!

Do you think I can trim twice? First to get rid of the universal adapter contamination and then again specifying all of the index adapter sequences?


fkrueger 05-22-2017 02:45 PM

The kind of adapter contamination you want to get rid of is the read-through kind, where you get a piece of fragment that then continues to read into the adapter. These are all taken care of by Trim Galore.

What you sometimes see flagging up as overrepresented sequence is probably something like adapter dimers or concatamers. These are contaminants, but since they are purely adapter sequence they won't align to the genome anyway and are hence taken care of in the subsequent alignment step. If you look at the sequence of those contaminants you will probably notice that they don't look like the sequence I linked in the thread above; often they are simply lacking the A at the start (from the A-tailing process). In other words, I would recommend you run the adapter trimming as outlined already, and don't bother about additional contaminants as they won't align anyways.

All the best, Felix
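
A quick way to check whether an overrepresented sequence fits this description is to test whether it begins directly with the adapter start, with or without the leading A. The Python sketch below does only that; the function name is hypothetical, matching is exact, and it assumes dimer reads start with adapter sequence from the first base.

```python
ADAPTER_START = "AGATCGGAAGAGC"  # shared adapter start quoted in this thread


def looks_like_adapter_only(read, adapter=ADAPTER_START):
    """True if the read starts with the adapter, with or without its first A."""
    return read.startswith(adapter) or read.startswith(adapter[1:])


# Example sequences made up for illustration.
candidates = [
    "GATCGGAAGAGCACACGTC",   # adapter start lacking the leading A
    "AGATCGGAAGAGCAC",       # adapter start including the A
    "ACGTTGCAGATCGG",        # ordinary insert, adapter only appears later
]
flags = [looks_like_adapter_only(seq) for seq in candidates]
print(flags)
```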

LJC 05-22-2017 02:49 PM

Ok, will do. Thanks so much for your help.

nucacidhunter 05-22-2017 04:10 PM

I think your library has had more than the usual amount of adapter dimers, which are not removed with one final clean-up after PCR. The reasons could be:

1- Low quality of input RNA
2- Low quantity of input
3- Sub-optimal library prep

If you look at the library profiles you should see a small peak around 150-160 bp representing dimers. The number of over-represented adapters should correlate with the molar quantity of the 150 bp peak in each library. As fkrueger has mentioned, they will not align to the genome.

LJC 05-22-2017 04:15 PM

Thanks nucacidhunter. There was low quantity of RNA for this sample.
However, I am not aligning these sequences to a genome as I will be doing de novo transcriptome assembly. Does this still mean I can ignore these overrepresented adapters? Or should I try to remove them?


nucacidhunter 05-22-2017 04:44 PM

If you have a reference genome you can use only aligned reads for assembly. Otherwise, you should be able to remove adapters after assembly as they should assemble together.
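
One possible way to do the post-assembly clean-up is sketched below in Python: drop any assembled contig that contains the adapter start or its reverse complement. The function names and example contigs are hypothetical, matching is exact, and discarding whole contigs (rather than clipping them) is an assumption made for simplicity, not something prescribed in this thread.

```python
ADAPTER_START = "AGATCGGAAGAGC"  # shared adapter start quoted in this thread


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGTN", "TGCAN"))[::-1]


def drop_adapter_contigs(contigs, adapter=ADAPTER_START):
    """Keep only contigs that contain the adapter on neither strand."""
    rc = reverse_complement(adapter)
    return {
        name: seq
        for name, seq in contigs.items()
        if adapter not in seq and rc not in seq
    }


# Made-up contigs for illustration.
contigs = {
    "contig_1": "ATGGCCATTGTAATGGGCCGC",          # no adapter
    "contig_2": "GATCGGAAGAGCAGATCGGAAGAGCACAC",  # adapter on forward strand
    "contig_3": "TTTGCTCTTCCGATCTAAA",            # adapter on reverse strand
}
kept = drop_adapter_contigs(contigs)
print(sorted(kept))
```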
