Old 07-29-2016, 02:29 AM   #1
Junior Member
Location: Barcelona

Join Date: Mar 2016
Posts: 4
Trimming adapters with Cutadapt

Hi everyone, I'm having trouble working out which adapter sequence I should give cutadapt or Trimmomatic to trim adapters from my FASTQ files.

I have a set of FASTQ files, each containing 51 bp reads that begin with an N followed by the rest of the read. After demultiplexing, I also know the index sequence for each FASTQ file, as well as two sequences for the primers used. For instance, these are some headers from one of my FASTQ files:

@700470R:449:HVHH7BCXX:2:1101:1406:1948 1:N:0:GTGAAA
@700470R:449:HVHH7BCXX:2:1101:1814:1992 1:N:0:GTGAAA
@700470R:449:HVHH7BCXX:2:1101:2184:1885 1:N:0:GTGAAA
The index sequence, as given in the header, is GTGAAA. I also have the sequence of the SR primer, which is:


and the Index primer, which is:


Substituting the NNNNNN with the index sequence from the header of the corresponding FASTQ file should give the barcoded adapter used for sequencing, if I'm not mistaken.
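
For illustration only, the substitution could be sketched in shell like this. The helper names index_from_header and barcoded_adapter are hypothetical, and the template ACGTNNNNNNTGCA is a made-up placeholder, not the real Index primer. The sketch assumes the index is the last colon-separated field of the header, as in the headers quoted above. It does not address whether the index appears in the reads in this orientation or as its reverse complement, which depends on the library kit.

```shell
# Hypothetical helpers; POSIX sh, using only printf and sed.

# Print the index: the text after the last ':' in a FASTQ header line.
index_from_header() {
    printf '%s\n' "${1##*:}"
}

# Replace the first NNNNNN in a primer template ($1) with an index ($2).
barcoded_adapter() {
    printf '%s\n' "$1" | sed "s/NNNNNN/$2/"
}

header='@700470R:449:HVHH7BCXX:2:1101:1406:1948 1:N:0:GTGAAA'
idx=$(index_from_header "$header")

# ACGTNNNNNNTGCA is a placeholder template, NOT the real primer sequence.
barcoded_adapter 'ACGTNNNNNNTGCA' "$idx"
```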

This is where I start getting lost. After running FastQC, the list of overrepresented sequences contains a number of sequences matching the Illumina Multiplexing PCR primer, as if several different adapters were present within the same FASTQ file.

So, here is my question:

What sequence should I give cutadapt to trim in this case? Should I give more than one? In my opinion, I should use the Index primer sequence with the NNNNNN replaced by the index sequence (barcode) of each FASTQ file, but I'm not sure whether this is correct, or whether I should include more sequences. I'm also not sure which parameters to run cutadapt with. Should I use both -a and -g so that the adapter is trimmed from both ends, or would -a alone work? I'm also wondering about the error tolerance (-e) when matching adapters (I don't know what the default is if nothing is specified), and whether to use wildcards (NNNNNN) as a universal adapter or to build a list with the barcode of each sample's FASTQ file. Would quality trimming be useful, given that the average base call quality in each read is very high (over 30)? And should I use the --trim-n option to trim possible flanking Ns from my reads?

As you all see... quite lost I am...
Elfangor is offline   Reply With Quote
Old 08-03-2016, 09:05 AM   #2
Location: Russia

Join Date: Jul 2014
Posts: 18


In my experience with Trimmomatic, you can use the information about your platform to remove universal adapters from your reads; there is no need to know the exact index sequence.

You can find these universal adapters as part of the Trimmomatic package, or download them from here. Note that the adapter file to specify in your trimming procedure depends on the combination of platform and type of sequencing reads (paired-end or single-end). You can find Trimmomatic usage info here. It's clearly explained, but write back here if you still have issues.
dovah is offline   Reply With Quote
Old 08-03-2016, 11:22 AM   #3
Location: Virginia

Join Date: May 2016
Posts: 80

Also consider running a quality analysis on your FASTQ files before doing any trimming or proceeding in the pipeline. Use FastQC for the quality analysis, then use Trim Galore to trim adapters from the reads and to improve their general quality.
ronaldrcutler is offline   Reply With Quote
Old 08-10-2016, 11:26 AM   #4
Senior Member
Location: Bioo Scientific, Austin, TX, USA

Join Date: Jun 2012
Posts: 119

I think this is what you need:

cutadapt -a AGATCGGAAGAG -o YOUR_FILE.trim1.fq --minimum-length 15 YOUR_FILE.fastq.gz

You don't need to put in the index sequence, as cutadapt removes the adapter and anything 3' of it, unless you specify otherwise. The --minimum-length option throws out any reads shorter than the specified value after trimming. I think the default allowed error rate is 0.1, which should be fine.
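
To make the "removes the adapter and anything 3' of it" behaviour concrete, here is a rough sketch using awk. It is not how cutadapt works internally: it only handles exact, full-length adapter matches, with no error tolerance and no partial matches at the read end, both of which cutadapt does handle. The function name trim_3prime and the two demo reads are made up for the illustration.

```shell
# Simplified illustration only: exact adapter matches, no error tolerance.
# $1 = adapter, $2 = minimum length; reads 4-line FASTQ records on stdin.
trim_3prime() {
    awk -v adapter="$1" -v minlen="$2" '
        NR % 4 == 1 { header = $0 }
        NR % 4 == 2 {
            seq = $0
            pos = index(seq, adapter)          # 1-based position, 0 if absent
            if (pos > 0) seq = substr(seq, 1, pos - 1)
        }
        NR % 4 == 3 { plus = $0 }
        NR % 4 == 0 {
            qual = substr($0, 1, length(seq))  # keep qualities in step with bases
            if (length(seq) >= minlen + 0)     # drop reads that ended up too short
                print header "\n" seq "\n" plus "\n" qual
        }
    '
}

# Two made-up reads: r1 keeps 16 bases, r2 keeps only 4 and is discarded.
printf '%s\n' \
    '@r1' 'ACGTACGTACGTACGTAGATCGGAAGAGCC' '+' 'IIIIIIIIIIIIIIIIIIIIIIIIIIIIII' \
    '@r2' 'ACGTAGATCGGAAGAG' '+' 'IIIIIIIIIIIIIIII' |
    trim_3prime AGATCGGAAGAG 15
```

Because everything from the adapter onwards is cut, the index that follows the adapter in a read-through goes with it, which is why the index does not need to be part of the -a sequence.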

It does look like you can use the --trim-n option to remove the first N.
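
As a rough picture of what trimming flanking Ns means for reads like yours that start with an N, here is a small awk sketch. It is only an illustration of the idea, not cutadapt's code; the function name trim_flanking_n and the demo read are made up.

```shell
# Illustration only: strip leading and trailing N bases from each read
# and cut the quality string to match. Reads 4-line FASTQ records on stdin.
trim_flanking_n() {
    awk '
        NR % 4 == 2 {
            lead = match($0, /^N+/) ? RLENGTH : 0      # number of leading Ns
            rest = substr($0, lead + 1)
            trail = match(rest, /N+$/) ? RLENGTH : 0   # number of trailing Ns
            keep = length(rest) - trail
            print substr(rest, 1, keep)
            next
        }
        NR % 4 == 0 { print substr($0, lead + 1, keep); next }
        { print }
    '
}

# Made-up read with one leading N and two trailing Ns.
printf '%s\n' '@r1' 'NACGTACGTNN' '+' '#IIIIIIII##' | trim_flanking_n
```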

It probably isn't necessary to quality trim, although you may want to quality filter before the adapter trimming. There is also probably no need for the -g option, unless this was a particular kind of library where you expect to see adapter sequence at the 5' end of the read.
kerplunk412 is offline   Reply With Quote
