SEQanswers

Old 08-19-2013, 12:55 AM   #21
gerald2545
Member
 
Location: Toulouse

Join Date: Nov 2008
Posts: 21
Default

Quote:
Originally Posted by gerald2545 View Post
It seems that 0 reads were removed from R1 due to length and 0 reads were removed from R2 due to length, but 2697629 pairs were removed because at least one read was shorter than the length cutoff.
Is there something that I don't understand correctly?
We have just installed the latest version but haven't tried it yet (you don't mention this in your release notes, so I don't think the behaviour will be different).
Just tested with version 0.3.1; we see the same behaviour.

Best regards

Gérald
Old 08-20-2013, 05:26 AM   #22
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

Hi Gerald,

Trim Galore does not remove read 1 or read 2 individually if it became too short during trimming; reads are only removed after both have been trimmed, in a validation step that discards the whole pair. This is done to ensure that the two files do not get out of sync because of trimming.
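
The pair-wise validation described above can be sketched roughly as follows. This is only an illustration of the idea, not Trim Galore's actual code (which is Perl); the function name `validate_pairs` and the tuple-based record layout are assumptions made for the example.

```python
from typing import Iterable, Iterator, Tuple

# A FASTQ record reduced to (header, sequence, quality) for illustration.
Record = Tuple[str, str, str]


def validate_pairs(
    reads_1: Iterable[Record],
    reads_2: Iterable[Record],
    min_length: int = 20,
) -> Iterator[Tuple[Record, Record]]:
    """Yield only pairs in which both trimmed reads reach min_length.

    Records are dropped as whole pairs, never individually, so the two
    output streams always hold the same number of records in the same order.
    """
    for rec_1, rec_2 in zip(reads_1, reads_2):
        if len(rec_1[1]) >= min_length and len(rec_2[1]) >= min_length:
            yield rec_1, rec_2


if __name__ == "__main__":
    r1 = [("@a/1", "A" * 30, "I" * 30), ("@b/1", "A" * 5, "I" * 5)]
    r2 = [("@a/2", "C" * 25, "I" * 25), ("@b/2", "C" * 40, "I" * 40)]
    kept = list(validate_pairs(r1, r2))
    print(len(kept), "pair(s) kept")
```

In this toy input the second pair is dropped as a unit because its read 1 is only 5 bp long, even though its read 2 is long enough. That is why a per-file count of removed reads can be 0 while the pair count is not.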

Considering the output of Cutadapt I would guess that the 2 values are quite similar in your case, but as you said this is the output straight from Cutadapt. Apologies for my slow response, but I am currently on holiday.

Best, Felix
Old 08-22-2013, 11:46 PM   #23
gerald2545
Member
 
Location: Toulouse

Join Date: Nov 2008
Posts: 21
Default

Hi Felix,
no worries about the late reply, I knew that you were on holiday.

Thank you for the information. Maybe for paired-end runs, the following sentence could be omitted from the report:
"Sequences removed because they became shorter than the length cutoff of 20 bp: 0 (0.0%)"
and only the information for the pairs reported:
"Number of sequence pairs removed because at least one read was shorter than the length cutoff (20 bp): 2697629 (10.10%)"
?

Enjoy your holiday

Gérald

PS: sorry for my late reply too, I didn't receive the notification email as my address was wrong

Last edited by gerald2545; 08-23-2013 at 04:03 AM.
Old 09-03-2013, 01:17 AM   #24
ringprince
Junior Member
 
Location: Germany

Join Date: Sep 2013
Posts: 1
Default problems with --clip_R1

Hi all,

I am having issues when using the --clip_R1 option.

Code:
trim_galore --clip_R1 3 test2.fastq.gz
Gives me a lot of
Code:
substr outside of string at ../../programs/trim_galore/trim_galore line 503, <TRIM> line 43696.
substr outside of string at ../../programs/trim_galore/trim_galore line 504, <TRIM> line 43696.
Use of uninitialized value in numeric lt (<) at ../../programs/trim_galore/trim_galore line 507, <TRIM> line 43696.
Without the clipping it works fine.

I am on Linux (amd64) and I use version 0.3.1 (Last update: 18 07 2013)

Attached is a sample file that produces one of these errors for me.

Regards
Attached Files
File Type: gz test2.fastq.gz (448.3 KB, 2 views)
Old 09-03-2013, 03:07 AM   #25
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

Thanks for reporting this. These warnings occurred if the sequence had been adapter- or quality-trimmed below the clipping threshold. I have now added an additional check to prevent this from happening.

A new version of Trim Galore is now available from its project page (http://www.bioinformatics.babraham.a...s/trim_galore/), which also fixes one additional issue:

- Specifying --clip_R1 or --clip_R2 will no longer attempt to clip sequences that have been adapter- or quality-trimmed below the clipping threshold
- Specifying an output directory with --rrbs mode should now correctly create temporary files
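
For readers wondering what such a check looks like, here is a minimal sketch of guarded 5' clipping. It is not the actual Trim Galore fix (that code is Perl and is not shown in this thread); the function name `clip_5prime` and the exact comparison used in the guard are assumptions for illustration.

```python
from typing import Tuple


def clip_5prime(seq: str, qual: str, clip: int) -> Tuple[str, str]:
    """Remove `clip` bases from the 5' end, but only if the read is long enough.

    Reads already trimmed to fewer than `clip` bases are returned unchanged.
    The exact comparison is an assumption about the check described above.
    """
    if clip <= 0 or len(seq) < clip:
        return seq, qual
    return seq[clip:], qual[clip:]


if __name__ == "__main__":
    print(clip_5prime("ACGTACGT", "IIIIIIII", 3))  # long enough, gets clipped
    print(clip_5prime("AC", "II", 3))              # too short, left alone
```

Without the length guard, a Perl `substr` call with an offset past the end of the string produces exactly the "substr outside of string" warning reported above.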
Old 09-10-2013, 04:48 AM   #26
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

A quick notice that I have just put out a new version of Trim Galore (v0.3.3) that fixes a bug I had introduced accidentally last week where single-end trimming would add an empty line into the trimmed sequence output.
Old 11-07-2013, 10:08 PM   #27
optimuscoprime
Junior Member
 
Location: Earth

Join Date: May 2013
Posts: 1
Default

If you like Trim Galore!, you might also like:

https://github.com/optimuscoprime/autoadapt

It uses FastQC to detect adaptors and primers, and then cuts them with cutadapt (well, in parallel using several cutadapts)
Old 02-03-2014, 07:45 AM   #28
mvijayen
Member
 
Location: midwest

Join Date: Jun 2013
Posts: 15
Default

Hi all,

I am running into an error with trim_galore. Would anyone have any idea on what I may be doing wrong?
Command line: ./trim_galore --paired /Users/mvijayen/fastq/sample_1.fastq /Users/mvijayen/fastq/sample_2.fastq

No quality encoding type selected. Assuming that the data provided uses Sanger encoded Phred scores (default)

Writing report to 'sample_1.fastq_trimming_report.txt'

SUMMARISING RUN PARAMETERS
==========================
Input filename: /Users/mvijayen/fastq/sample_1.fastq
Quality Phred score cutoff: 20
Quality encoding type selected: ASCII+33
Adapter sequence: 'AGATCGGAAGAGC'
Maximum trimming error rate: 0.1 (default)
Minimum required adapter overlap (stringency): 1 bp
Minimum required sequence length for both reads before a sequence pair gets removed: 20 bp

Writing final adapter and quality trimmed output to sample_1_trimmed.fq


>>> Now performing quality (cutoff 20) and adapter trimming in a single pass for the adapter sequence: 'AGATCGGAAGAGC' from file /Users/mvijayen/fastq/sample_1.fastq <<<
open3: exec of cutadapt -f fastq -e 0.1 -q 20 -O 1 -a AGATCGGAAGAGC /Users/mvijayen/fastq/sample_1.fastq failed at ./trim_galore line 471

RUN STATISTICS FOR INPUT FILE: /Users/mvijayen/fastq/sample_1.fastq
=============================================
0 sequences processed in total
Illegal division by zero at ./trim_galore line 565.


I am using trim_galore version 0.3.3 and cutadapt seems to be working just fine when I check with ./cutadapt -h. Thanks!
Old 02-03-2014, 07:58 AM   #29
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

If Cutadapt is not in the PATH you need to give Trim Galore the absolute path to where it can be found. Since you are saying that ./cutadapt works fine you seem to have installed it in that current directory. If you type 'pwd' and copy that path into the line near the top of Trim Galore that says $path_to_cutadapt = '' it should all work fine (just edit it with any text editor).
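
A quick way to find out which absolute path to paste in is to ask the system where the executable lives. The sketch below uses only the Python standard library; the helper name `find_executable` and the `extra_dir` argument are made up for this example and are not part of Trim Galore or Cutadapt.

```python
import os
import shutil
from typing import Optional


def find_executable(name: str = "cutadapt", extra_dir: Optional[str] = None) -> Optional[str]:
    """Return the absolute path of `name`, or None if it cannot be found.

    Looks on PATH first, then in `extra_dir` (for example the directory
    where the tool was unpacked).
    """
    found = shutil.which(name)
    if found is None and extra_dir is not None:
        found = shutil.which(name, path=extra_dir)
    return os.path.abspath(found) if found else None


if __name__ == "__main__":
    # Falls back to the current directory, as in the ./cutadapt case above.
    print(find_executable("cutadapt", extra_dir=os.getcwd()))
```

If this prints None, the executable is neither on the PATH nor in the directory you passed, or it is not marked as executable.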

Cheers,
Felix
Old 02-03-2014, 08:05 AM   #30
mvijayen
Member
 
Location: midwest

Join Date: Jun 2013
Posts: 15
Default

Hi Felix,

I failed to mention that I actually did try that as well:

# change these paths if needed
my $path_to_cutadapt = '/Users/mvijayen/cutadapt-1.3/bin';

I am still getting the same error.
Old 02-03-2014, 08:13 AM   #31
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

That path seems to be lacking /cutadapt at the end. Could you try that?
Old 02-03-2014, 08:29 AM   #32
mvijayen
Member
 
Location: midwest

Join Date: Jun 2013
Posts: 15
Default

That worked just fine! Thanks, Felix!
Old 05-30-2014, 01:30 AM   #33
maria.gr
Junior Member
 
Location: France

Join Date: May 2014
Posts: 2
Default

Hi guys !
I'm new to trim_galore and I have a small issue passing the cutadapt location on the trim_galore command line:
Could you tell me how I should modify this line so that it points to the cutadapt directory?

>>
./trim_galore --paired --retain_unpaired --quality 15 --fastqc_args $path_to_cutadapt='build/cutadapt' --fastqc my1.fastq.gz my2.fastq.gz
Old 05-30-2014, 01:38 AM   #34
fkrueger
Senior Member
 
Location: Cambridge, UK

Join Date: Sep 2009
Posts: 589
Default

To supply the path to cutadapt you need to open the trim_galore script in a text editor and change the path, which is set in one of the first lines.
Old 05-30-2014, 03:40 AM   #35
maria.gr
Junior Member
 
Location: France

Join Date: May 2014
Posts: 2
Default

Thanks! Found it, but it gave the error at line 471 again...
Actually I realised that I had to make the file executable ('chmod a+x build/cutadapt/bin/cutadapt').
I don't know if everyone needs to do this after downloading cutadapt, but I mention it in case somebody has the same problem.
So now it runs normally!
Thanks again!
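
Both causes of the "exec of cutadapt ... failed at ./trim_galore line 471" error seen in this thread (a path that points at a directory rather than the executable, and a file without execute permission) can be checked up front. This is a minimal sketch using the Python standard library; the function names are hypothetical and the example path is simply the one from the post above.

```python
import os
import stat


def diagnose_tool_path(path: str) -> str:
    """Classify a configured tool path: 'missing', 'directory', 'not executable' or 'ok'."""
    if not os.path.exists(path):
        return "missing"
    if os.path.isdir(path):
        return "directory"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


def make_executable(path: str) -> None:
    """Add execute permission for user, group and others (like 'chmod a+x')."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


if __name__ == "__main__":
    # Example path taken from the post above; adjust to your own installation.
    print(diagnose_tool_path("build/cutadapt/bin/cutadapt"))
```

A result of "directory" corresponds to the missing /cutadapt at the end of the path, and "not executable" to the missing chmod.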
Old 07-08-2014, 11:43 AM   #36
Kmok
Junior Member
 
Location: London

Join Date: Jul 2014
Posts: 4
Smile New Bee on Trim Galore

I use Trim Galore to trim exome-seq data captured with Illumina Nextera. The script used is
$myTrimGalore -q 15 -a CTGTCTCTTATACACATCT --stringency 3 --length 20 -e 0.1 -o $myoutDir --fastqc_args "--outdir $myoutDir" --dont_gzip --paired $myfastq1 $myfastq1.

The FastQC results after running Trim Galore show a bias in the nucleotide composition of the first 15 bp (per base sequence content). I guess this may be related to the non-random binding of the transposase. There are also over-represented k-mers at the 5' end as well as in the middle of the reads. Can anyone tell me what causes these k-mers (adapters? indexes?), and how they should be trimmed if they are adapters or indexes?

Thanks in advance

Kmer.PNG

PerBasesequence.PNG
Old 07-08-2014, 12:14 PM   #37
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,637
Default

I don't know about the later anomalies, but in my tests, Nextera seems to have highly irregular base frequencies for the first ~20bp (as you say, probably due to non-random binding). They are still fairly accurate and do not need to be trimmed.
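
To look at those base frequencies directly rather than through the FastQC plot, something like the following could be used. It is a rough sketch, not how FastQC computes its module; the function name `base_composition` is made up, and it expects plain read sequences that you have already extracted from the FASTQ file.

```python
from collections import Counter
from typing import Dict, Iterable, List


def base_composition(seqs: Iterable[str], n_positions: int = 20) -> List[Dict[str, float]]:
    """Return, for each of the first n_positions, the fraction of each base."""
    counts = [Counter() for _ in range(n_positions)]
    for seq in seqs:
        for i, base in enumerate(seq[:n_positions].upper()):
            counts[i][base] += 1
    fractions = []
    for counter in counts:
        total = sum(counter.values())
        fractions.append({b: c / total for b, c in counter.items()} if total else {})
    return fractions


if __name__ == "__main__":
    reads = ["ACGTACGT", "AAGTTCGT", "ACGAACGA"]
    for pos, comp in enumerate(base_composition(reads, n_positions=4), start=1):
        print(pos, comp)
```

Positions where the fractions differ strongly from the rest of the read are the ones FastQC flags in its per base sequence content plot.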

It's possible that the later peaks are due to primer-dimers or other such artifacts. What is the insert-size distribution of the library?
Old 07-09-2014, 01:21 AM   #38
Kmok
Junior Member
 
Location: London

Join Date: Jul 2014
Posts: 4
Smile

Hi Brian

Thanks very much.
I need to ask our lab about the QC of the library. How can we tell from the library insert-size distribution that it is a dimer? Would there be a peak at the same size as the later peaks we see?

Kin
Old 07-09-2014, 08:34 AM   #39
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,637
Default

Well... if you plot the insert size histogram, and see very sharp peaks at certain lengths, those may be some kind of non-genomic molecules. And once you know the length, you might be able to guess what they are considering all the different reagents that were used. Or you could look at reads with those specific insert sizes and see what the sequence is, to determine what they are. Once you know, you can easily filter them out (digitally). That is of course IF there are sharp peaks in the insert size histogram.

If they are non-genomic artifacts, you won't find them in the insert size histogram you would get from mapping, because they won't map. But (if you have paired reads) you can generate an insert size histogram by overlapping them with BBMerge.
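
As a rough illustration of what "very sharp peaks" could mean in practice, the sketch below flags insert sizes whose count stands far above that of neighbouring lengths. It assumes you already have a list of insert sizes (for example from overlapping the read pairs); the function name `sharp_peaks` and the `window` and `fold` defaults are arbitrary choices for the example, not recommendations.

```python
from collections import Counter
from statistics import median
from typing import Iterable, List, Tuple


def sharp_peaks(insert_sizes: Iterable[int], window: int = 5, fold: float = 5.0) -> List[Tuple[int, int]]:
    """Return (insert_size, count) for lengths that stand out from their neighbours.

    A length is reported when its count exceeds `fold` times the median
    count of the `window` lengths on either side. Both thresholds are
    arbitrary illustration values.
    """
    hist = Counter(insert_sizes)
    peaks = []
    for size in sorted(hist):
        neighbours = [hist.get(size + d, 0) for d in range(-window, window + 1) if d != 0]
        background = median(neighbours)
        if hist[size] > fold * max(background, 1):
            peaks.append((size, hist[size]))
    return peaks


if __name__ == "__main__":
    # Flat background of 10 reads per length, plus a spike at 150 bp.
    sizes = [s for s in range(100, 200) for _ in range(10)] + [150] * 200
    print(sharp_peaks(sizes))
```

Once a suspicious length is known, the reads with that insert size can be pulled out and inspected to see what the sequence is.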
Old 07-09-2014, 08:54 AM   #40
Kmok
Junior Member
 
Location: London

Join Date: Jul 2014
Posts: 4
Default

Thanks Brian
Shall try to run with BBMerge and see