#301
Member
Location: Thessaloniki, Greece
Join Date: Jul 2018
Posts: 12

From the JGI site: "SplitNextera splits Nextera LMP libraries into subsets based on linker orientation. It is designed strictly for Nextera LMP (long-mate-pair) reads, not for normal libraries using a Nextera kit. Nextera LMP libraries must be split prior to further processing; they are not usable raw. Adapter-trimming should still be done on Nextera LMP libraries prior to splitting."

Mate SamplePrep V2 Documentation: https://support.illumina.com/content...15008135_A.pdf

#302
Junior Member
Location: USA
Join Date: Jul 2018
Posts: 2

Hi. I am new here. How are you all?

#303
Junior Member
Location: Santa Fe, New Mexico
Join Date: Jul 2018
Posts: 2

Hi All,

Is it possible to match degenerate sequences as in the example below, trim them, and place the degenerate part in the FASTQ header? I am attempting to trim an adapter with the structure Adapter(21nt)-UMI(16nt)-Adapter(24nt) and place the UMI in the FASTQ header.

Matching degenerate sequences such as primers:

Code:
bbduk.sh in=reads.fq out=matching.fq literal=ACGTTNNNNNGTC copyundefined k=13 mm=f

Thank you for your help!

#305
Junior Member
Location: Santa Fe, New Mexico
Join Date: Jul 2018
Posts: 2

Hi GenoMax,

Thanks for the suggestion! I have used UMI tools, which works OK, but I am working with long reads that have a higher error rate (with an indel bias) than Illumina reads. The adapter and UMI are therefore likely to contain indels, so the adapter structure may actually look like this: Adapter(19-21nt)-UMI(14-16nt)-Adapter(22-24nt).

Thanks again for your advice!

#306
Junior Member
Location: USA
Join Date: Oct 2018
Posts: 2

Hi all,

Newcomer to RNA-seq/bbduk/the forum here. I have a question that's probably really basic, but I have read through the bbduk docs, ctrl+F'ed "maq" and "minavgquality" through all 16 pages of this thread, and tried googling, all to no avail. So here I am.

When using `minavgquality` (`maq`), I'm very puzzled as to how the "average quality" is calculated. I was filtering full-length reads (91bp) by average quality (no trimming involved), and was expecting a very straightforward calculation: taking the unweighted mean of the Phred scores across individual bases. For example, with

Code:
@A00325:34:H3FM7DRXX:1:1101:1208:1047 2:N:0:0NACTCTAA
AGTCGTACCGGAAAAAAAAAAAAAAAAAAAAAAAAAAAAAACAAAGAAAAGTAAACTGCGTTTATACCAATGCGTCCGCGGACAGGCGTTT
+
FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF,,F,,F,F,,,F,,:F,,,FF:F,F,:,,,,,,,FF:F::,,F:::F,:F

there are 25 `,`, 10 `:`, and 56 `F`. Under the Illumina 1.8+ encoding scheme, I was expecting something like (25*(44-33)+10*(58-33)+56*(70-33))/(25+10+56)=2597/91=28.5. I was shocked when this read got filtered with an `maq` of 20.

I looked through some of the source code mentioning `minAvgQuality`. In BBQC.java and RQCFilter2.java, the default `minAvgQuality` settings seem to be 8 and 5 respectively. This, plus the fact that when I tried `maq=30` all (!) my reads were filtered, made me suspect that bbduk calculates "average quality" differently somehow. Can someone please explain this? (Is this what the "Phred algorithm" alluded to by the bbduk doc is referring to?)

(I read here during my googling attempt that "Calculating average Q (Phred) scores is a bad idea". But it's something that our lab routinely does, and I think my PI would want me to do it anyway.)

Command I was using (version 38.25):

Code:
bbduk.sh in=raw.fastq out=raw_qual-pass.fastq outm=raw_qual-fail.fastq maq=20 ordered=t

(also tried adding `k=91` since all my reads are 91bp, `qin=33`, `qout=33`; no difference whatsoever)

Thanks!
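
One possible explanation, offered as a sketch and not as a confirmed description of bbduk's internals: if the average is taken over per-base error probabilities instead of over the Phred scores themselves, the same read scores far lower. The snippet below computes both averages from the base counts given in the post (25 at Q11, 10 at Q25, 56 at Q37). The function name `phred_means` is made up for illustration.

```shell
#!/bin/sh
# phred_means is a hypothetical helper, not part of BBTools.
# Base counts come from the post: 25 x ',' (Q11), 10 x ':' (Q25), 56 x 'F' (Q37).
phred_means() {
  awk 'BEGIN {
    n[11] = 25; n[25] = 10; n[37] = 56
    for (q in n) {
      total += n[q]
      qsum  += n[q] * q                 # sum of Phred scores
      psum  += n[q] * 10 ^ (-q / 10)    # sum of error probabilities
    }
    # Arithmetic mean of Q, and the Q value of the mean error probability.
    printf "arithmetic=%.1f probability=%.1f\n", qsum / total, -10 * log(psum / total) / log(10)
  }'
}

phred_means
# prints: arithmetic=28.5 probability=16.5
```

Under that assumption the read averages about Q16.5, which would fall below `maq=20` even though the arithmetic mean is 28.5.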

#307
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,856

@FlySquirrelFly: It has become difficult to get hold of Brian (due to his day job responsibilities), but I will flag your post for him to see if he can respond.

I recall from some past discussion that average quality is calculated as a rolling-window average, and as soon as it drops below your set value the rest of the read is trimmed/filtered. You should also consider this:

Quote:

Quote:

Last edited by GenoMax; 10-04-2018 at 05:44 AM.

#308
Junior Member
Location: USA
Join Date: Oct 2018
Posts: 2

@GenoMax:

Thanks for your quick reply and for flagging the post for Brian! Much appreciated.

Indeed, since I did not set ktrim or kmask, kfilter should have been carried out (which is what I intended). I wanted to filter based on the average quality of the full-length read, so I did not use `qtrim` or the options related to it.

I'm doing two separate analyses. One is the canonical type of transcriptomic analysis (quantification of gene expression, differential expression analysis, etc.). For that, like you said, there's probably no need to filter based on quality. The other involves some de novo assembly using the raw reads (for the antibody V(D)J receptor). I figured that for the latter it'd probably be nice to have an extra layer of QC.

#309
Senior Member
Location: Cambridge, UK
Join Date: May 2010
Posts: 311

Hi,

Is bzip2 input supported by `bbduk.sh`? When I try it, bbduk seems to hang as below. (It would be good to have support for bzip2.)

Thanks!

Code:
bbduk.sh in=/scratch/dberaldi/Texas_Biobank/TCRBOA1-N-WEX.read1.fastq.bz2 out=stdout.fq
java -Djava.library.path=/home/db291g/test-setup-travis/downloads/bbmap/jni/ -ea -Xmx14666m -Xms14666m -cp /home/db291g/test-setup-travis/downloads/bbmap/current/ jgi.BBDukF in=/scratch/dberaldi/Texas_Biobank/TCRBOA1-N-WEX.read1.fastq.bz2 out=stdout.fq
Executing jgi.BBDukF [in=/scratch/dberaldi/Texas_Biobank/TCRBOA1-N-WEX.read1.fastq.bz2, out=stdout.fq]
Version 37.98 [in=/scratch/dberaldi/Texas_Biobank/TCRBOA1-N-WEX.read1.fastq.bz2, out=stdout.fq]
NOTE: No reference files specified, no trimming mode, no min avg quality, no histograms - read sequences will not be changed.
0.028 seconds.
Initial:
Memory: max=14737m, free=14276m, used=461m

#310
Junior Member
Location: Laguna Philippines
Join Date: Dec 2018
Posts: 1

Does anyone know how to fix this?

I entered this command:

Code:
$ ./bbduk.sh -Xmx27g in1=~/NGS\ 10273-Raw\ Data\ NO.rep.1/NO.rep.1_1.fq in2=~/NGS\ 10273-Raw\ Data\ NO.rep.1/NO.rep.1_2.fq out1=adapter_trimmed1.fq out2=adapter_trimmed2.fq ref=~/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=rl trimq=10 minlen=36 mag=10 bhist=bhist.txt qhist=qhist.txt gchist=gchist.txt aqhist=aqhist.txt lhist=lhist.txt gcbins=auto

Then this came up:

Code:
java -ea -Xmx27g -Xms27g -cp /Users/uplb/Documents/AGC/bbmap/current/ jgi.BBDuk -Xmx27g in1=/Users/uplb/NGS 10273-Raw Data NO.rep.1/NO.rep.1_1.fq in2=/Users/uplb/NGS 10273-Raw Data NO.rep.1/NO.rep.1_2.fq out1=adapter_trimmed1.fq out2=adapter_trimmed2.fq ref=/Users/uplb/adapters.fa ktrim=r k=23 mink=11 hdist=1 tpe tbo qtrim=rl trimq=10 minlen=36 mag=10 bhist=bhist.txt qhist=qhist.txt gchist=gchist.txt aqhist=aqhist.txt lhist=lhist.txt gcbins=auto
Executing jgi.BBDuk [-Xmx27g, in1=/Users/uplb/NGS, 10273-Raw, Data, NO.rep.1/NO.rep.1_1.fq, in2=/Users/uplb/NGS, 10273-Raw, Data, NO.rep.1/NO.rep.1_2.fq, out1=adapter_trimmed1.fq, out2=adapter_trimmed2.fq, ref=/Users/uplb/adapters.fa, ktrim=r, k=23, mink=11, hdist=1, tpe, tbo, qtrim=rl, trimq=10, minlen=36, mag=10, bhist=bhist.txt, qhist=qhist.txt, gchist=gchist.txt, aqhist=aqhist.txt, lhist=lhist.txt, gcbins=auto]
Version 38.33
Exception in thread "main" java.lang.RuntimeException: Unknown parameter 10273-Raw
at jgi.BBDuk.<init>(BBDuk.java:513)
at jgi.BBDuk.main(BBDuk.java:76)

#311
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,856

@enma_ai: It is a bad idea to have spaces in your file/directory names (even though your OS may allow it). I suggest that you change "NGS 10273-Raw" to "NGS_10273-Raw" and see if that fixes the error you see.
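
The `Executing` line in the log above shows the input path broken at every space even though the spaces were backslash-escaped on the command line, which suggests the arguments are split again somewhere inside the wrapper. That is an inference from the output, not something checked against the script. The sketch below reproduces the effect with a hypothetical `count_args` helper and then sets up a space-free symlink as an alternative to renaming; the temporary directory stands in for the real data location.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() { echo "$#"; }

path="NGS 10273-Raw Data NO.rep.1/NO.rep.1_1.fq"

# Quoted: the whole key=value pair stays one argument.
quoted=$(count_args "in1=$path")
# Unquoted: the shell splits the value at each space, as in the log.
unquoted=$(count_args in1=$path)
echo "quoted=$quoted unquoted=$unquoted"
# prints: quoted=1 unquoted=4

# Alternative to renaming: point a space-free symlink at the directory.
# A temporary directory stands in for the real data location (assumption).
tmp=$(mktemp -d)
mkdir "$tmp/NGS 10273-Raw Data NO.rep.1"
ln -s "$tmp/NGS 10273-Raw Data NO.rep.1" "$tmp/NGS_10273-Raw_Data_NO.rep.1"
```

With the symlink in place, the `in1=` and `in2=` paths can be given through the space-free name, so nothing is left for the shell to split.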

#312
Junior Member
Location: New Zealand
Join Date: Jul 2012
Posts: 8

Hi guys,

I have been playing with bbduk and the ref option to submit a list of adapters/contaminants. I found that the order of the adapters in the ref file matters: depending on which adapter comes first in the file, bbduk might pick one over the other. For example, with one order I ended up having lots of "Bisulfite_R1" trimmed, but when "Reverse_adapter" was first in the ref file it was used much more often on the same fq file.

Code:
>Bisulfite_R1
AGATCGGAAGAGCACACGTCTGAAC
>Reverse_adapter
AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG

Is this intended behavior? It seems to me that this should not be the case.

Cheers,
Seb

#313
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,856

@Seb: Brian does not seem to have time to respond to questions in this forum nowadays, but the different lengths of the adapters may have some bearing on this. Since the part you are looking for is identical (in bold), there is no need to add both copies. I assume you are trimming away sequence to the right(?) once that part in bold is located?
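
The overlap between the two entries can be checked directly. The sketch below tests whether the shorter sequence is a prefix of the longer one; it only illustrates why k-mers from the shared region could be credited to either entry, and makes no claim about how bbduk chooses between them.

```shell
#!/bin/sh
# Sequences copied from the ref file in the earlier post.
short="AGATCGGAAGAGCACACGTCTGAAC"   # Bisulfite_R1
long="AGATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"   # Reverse_adapter

# Pattern match: does $long start with $short?
case "$long" in
  "$short"*) relation="prefix" ;;
  *)         relation="different" ;;
esac

echo "Bisulfite_R1 (${#short} nt) vs Reverse_adapter: $relation"
# prints: Bisulfite_R1 (25 nt) vs Reverse_adapter: prefix
```

Since every k-mer of the 25-nt entry also occurs in the longer one, keeping only one of the two entries in the ref file should remove the ambiguity.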

#314
Junior Member
Location: Málaga
Join Date: Jan 2019
Posts: 4

Hi all,

I'm trying to run bbduk.sh on macOS High Sierra 10.13.1, and it is stuck on some issue with the classpath. I've already added the location of the bbduk.sh file to the path and the classpath, but it is still not working. This is the input in the terminal:

Code:
Sisi$ bbduk.sh -Xmx24g in=B1_sub_R1.fq in2=B1_sub_R2.fq \
out=B1_sub_R1_trimmed.fq.gz out2=B1_sub_R2_trimmed.fq.gz \
literal=GTGCCAGCMGCCGCGGTAA,GGACTACHVGGGTWTCTAAT k=10 ordered=t mink=2 \
ktrim=l rcomp=f minlength=220 maxlength=280 tbo tpe

And this is the output:

Code:
java -ea -Xmx24g -Xms24g -cp /usr/local/bin/current/ jgi.BBDukF -Xmx24g in=B1_sub_R1.fq in2=B1_sub_R2.fq out=B1_sub_R1_trimmed.fq.gz out2=B1_sub_R2_trimmed.fq.gz literal=GTGCCAGCMGCCGCGGTAA,GGACTACHVGGGTWTCTAAT k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=220 maxlength=280 tbo tpe
Error: Could not find or load main class jgi.BBDukF
Caused by: java.lang.ClassNotFoundException: jgi.BBDukF

By the way, I couldn't find a BBDukF file. Any idea?

Thanks a lot,

#315
Registered Vendor
Location: Eugene, OR
Join Date: May 2013
Posts: 464

What's in your /bbmap/current/ directory? I have /bbmap/current/jgi/BBDukF.class

__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com

#316
Junior Member
Location: Málaga
Join Date: Jan 2019
Posts: 4

Hi, sorry, I'm coming back to this. I have:

Code:
/Users/owner/bbmap/current/jgi/BBDuk.class
/Users/owner/bbmap/current/jgi/BBDuk2.class

but no file named BBDukF.class.

The bbduk.sh script, which is at /Users/owner/bbmap/bbduk.sh, has these lines at the end:

Code:
fi
local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z $z2 -cp $CP jgi.BBDuk $@"
local CMD="java $EA $z $z2 -cp $CP jgi.BBDuk $@"
if [[ $silent == 0 ]] && [[ $json == 0 ]]; then
echo $CMD >&2

I don't know which one should be active and which not. Perhaps that's the problem. Any idea?

Thanks

#317
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,856

I am not sure why you are having this problem. All you should need to do is download the BBMap software on your Mac, unarchive the tar-zipped file, and then extend your path to include the "bbmap" directory (export PATH=$PATH:/path_to_bbmap_dir). Don't move the contents of the bbmap directory. Move the entire directory to whatever location you want and then amend $PATH.

#318
Junior Member
Location: Málaga
Join Date: Jan 2019
Posts: 4

Yes, that is what I've done. The path to bbmap is /Users/owner/bbmap and it is included in $PATH.

I think the problem is related to the way the script searches for the classes. None of the sh files manages to find the path. As an example, executing bbduk.sh returns:

Code:
java -ea -Xmx24g -Xms24g -cp /usr/local/bin/current/ jgi.BBDuk ......
Error: Could not find or load main class jgi.BBDuk
Caused by: java.lang.ClassNotFoundException: jgi.BBDuk

I've checked and I don't have any "current" directory in bin, so I don't know how to tell the script to go directly to the bbmap directory. Any idea is greatly appreciated; I am stuck with this.

#319
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,856

On the Mac I tested this on, nothing else needed to be done. What happens if you just run "bbmap.sh"? Does that produce the "in-line" bbmap help output?

Which Java version are you using on your Mac?

#320
Junior Member
Location: Málaga
Join Date: Jan 2019
Posts: 4

Thanks for your patience. Answering your questions:

1. If I run "bbmap.sh" or "bbduk.sh", for instance, the content of the file is shown (parameters, flags, etc.). If I run the command with parameters, like this:

Code:
$ bbduk.sh in=sample14_S14_R1.fastq.gz in2=sample14_S14_R2.fastq.gz out=sample14_S14_R1_btrimmed.fastq.gz out2=sample14_S14_R1_btrimmed.fastq.gz literal=GTACACAMCGCCCGTCGC,TGATCCTTCTGCVGGTTCWCCTACG k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=50 maxlength=155 tbo tpe
Max memory cannot be determined. Attempting to use 1400 MB.
If this fails, please add the -Xmx flag (e.g. -Xmx24g) to your command, or run this program qsubbed or from a qlogin session on Genepool, or set ulimit to an appropriate value.
java -ea -Xmx1400m -Xms1400m -cp /usr/local/bin/current/ jgi.BBDuk in=sample14_S14_R1.fastq.gz in2=sample14_S14_R2.fastq.gz out=sample14_S14_R1_btrimmed.fastq.gz out2=sample14_S14_R1_btrimmed.fastq.gz literal=GTACACAMCGCCCGTCGC,TGATCCTTCTGCVGGTTCWCCTACG k=10 ordered=t mink=2 ktrim=l rcomp=f minlength=50 maxlength=155 tbo tpe
Error: Could not find or load main class jgi.BBDuk
Caused by: java.lang.ClassNotFoundException: jgi.BBDuk

2. My current Java version is Java 8 Update 191.
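
In the output above the classpath is `/usr/local/bin/current/`, not a directory under `/Users/owner/bbmap`, so the copy of the script being run may be one sitting in (or linked from) `/usr/local/bin`. The sketch below is a diagnostic under that assumption: a hypothetical `check_classpath` helper takes the path of a wrapper script and reports whether `current/jgi/BBDuk.class` exists next to it. That the wrapper derives its classpath from its own directory is inferred from the log, not verified against the script.

```shell
#!/bin/sh
# check_classpath is a hypothetical helper, not part of BBTools.
# $1 = path to a wrapper script such as bbduk.sh
check_classpath() {
  # Directory that holds the script, as an absolute path.
  script_dir=$(cd "$(dirname "$1")" && pwd)
  if [ -f "$script_dir/current/jgi/BBDuk.class" ]; then
    echo "found: $script_dir/current/jgi/BBDuk.class"
  else
    echo "missing: $script_dir/current/jgi/BBDuk.class"
  fi
}

# Typical use: check whichever copy the shell actually runs.
# check_classpath "$(command -v bbduk.sh)"
```

If this reports `missing` for a copy in `/usr/local/bin`, removing that copy so that the one in the `bbmap` directory is found first on `$PATH` would be the next thing to try.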

Tags: adapter, bbduk, bbtools, cutadapt, trimmomatic