SEQanswers

05-05-2017, 12:07 PM   #221
BrianS
Junior Member
Location: East Coast, USA
Join Date: May 2017
Posts: 2

That makes sense. Thank you.

05-06-2017, 03:07 PM   #222
Torst
Senior Member
Location: The University of Melbourne, AUSTRALIA
Join Date: Apr 2008
Posts: 275

This behaviour is a bit un-Unix like?


bbduk.sh in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31 out=/dev/null

Unspecified format for output /dev/null; defaulting to fastq.

Exception in thread "main" java.lang.AssertionError: /dev/null already exists; please delete it.

05-06-2017, 04:26 PM   #223
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,292

Quote:
Originally Posted by Torst
This behaviour is a bit un-Unix like?

@Brian will have a more official answer, but BBTools are pure Java and are coded to be OS-agnostic (they will run on any OS with Java).

With most BBTools, omitting the "out" option produces all statistics without writing any read output (giving you the effect of out=/dev/null).

05-06-2017, 04:27 PM   #224
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,553

Haha

The syntax would be:

bbduk.sh in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31 out=stdout.fq > /dev/null

But you don't need to specify an output at all, as the default is to write nothing rather than to write to stdout, so just do this:

bbduk.sh in1=R1.fq.gz in2=R2.fq.gz loglog loglogk=31

Edit: @Genomax beat me by a minute

05-17-2017, 12:46 PM   #225
phylloxera
Junior Member
Location: Nebraska
Join Date: May 2017
Posts: 2
Getting different results with bbduk command line vs geneious plugin

Hi,

I have been searching through the command-line default settings and still haven't identified the source of the discrepancy. Here is my Linux command:

sh ~/bbmap/bbduk.sh in1=~/path/to/forwards.fastq.gz in2=~/path/to/reverses.fastq.gz out=~/path/to/output.fastq.gz ref=~/bbmap/resources/adapters.fa ktrim=r k=23 mink=11 hdist=1 minoverlap=24 tbo

result: 3628348 reads, 581467350 bases

And here is my plugin command from the Geneious output:

java.exe -ea -Xmx100m -cp ...\currenjgi.BBDukF ktrim=r k=23 hdist=1 edist=0 mink=11 ref=adapters.fa minlength=10 trimbyoverlap=t minoverlap=24 qin=33 in=input1.fastq in2=input2.fastq out=output1.fastq out2=output2.fastq

result: 3628348 reads, 584214527 bases

The plugin command seems compatible with my data and with the defaults in bbduk.sh. Any idea why the plugin output has about 2.7M more bases?

Thanks, Aaron
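
For scale, the gap between the two reported base counts can be worked out directly. This is only a sketch of the arithmetic using the three numbers quoted in this post; the variable names are made up for illustration, and the per-read figure is a plain average, not a statement about how the bases are actually distributed.

```shell
# Counts copied from the two results reported in this post.
cli_bases=581467350      # command-line bbduk.sh output
plugin_bases=584214527   # Geneious plugin output
reads=3628348            # identical read count in both runs

# Absolute difference in retained bases.
diff_bases=$((plugin_bases - cli_bases))
echo "difference: $diff_bases bases"   # difference: 2747177 bases

# Average extra bases per read, and the difference relative to the CLI output.
awk -v d="$diff_bases" -v r="$reads" -v c="$cli_bases" \
    'BEGIN { printf "per read: %.2f, relative: %.2f%%\n", d / r, 100 * d / c }'
```

So the two outputs differ by well under one base per read on average, which fits a small difference in trimming rather than reads being dropped.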

05-17-2017, 02:08 PM   #226
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,553

No idea; those commands should be equivalent, unless the adapters.fa file is different. Can you post the full output of the command so I can see where the bases were lost? Also, you may want to add "stats=stats.txt", which will report which adapter sequences are being hit; that should be identical in both cases.

05-17-2017, 03:41 PM   #227
GenoMax
Senior Member
Location: East Coast USA
Join Date: Feb 2008
Posts: 6,292

Perhaps Geneious is using a different adapters.fa file?

05-18-2017, 06:06 AM   #228
phylloxera
Junior Member
Location: Nebraska
Join Date: May 2017
Posts: 2
Re: different results from command line and geneious plugin

Yes, the adapter files are different. Geneious says it is using "All Truseq, Nextera, and PhiX adapters" (152 sequences), while the adapters.fa in the resources folder of bbmap contains 154 sequences. I don't know which 2 are missing from Geneious, presuming the shared set is 152. I could not find the adapters file Geneious is calling, or the stats.txt file I asked it to produce, so I may need to call in Geneious support to help with this.

Among the top stats.txt hits in the Linux output are PCR dimer (16% of reads) and PCR primers (15% of reads). I wonder if these are missing from Geneious.

Geneious is trimming more sequences by overlap (206,591 vs 157,344) and fewer sequences by ktrim (2,630,578 vs 2,657,654). Now, which one is best? The one trimming more?
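
If the Geneious adapter file can be located, one way to find the sequences that differ is to compare FASTA header lines between the two files. The following is a minimal sketch, assuming both files are plain FASTA and that shared adapters carry identical header names; it compares names only, not the bases themselves. The function names and the Geneious file name are hypothetical and are not part of BBTools or Geneious.

```shell
# Hypothetical helpers for comparing two adapter FASTA files by header name.

# Count the sequences in a FASTA file (one '>' header line per sequence).
count_fasta() {
    grep -c '^>' "$1"
}

# Print the headers found in the first file but not in the second.
# Trailing carriage returns are stripped so that a file saved on Windows
# still matches one saved on Linux.
headers_only_in_first() {
    awk '{ sub(/\r$/, "") }
         /^>/ {
             if (FILENAME == ARGV[1]) seen[$0] = 1      # headers of the second file
             else if (!($0 in seen)) print              # headers unique to the first file
         }' "$2" "$1"
}
```

For example, `count_fasta ~/bbmap/resources/adapters.fa` should print the sequence count for the bundled file, and `headers_only_in_first ~/bbmap/resources/adapters.fa geneious_adapters.fa` (second file name assumed) would list the entries Geneious lacks.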

05-18-2017, 11:16 AM   #229
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA
Join Date: Jan 2014
Posts: 2,553

Probably the one trimming more is better. Geneious is probably trimming more by overlap because overlap trimming happens after adapter-sequence trimming, so if some adapter sequences are missing from its file, reads containing them will (usually) get overlap-trimmed rather than sequence-trimmed. But it's hard to say. You could align the first 1 million reads of each output to the reference (if you have one) and compare the mapping rates and error rates to see which dataset is better.
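
Taking the first reads from each trimmed output can be done with standard tools before aligning. The following is a minimal sketch, assuming standard 4-line FASTQ records (no line-wrapped sequences); the function name and the file names in the usage note are hypothetical.

```shell
# Hypothetical helper: print the first N reads of a FASTQ file.
# Usage: first_reads N file.fastq[.gz]
# Assumes exactly 4 lines per record; gzipped input is detected by its .gz suffix.
first_reads() {
    case "$2" in
        *.gz) gzip -dc "$2" ;;   # decompress to stdout
        *)    cat "$2" ;;
    esac | head -n "$(( $1 * 4 ))"
}
```

For example, `first_reads 1000000 output.fastq.gz > cli_subset.fq` would write the first million records of the command-line output. For paired data, use the same count on both files, or an even count on an interleaved file, so that pairs are not split.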

Tags
adapter, bbduk, bbtools, cutadapt, trimmomatic
