SEQanswers

07-25-2018, 04:24 PM   #641
olgabot
Junior Member
Location: San Francisco, CA
Join Date: Apr 2014
Posts: 1
Add hg19 masked reference to distribution

Hello,
I'm using BBTools via bioconda and the corresponding quay.io docker container. The image has the necessary resources, e.g. the adapters fasta file:

Code:
$ docker run -it -v $PWD:/data quay.io/biocontainers/bbmap:38.06--2 bash
bash-4.2# find . -name adapters.fa
./usr/local/opt/bbmap-38.06/resources/adapters.fa
bash-4.2# cd ./usr/local/opt/bbmap-38.06/resources
bash-4.2# ll
bash: ll: command not found
bash-4.2# ls 
adapters.fa                          blacklist_silva_species_500.sketch   lambda.fa.gz                         nextera_LMP_linker.fa.gz             primes.txt.gz                        sequencing_artifacts.fa.gz
adapters_no_transposase.fa.gz        contents.txt                         lfpe.linker.fa.gz                    pJET1.2.fa                           remote_files.txt                     short.fa
blacklist_img_species_300.sketch     crelox.fa.gz                         mtst.fa                              phix174_ill.ref.fa.gz                remote_files_old.txt                 truseq.fa.gz
blacklist_nt_species_1000.sketch     favicon.ico                          nextera.fa.gz                        phix_adapters.fa.gz                  sample1.fq.gz                        truseq_rna.fa.gz
blacklist_refseq_species_250.sketch  kapatags.L40.fa                      nextera_LMP_adapter.fa.gz            polyA.fa.gz                          sample2.fq.gz
However, the removehuman.sh script uses a hardcoded path for the masked human genome posted in the RemoveHuman thread.


Code:
	local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap minratio=0.9 maxindel=3 bwr=0.16 bw=12 quickmatch fast minhits=2 path=/global/projectb/sandbox/gaag/bbtools/hg19 pigz unpigz zl=6 qtrim=r trimq=10 untrim idtag usemodulo printunmappedcount usejni ztd=2 kfilter=25 maxsites=1 k=14 $@"
Can the masked genome be included in the distribution?

Thank you!
Warmest,
Olga

08-07-2018, 07:45 AM   #642
sunnycqcn
Member
Location: Canada
Join Date: Apr 2013
Posts: 17

Hello Brian,
After running mapPacBio.sh, how can I combine the sequences that share the same ID?
For example, I want to combine the following sequences:
m151006_234406_42219_c100867912550000001823195203031665_s1_p0/110457/57769_70466 id=3_0_part_2_6
m151006_234406_42219_c100867912550000001823195203031665_s1_p0/110457/57769_70466 id=3_0_part_3

Thanks,
Fuyou
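
One way to read "combine" here is to concatenate, in file order, all FASTA records whose header shares the same first whitespace-delimited token (the read name), ignoring the trailing id=... field. The sketch below assumes exactly that, and assumes the records are in FASTA format. The function names and the shortened read names are hypothetical and not part of BBTools, and whether plain concatenation is meaningful depends on how the parts were produced.

```python
from collections import OrderedDict


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def merge_by_read_name(lines):
    """Concatenate sequences that share the first header token, in input order."""
    merged = OrderedDict()
    for header, seq in read_fasta(lines):
        name = header.split()[0]  # drop the trailing "id=..." field
        merged[name] = merged.get(name, "") + seq
    return merged


# Hypothetical records; real read names are much longer.
example = [
    ">m151006/110457/57769_70466 id=3_0_part_2_6",
    "ACGT",
    ">m151006/110457/57769_70466 id=3_0_part_3",
    "TTGA",
    ">other_read id=4_0",
    "CCCC",
]

for name, seq in merge_by_read_name(example).items():
    print(f">{name}\n{seq}")
```

To run it on a real file, pass an open file handle instead of the example list.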

08-09-2018, 05:56 AM   #643
JenBarb
Member
Location: Bethesda, MD
Join Date: Oct 2010
Posts: 47
pull out sequences with matching primers

Hi Brian,
I was wondering if bbmap has a tool that will pull out reads matching particular primer sequences. I have fastq files with amplicons from 12 different primers in the same file, so I want to make subsets of the reads that have specific primers of interest.

I have used your tool for other tasks, so I figured I would ask whether it also has this capability.

Thank you,
Jen
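
Independently of which tool ends up doing the job, the underlying idea can be sketched in a few lines of Python: compare the start of each read with each primer and bin the read under the first primer that matches within a mismatch budget. This is only an illustration. The primer names and sequences, the function names, and the assumption that the primer sits at the 5' end of the read are all hypothetical.

```python
def parse_fastq(lines):
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = (line.rstrip("\n") for line in lines)
    for header in it:
        if not header:
            continue
        seq = next(it, "")
        next(it, "")  # skip the '+' separator line
        qual = next(it, "")
        yield header, seq, qual


def prefix_mismatches(seq, primer):
    """Count mismatches between the primer and the start of the read."""
    if len(seq) < len(primer):
        return len(primer)  # too short to contain the primer
    return sum(1 for a, b in zip(seq, primer) if a != b)


def split_by_primer(records, primers, max_mismatch=1):
    """Bin each read under the first primer matching its 5' end."""
    bins = {name: [] for name in primers}
    bins["unassigned"] = []
    for rec in records:
        for name, primer in primers.items():
            if prefix_mismatches(rec[1], primer) <= max_mismatch:
                bins[name].append(rec)
                break
        else:
            bins["unassigned"].append(rec)
    return bins


# Hypothetical primers and reads, for illustration only.
primers = {"primerA": "ACGTACGT", "primerB": "TTGGCCAA"}
fastq = [
    "@read1", "ACGTACGTGGGG", "+", "IIIIIIIIIIII",
    "@read2", "TTGGCCAAGGGG", "+", "IIIIIIIIIIII",
    "@read3", "ACGTACCTGGGG", "+", "IIIIIIIIIIII",
    "@read4", "GGGGGGGGGGGG", "+", "IIIIIIIIIIII",
]

bins = split_by_primer(parse_fastq(fastq), primers)
for name, recs in bins.items():
    print(name, [r[0] for r in recs])
```

The sketch does not handle reverse-complemented primers, degenerate bases, or indels within the primer; a k-mer based tool is likely a better fit for those cases.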

08-09-2018, 06:08 AM   #644
HESmith
Senior Member
Location: Bethesda MD
Join Date: Oct 2009
Posts: 498

@JenBarb see this thread in Biostars.

08-09-2018, 07:09 AM   #645
JenBarb
Member
Location: Bethesda, MD
Join Date: Oct 2010
Posts: 47

Thank you! Love the tool!

08-14-2018, 09:05 PM   #646
Meyana
Member
Location: Japan
Join Date: Sep 2017
Posts: 24

Hi,
Hoping somebody can help me with this.

I used BBMap and now I would like to extract the reads from my .bam file that are split (chimeric?), i.e. reads that indicate a deletion.

I tried to use samblaster, but it doesn't recognize any reads as split...
(samtools view -h in.bam | samblaster -a -s split.sam -o /dev/null)
Are the split reads marked differently in BBMap compared to other aligners causing samblaster to fail?

IGV shows a good amount of reads with deletions and I can also call deletions using BBTools callvariants.sh - so I know they are in there. I just have a feeling callvariants is calling fewer deletions and with lower coverage than what IGV suggests, so I want to check up on it.
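
A possible explanation (an assumption, not confirmed in this thread) is that the deletions are recorded inside a single alignment record as a D or N operation in the CIGAR string, rather than as separate split or supplementary records, in which case a split-read extractor would have nothing to find. Under that assumption, the reads can be pulled out by scanning the CIGAR column of the SAM text. The sketch below uses plain Python with hypothetical function names and made-up example records.

```python
import re

# One CIGAR element: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def max_deletion(cigar):
    """Return the longest D or N run in a CIGAR string (0 if none)."""
    return max(
        (int(n) for n, op in CIGAR_OP.findall(cigar) if op in "DN"),
        default=0,
    )


def reads_with_deletions(sam_lines, min_len=1):
    """Yield (name, cigar, longest_deletion) for mapped reads with a deletion >= min_len."""
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        name, flag, cigar = fields[0], int(fields[1]), fields[5]
        if flag & 4 or cigar == "*":
            continue  # unmapped read
        longest = max_deletion(cigar)
        if longest >= min_len:
            yield name, cigar, longest


# Made-up SAM records for illustration.
example_sam = [
    "@HD\tVN:1.4",
    "r1\t0\tchr1\t100\t40\t50M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
    "r2\t0\tchr1\t200\t40\t30M12D20M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
    "r4\t16\tchr1\t300\t40\t48M2D2M\t*\t0\t0\t" + "A" * 50 + "\t" + "I" * 50,
]

for name, cigar, longest in reads_with_deletions(example_sam, min_len=5):
    print(name, cigar, longest)
```

On real data the same functions could read from standard input, fed by the samtools view -h in.bam command already used above. Counting the hits per position gives an independent number to compare against both IGV and the variant caller.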

08-15-2018, 09:45 AM   #647
JenBarb
Member
Location: Bethesda, MD
Join Date: Oct 2010
Posts: 47
mkf argument in bbduk.sh (bbmap tool)

Hello,
I am trying to use the flag mkf (minkmerfraction) and I am getting an error saying that the argument does not exist.
sh /data/barbj/bbmap/bbduk.sh in=./../Stool_001-01.fastq outm=v2fstoolfq.fa literal=CTCAAACTTGGGTAATTAAACC k=17 mkf=0.8
java -Djava.library.path=/data/barbj/bbmap/jni/ -ea -Xmx39767m -Xms39767m -cp /data/barbj/bbmap/current/ jgi.BBDukF in=./../Stool_001-01.fastq outm=v2fstoolfq.fa literal=CTCAAACTTGGGTAATTAAACC k=17 mkf=0.8
Executing jgi.BBDukF [in=./../Stool_001-01.fastq, outm=v2fstoolfq.fa, literal=CTCAAACTTGGGTAATTAAACC, k=17, mkf=0.8]

Exception in thread "main" java.lang.RuntimeException: Unknown parameter mkf=0.8
at jgi.BBDukF.<init>(BBDukF.java:402)
Any ideas why this is not working?

Jen

08-23-2018, 06:44 AM   #648
ellybelly
Junior Member
Location: Europe
Join Date: Oct 2016
Posts: 2
bbmap aborts after mapping some reads

Hello Brian,

We are using bbmap to see to what extent it is possible to quantify gene expression by mapping Illumina RNA-seq reads to the genome of a closely related species, e.g. mapping chimpanzee reads to human or, as in this example, macaque reads.

To this end, we generated macaque Illumina SE reads using flux-simulator and mapped them to hg38; for comparison we also tried Mmul8, downloaded from Ensembl (wget ftp://ftp.ensembl.org/pub/release-92...toplevel.fa.gz).

Everything mapped fine to hg38, but not to Mmul8.

Exception in thread "Thread-12" java.lang.AssertionError
at align2.BBIndex.extendScore(BBIndex.java:2612)
at align2.BBIndex.slowWalk3(BBIndex.java:1389)
at align2.BBIndex.find(BBIndex.java:777)
at align2.BBIndex.find(BBIndex.java:623)
at align2.BBIndex.findAdvanced(BBIndex.java:400)
at align2.AbstractMapThread.quickMap(AbstractMapThread.java:750)
at align2.BBMapThread.processRead(BBMapThread.java:408)
at align2.AbstractMapThread.run(AbstractMapThread.java:508)


I tried to run on one thread, increased memory to 101G, removed small contigs of <100kb ... but the error message remains the same.

We are running a Debian system with java version "1.8.0_181" and have BBMap version 38.02 -- the detailed error output is in the attached file.

The false mapping rates of bbmap are so much better than those of STAR and GSNAP that we definitely want to use bbmap for our paper. We are nearly done with all the other species (marmoset, gorilla, chimpanzee and orangutan) and the simulations ran through -- the only missing piece is the mapping to Mmul8.

Any help would be greatly appreciated.

Best, Ines
Attached Files
File Type: txt Mmul1.701837.txt (4.0 KB, 0 views)

09-07-2018, 09:24 AM   #649
raw937
Member
Location: BC
Join Date: Aug 2010
Posts: 18
bbmap for demultiplexing dual barcodes.

Hello,
I need to demultiplex using dual indexes, if possible.

For example (the dual barcode is the first four nt of each read):

#R1 read
@SOLEXA1_0069_FC:3:1:1673:948#ACAGTG/1
GACTAACCGGATTAGATACCCTGGTAGTCCACGCCGTAAACGATGAATGTTAGCCGTCGGGCAGTATACTGTTCGG
+
BMMQNTWSWWb_____b_bb__________Y_________YYYYY[[[Y[__________XXRWXVVVVTYYYYYT

#R2 read
@SOLEXA1_0069_FC:3:1:1673:948#ACAGTG/2
CTGAAGGGTTGCGCTCGTTGCGGGACTTAACCCAACATCTCACGACACGAGCTGACGACAGCCATGCAGCACCTGT
+
ghgaggfghhhhhhhhhhghhhhhhhhhhfhhhghfWffch[hhgahhedffddR[^W^Zc^_cac[Wb]^W^

Here are the 16 possible combinations in the file I am working on.
TCAG-TCAG
CTGA-CTGA
TCAG-GACT
GACT-GACT
AGTC-AGTC
GACT-TCAG
GACT-AGTC
GACT-CTGA
TCAG-CTGA
AGTC-TCAG
AGTC-GACT
CTGA-AGTC
CTGA-GACT
AGTC-CTGA
TCAG-AGTC
CTGA-TCAG

The first four nt of each read are the barcode, so the example above would go to:
GACT-CTGA_R1.fq
GACT-CTGA_R2.fq

But you would need both reads to tell you that it's GACT-CTGA and not something else.
What would the command look like for this? Does this demux script do the dual barcoding?
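
Whatever the answer for the demux script turns out to be, the logic described above is small enough to sketch in Python: walk R1 and R2 in lockstep, build a key from the first four nt of each mate, and bin the pair under that key. The function name, the record layout (header, sequence, quality tuples), and the shortened reads are assumptions made for illustration. Only two of the 16 combinations are listed, and barcodes are not trimmed.

```python
from collections import defaultdict


def demux_pairs(r1_records, r2_records, barcode_len=4, allowed=None):
    """Group read pairs by '<R1 prefix>-<R2 prefix>'.

    Pairs whose key is not in `allowed` (when given) go to 'unmatched'.
    """
    bins = defaultdict(list)
    for r1, r2 in zip(r1_records, r2_records):
        key = f"{r1[1][:barcode_len]}-{r2[1][:barcode_len]}"
        if allowed is not None and key not in allowed:
            key = "unmatched"
        bins[key].append((r1, r2))
    return bins


# Shortened, partly made-up reads; pair1 starts like the example in the post.
r1 = [
    ("@pair1/1", "GACTAACCGGATTAG", "IIIIIIIIIIIIIII"),
    ("@pair2/1", "TCAGTTTTTTTTTTT", "IIIIIIIIIIIIIII"),
    ("@pair3/1", "AAAAGGGGGGGGGGG", "IIIIIIIIIIIIIII"),
]
r2 = [
    ("@pair1/2", "CTGAAGGGTTGCGCT", "IIIIIIIIIIIIIII"),
    ("@pair2/2", "TCAGCCCCCCCCCCC", "IIIIIIIIIIIIIII"),
    ("@pair3/2", "CTGACCCCCCCCCCC", "IIIIIIIIIIIIIII"),
]
allowed = {"GACT-CTGA", "TCAG-TCAG"}

bins = demux_pairs(r1, r2, allowed=allowed)
for key in sorted(bins):
    print(key, [pair[0][0] for pair in bins[key]])
```

Each bin could then be written out as <key>_R1.fq and <key>_R2.fq, matching the file names requested above. The sketch requires exact barcode matches and assumes R1 and R2 are in the same order.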

09-25-2018, 08:01 AM   #650
juanita
Junior Member
Location: Europe
Join Date: Sep 2018
Posts: 2
ref input for BBMap and paired ends

I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

I am using a genome in scaffolds as the reference, and paired-end reads...

09-25-2018, 09:47 AM   #651
SNPsaurus
Registered Vendor
Location: Eugene, OR
Join Date: May 2013
Posts: 451

Quote:
Originally Posted by juanita
I am sorry if this question is very basic, but I am getting a low percentage of reads mapping to the reference genome: only about 36% of reads mapped. Any clue why this is the case?

I am using a genome in scaffolds as the reference, and paired-end reads...

Have you trimmed adapters from the reads? Short fragments will create reads that are part genomic and part adapter, and these may not map. You could use the related BBMap tool sendsketch to get a sense of what is in your reads (after trimming). When we genotype samples, many samples have contaminating species, so sendsketch can help figure out what is in there. You can input the entire fastq file with sendsketch, or go to read mode and get a result on a per-read basis.

You can also grab 100 reads, turn them into fasta format and do blastn with them (if online use the blastn rather than megablast option) and see read by read what is in there.

Other options...your sample is not highly related to the reference, the reference may be incomplete and missing regions, the reference is lacking high copy repeat content like mtDNA or chloroplast and many reads go to those.
__________________
Providing nextRAD genotyping and PacBio sequencing services. http://snpsaurus.com
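
The "grab 100 reads and turn them into fasta" step suggested above can be done with a few lines of Python if no other tool is at hand. This is a minimal sketch that assumes a plain 4-line-per-record FASTQ; the function name and the example reads are hypothetical.

```python
from itertools import islice


def fastq_to_fasta(lines, n_reads=100):
    """Return FASTA text for the first n_reads records of a 4-line FASTQ."""
    out = []
    it = iter(lines)
    while len(out) < n_reads:
        record = list(islice(it, 4))
        if len(record) < 4:
            break  # end of input or truncated record
        header, seq = record[0].strip(), record[1].strip()
        out.append(f">{header[1:]}\n{seq}")  # drop the leading '@'
    return "\n".join(out) + ("\n" if out else "")


# Made-up reads for illustration.
example_fastq = [
    "@r1\n", "ACGT\n", "+\n", "IIII\n",
    "@r2\n", "TTTT\n", "+\n", "IIII\n",
    "@r3\n", "GGGG\n", "+\n", "IIII\n",
]

print(fastq_to_fasta(example_fastq, n_reads=2), end="")
```

Taking the first records of a file is not a random sample; reads near the start of a run can be of lower quality, so sampling from further into the file may give a more representative picture.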

10-11-2018, 07:36 AM   #652
csmiller
Junior Member
Location: Denver, CO
Join Date: Mar 2009
Posts: 1
usejni and compiled C code in BBTools

I just installed the latest version of BBTools (38.26), and I notice that the C code enabled by the usejni=t flag for some tools has been deprecated / disabled.

I found this in the changelog:
Quote:
Removed JNI path flag from BBMerge, BBMap, and RQCFilter shell scripts.
and this in docs/compiling.txt:
Quote:
3) C code. This was developed by Jonathan Rood to accelerate BBMap, BBMerge, and Dedupe, but is currently disabled.
Sure enough, it is commented out in the bbmap.sh code:
Code:
        #local CMD="java -Djava.library.path=$NATIVELIBDIR $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
        local CMD="java $EA $z -cp $CP align2.BBMap build=1 overwrite=true fastareadlen=500 $@"
If I revert to the previous version of the CMD, with the java.library.path set, then the command runs with the compiled C code just fine.

Why was this disabled? Does this affect previous analyses that used this C code? That is, does the C code contain an error that means usejni=t in previous versions will produce different output than the java-only code? Or was this purely a performance or compatibility issue, or something else?

Sorry if I've missed this already posted somewhere, and thanks in advance for any help.

Chris

Tags
bbmap, metagenomics, rna-seq aligners, short read alignment

All times are GMT -8.