SEQanswers

Old 05-11-2021, 01:03 PM   #681
pck0
Junior Member
 
Location: United States

Join Date: May 2021
Posts: 1
Default Is mapping stochastic?

Hey, I did a little test run mapping sequences against a reference fasta that contains two identical sequences called >one and >two, then repeated the process with the sequences renamed to >three and >four. The number of reads mapping to each was as follows:

1st run:

>one: 47,699
>two: 330

2nd run:

>three: 47,688
>four: 338

BBMap options were all default, except minidentity = 90 and T=12.

Two questions: (1) Why does BBMap apparently miss some reads that map to the first sequence and instead map them to the second, identical sequence? (2) Why do the two runs give different results?

Just curious, my apologies if this was addressed elsewhere!

cheers
Old 05-12-2021, 06:10 AM   #682
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,131
Default

Most NGS aligners are non-deterministic, i.e., they will not produce exactly identical results when run multiple times on the same input.

Fortunately, BBMap does have an option to run in deterministic mode.
Code:
deterministic=f         Run in deterministic mode.  In this case it is good
                        to set averagepairdist.  BBMap is deterministic
                        without this flag if using single-ended reads,
                        or run singlethreaded.
You could also run the analysis using just a single thread.
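As a rough sketch of the two options above, assuming paired-end input (the case where, per the help text, multi-threaded runs are not deterministic by default). The `BBMAP` variable, the function names, the file names and the 300 bp `averagepairdist` value are placeholders, not settings from the original post; `minid=0.9` stands in for the poster's minidentity setting.

```shell
#!/usr/bin/env bash
# Sketch only: file names and the averagepairdist value are placeholders.
# BBMAP lets you point at a specific install; it defaults to bbmap.sh on PATH.
: "${BBMAP:=bbmap.sh}"

# Option 1: keep 12 threads, but ask BBMap for deterministic output.
# Usage: map_deterministic ref.fa r1.fq.gz r2.fq.gz out.sam
map_deterministic() {
    "$BBMAP" ref="$1" in1="$2" in2="$3" out="$4" \
        minid=0.9 threads=12 deterministic=t averagepairdist=300
}

# Option 2: run on a single thread, which the help text describes as
# deterministic even without the flag.
# Usage: map_single_thread ref.fa r1.fq.gz r2.fq.gz out.sam
map_single_thread() {
    "$BBMAP" ref="$1" in1="$2" in2="$3" out="$4" \
        minid=0.9 threads=1
}
```

Calling `map_deterministic ref.fa r1.fq.gz r2.fq.gz run1.sam` twice and comparing the two outputs would be one way to check that the counts no longer drift between runs.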
Old 09-06-2021, 01:14 AM   #683
mewu3
Junior Member
 
Location: Nowhere

Join Date: Sep 2021
Posts: 4
Question bbmap.sh unfinished job

Hello, Brian,

I am wondering whether bbmap.sh can resume an unfinished job.

mewu3
Old 09-07-2021, 09:30 AM   #684
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,131
Default

@mewu3: Unfortunately it can't. You will need to restart the job.

Old 09-10-2021, 01:13 AM   #685
mewu3
Junior Member
 
Location: Nowhere

Join Date: Sep 2021
Posts: 4
Question [help]

Hello,

I am using bbmap on an HPC cluster and I get the following message:

Quote:
Aligning C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001 reads to fasta file ...
java -Djava.library.path=/opt/apps/bbtools-37.97/jni/ -ea -Xmx50G -cp /opt/apps/bbtools-37.97/current/ align2.BBWrap build=1 overwrite=true fastareadlen=500 build=1 in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null trimreaddescriptions=t outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam threads=8 pairlen=1000 pairedonly=t minid=0.9 mdtag=t xstag=fs nmtag=t sam=1.3 ambiguous=best secondary=t saa=f maxsites=10 -Xmx50G
Executing align2.BBWrap [build=1, overwrite=true, fastareadlen=500, build=1, in1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_se_trim.fastq.gz, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz,null, trimreaddescriptions=t, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, outu1=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam,C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_se.sam, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, -Xmx50G]

Executing align2.BBMap [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]
Version 37.97 [build=1, overwrite=true, fastareadlen=500, build=1, trimreaddescriptions=t, threads=8, pairlen=1000, pairedonly=t, minid=0.9, mdtag=t, xstag=fs, nmtag=t, sam=1.3, ambiguous=best, secondary=t, saa=f, maxsites=10, in=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R1_trim.fastq.gz, outu=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R1.sam, outm=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_aligned.sam, in2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_R2_trim.fastq.gz, outu2=C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001_unaligned_R2.sam]

Set threads to 8
Retaining first best site only for ambiguous mappings.
Set MINIMUM_ALIGNMENT_SCORE_RATIO to 0.816
Set genome to 1

Loaded Reference: 3.463 seconds.
Loading index for chunk 1-1, build 1
Generated Index: 2.556 seconds.
Analyzed Index: 7.028 seconds.
Started output stream: 0.043 seconds.
Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files.
at stream.ReadStreamWriter.<init>(ReadStreamWriter.java:51)
at stream.ReadStreamByteWriter.<init>(ReadStreamByteWriter.java:17)
at stream.ConcurrentGenericReadOutputStream.<init>(ConcurrentGenericReadOutputStream.java:40)
at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:52)
at stream.ConcurrentReadOutputStream.getStream(ConcurrentReadOutputStream.java:29)
at align2.AbstractMapper.openStreams(AbstractMapper.java:873)
at align2.BBMap.testSpeed(BBMap.java:437)
at align2.BBMap.main(BBMap.java:34)
at align2.BBWrap.execute(BBWrap.java:144)
at align2.BBWrap.main(BBWrap.java:22)
I get the impression that bbmap is stuck on something, and I don't know what's wrong with it. Please help!

mewu3
Old 09-10-2021, 09:04 AM   #686
HESmith
Senior Member
 
Location: Bethesda MD

Join Date: Oct 2009
Posts: 511
Default

"Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files."

Typically, BBMap tools keep paired reads together. You're attempting to write aligned and unaligned reads to separate files, which violates that function.
Old 09-10-2021, 10:27 AM   #687
kesmarl
Junior Member
 
Location: Europe

Join Date: Sep 2021
Posts: 1
Default

Thank you so much!

I downloaded it
__________________
Hello from France
Old 09-11-2021, 05:40 AM   #688
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,131
Default

@mewu3: Since paired-end reads are aligned together you should use a single "out=output.sam". If you wanted to capture unmapped reads into separate files then you would want to do that as "outu1=R1.unmapped.fq outu2=R2.unmapped.fq".

You may be able to write them out as an unmapped sam file "outu=unmapped.sam" but then again you should use only one output for that. This is untested.
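A minimal sketch of that suggestion, reduced to the paired files only (the single-end list from the original bbwrap command is left out). The `BBMAP` variable, the function name and the `ref=` argument are assumptions added for illustration; the file-name pattern follows the log above, with the unmapped reads written as FASTQ rather than SAM.

```shell
#!/usr/bin/env bash
# Sketch only: BBMAP defaults to bbmap.sh on PATH; override to test.
: "${BBMAP:=bbmap.sh}"

# Usage: map_pairs ref.fa sample_prefix
# One SAM file for the alignments, and the unmapped pairs as two
# FASTQ files (R1 and R2), so no pair is split across SAM outputs.
map_pairs() {
    local ref=$1 prefix=$2
    "$BBMAP" ref="$ref" \
        in1="${prefix}_R1_trim.fastq.gz" in2="${prefix}_R2_trim.fastq.gz" \
        out="${prefix}_aligned.sam" \
        outu1="${prefix}_unaligned_R1.fq" outu2="${prefix}_unaligned_R2.fq" \
        threads=8 minid=0.9
}
```

For the sample in the log this would be called as `map_pairs reference.fa C_CGTCTGCG-ATTGTGAA-AHGKC2BBXY_L001`, where `reference.fa` is a placeholder.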
Old 09-14-2021, 09:18 AM   #689
mewu3
Junior Member
 
Location: Nowhere

Join Date: Sep 2021
Posts: 4
Default

Quote:
Originally Posted by HESmith View Post
"Exception in thread "main" java.lang.AssertionError: Attempting to output paired reads to different sam files."

Typically, BBMap tools keep paired reads together. You're attempting to write aligned and unaligned reads to separate files, which violates that function.
Thank you!
Old 09-15-2021, 12:45 AM   #690
mewu3
Junior Member
 
Location: Nowhere

Join Date: Sep 2021
Posts: 4
Question pileup.sh explanation

Hello,

Could someone please explain the output files of pileup.sh?
  • basecov
  • bincov
  • covstats
How is the coverage calculated?
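For reference, a hedged sketch of how the three files might be requested in one call. Going by the option names, covstats is a per-scaffold summary, basecov is coverage per base position, and bincov is coverage averaged over fixed-size windows; all three are derived from how many aligned reads overlap each reference position. The `PILEUP` variable, the function name, the file names and the 1000 bp bin size are assumptions; check the usage text printed by pileup.sh for the exact options in your version.

```shell
#!/usr/bin/env bash
# Sketch only: PILEUP defaults to pileup.sh on PATH; override to test.
: "${PILEUP:=pileup.sh}"

# Usage: coverage_reports mapped.sam out_prefix
# Requests the per-scaffold summary (out=, i.e. covstats), the per-base
# table (basecov=) and the binned table (bincov=) in one pass.
coverage_reports() {
    local sam=$1 prefix=$2
    "$PILEUP" in="$sam" \
        out="${prefix}.covstats.txt" \
        basecov="${prefix}.basecov.txt" \
        bincov="${prefix}.bincov.txt" binsize=1000
}
```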
Old 09-22-2021, 06:39 PM   #691
lmusgrove
Junior Member
 
Location: Australia

Join Date: Sep 2021
Posts: 1
Smile Finding mapped rate for rpkm output

Hi Brian,

I'm running BBMap with the rpkm output option and would like to know how to see mapped rate for each read file. I run multiple files consecutively with nohup. Here's my code:

bbmap.sh ref=data/Assembly.fasta \
in1=data/clean/A_1.clean.fq.gz \
in2=data/clean/A_2.clean.fq.gz \
rpkm=data/fpkm/A.fpkm \
t=5 &

bbmap.sh ref=data/Assembly.fasta \
in1=data/clean/B_1.clean.fq.gz \
in2=data/clean/B_2.clean.fq.gz \
rpkm=data/fpkm/B.fpkm \
t=5

etc etc

So far it's worked really well. However, while the stdout file shows the mapped rates it doesn't tell me which read files relate to which stats. It just has a number of repeated --- Results 1 ---- records and I don't know which is which.

Is there a flag I need to add to ensure I can see which stats are for which files?

Thanks!

Lisa
Old 09-23-2021, 03:06 PM   #692
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,131
Default

You should be able to use
Code:
covstats=<file>         Per-scaffold coverage info.
for each of your commands. You can also capture the stderr/out to a file to get statistics. https://linuxize.com/post/bash-redirect-stderr-stdout/
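One way to combine both suggestions is to give every sample its own covstats file and its own log, so each block of statistics can be traced back to its read files. This is only a sketch: the `BBMAP` variable, the function name, the `data/logs` directory and the covstats file name are assumptions; the remaining paths and options are taken from the commands in the question.

```shell
#!/usr/bin/env bash
# Sketch only: BBMAP defaults to bbmap.sh on PATH; override to test.
: "${BBMAP:=bbmap.sh}"

# Usage: map_sample A
# Maps data/clean/<sample>_{1,2}.clean.fq.gz and redirects that run's
# stdout and stderr to data/logs/<sample>.log instead of a shared file.
map_sample() {
    local s=$1
    mkdir -p data/fpkm data/logs
    "$BBMAP" ref=data/Assembly.fasta \
        in1="data/clean/${s}_1.clean.fq.gz" \
        in2="data/clean/${s}_2.clean.fq.gz" \
        rpkm="data/fpkm/${s}.fpkm" \
        covstats="data/fpkm/${s}.covstats.txt" \
        t=5 > "data/logs/${s}.log" 2>&1
}
```

A loop such as `for s in A B; do map_sample "$s"; done` then runs the samples one after another, and the mapped rate for sample A ends up in `data/logs/A.log`.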
Old 09-29-2021, 12:40 AM   #693
silask
Junior Member
 
Location: Switzerland

Join Date: Oct 2017
Posts: 8
Default bbmap ignores minid

BBMap ignores the minid parameter.

This affects versions 38.92 and 38.93.


https://sourceforge.net/projects/bbmap/

Last edited by silask; 09-29-2021 at 12:44 AM.
Old 10-13-2021, 12:30 PM   #694
reliscu
Junior Member
 
Location: USA

Join Date: May 2021
Posts: 7
Default All "N" reads in paired reads not being filtered if other read is "good"

Version 38.94

This thread details the "bug":
https://www.biostars.org/p/9493307/#9493339
Tags
bbmap, metagenomics, rna-seq aligners, short read alignment
