SEQanswers

06-01-2018, 10:09 AM   #61
rajarapupriya
Member
 
Location: OH, USA

Join Date: Oct 2013
Posts: 18

Thanks for your quick reply. I ran a test with both references in the same command.
08-20-2018, 05:59 AM   #62
kcamnairb
Junior Member
 
Location: New Orleans, LA

Join Date: Sep 2013
Posts: 3

Hi Brian,

I'm trying to use BBSplit to separate RNA-seq reads from two mixed fungal samples, using the individual transcriptomes as references, and I'm getting some unexpected results. More reads seemed to map unambiguously to whichever reference is listed first, so I swapped the order of the references and the results changed dramatically. I have ambiguous2=toss set, but it seems to still be using the first best site. My commands and refstats output are below. Is there anything I'm doing wrong?

Thanks,
Brian
Code:
bbsplit.sh ref=53.fasta,17.fasta \
        in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
        out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
        out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
        refstats=53_30_r1_S7.stats ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	41.51013	1625.01508	57.30665	2219.25878	11241396	15519266
17	1.13394	44.03152	57.30665	2219.25878	307084	15519266        
        
bbsplit.sh ref=17.fasta,53.fasta \
        in=53_30_r1_S7_R1_001.fastq.gz in2=53_30_r1_S7_R2_001.fastq.gz \
        out_17=map17_53_30_r1_S7_R#_001.fastq.gz \
        out_53=map53_53_30_r1_S7_R#_001.fastq.gz \
        refstats=53_30_r1_S7.stats2 ambiguous2=toss

#name	%unambiguousReads	unambiguousMB	%ambiguousReads	ambiguousMB	unambiguousReads	ambiguousReads
53	21.37940	838.36051	67.54242	2623.22348	5789774	18291224
17	11.02890	426.72088	67.54242	2623.22348	2986746	18291224
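
To make the order dependence easier to see, the two refstats tables can be lined up per reference. The following is only a minimal sketch: the function name compare_refstats is made up for illustration, and it assumes the tab-separated refstats layout shown above (reference name in column 1, %unambiguousReads in column 2, header line starting with '#').

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper, not part of BBTools): show %unambiguousReads
# from two refstats files side by side, plus the change between the runs.
# Assumes tab-separated input with the name in column 1 and
# %unambiguousReads in column 2, as in the tables above.
compare_refstats() {
    # $1 = refstats from the first run, $2 = refstats from the second run
    awk -F'\t' '
        /^#/ { next }                        # skip header lines in both files
        NR == FNR { first[$1] = $2; next }   # first file: remember the percentage
        ($1 in first) {                      # second file: print both values and the difference
            printf "%s\t%.5f\t%.5f\t%+.5f\n", $1, first[$1], $2, $2 - first[$1]
        }
    ' "$1" "$2"
}
```

Calling it as compare_refstats 53_30_r1_S7.stats 53_30_r1_S7.stats2 prints one line per reference. For the two tables above, 53 drops by about 20.1 percentage points and 17 rises by about 9.9 when the reference order is swapped.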

Last edited by GenoMax; 08-20-2018 at 08:03 AM.
10-08-2018, 01:48 AM   #63
phuongbigbig
Junior Member
 
Location: Hanoi

Join Date: Aug 2015
Posts: 1
Contamination from human genome?

Hi,

I am working with RNA-seq data from a non-model fish and am considering removing human contamination from the reads. Is this feasible, given that there are a number of orthologs between human and fish?
Is there any recommendation regarding the choice of "-minratio" for this case? It seems that 0.56 may be too low. (I don't have a reference genome for this non-model fish, by the way.)

P.S. I think the sensitivity/specificity strategy should differ between binning (two references, i.e. host vs. contaminant, where the alignment scores to both can be compared) and decontamination (only the contaminant reference is available, so the judgment rests solely on alignment to the contaminant reference).

Thank you very much for your suggestions!
03-27-2020, 12:49 PM   #64
ahurley2
Junior Member
 
Location: Madison, WI

Join Date: Mar 2020
Posts: 1
Question about BBSplit ambig2=toss and BAM files

Hello!

I am using BBSplit to separate reads from a paired-end, three-species bacterial RNA-seq project. I set the flag ambig2=toss but then see this sentence in the program's printed output:

"Retaining first best site only for ambiguous mappings."

To me, that looks like default ambiguous=best. Is that what I should be seeing? How do I know if the ambiguous reads are being tossed?
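
One hedged way to check this is to compare read counts in the per-reference output files between a run with ambig2=toss and a run without it. If cross-reference ambiguous reads really are discarded, the toss run should write fewer reads, assuming the data contain such reads at all. The sketch below only counts FASTQ records; the function name count_reads is hypothetical, and it assumes standard four-line FASTQ records.

```shell
#!/usr/bin/env bash
# Sketch (hypothetical helper): count reads in a FASTQ file, plain or gzipped.
# Assumes every record occupies exactly four lines.
count_reads() {
    # $1 = path to a .fastq or .fastq.gz file
    case "$1" in
        *.gz) gzip -dc "$1" ;;   # decompress to stdout
        *)    cat "$1" ;;        # plain text: pass through
    esac | awk 'END { print NR / 4 }'   # total lines divided by four
}
```

Running count_reads on the same output file from both runs and comparing the two numbers gives a rough indication of whether reads were dropped. A non-integer result would suggest the file is truncated or not in four-line format.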

Additionally, I am mapping directly into a BAM file. From earlier posts, it looks like BBSplit BAM files are incompatible with IGV, but would they be okay with a feature counter like HTSeq or edgeR?

Thanks very much,
Amanda
03-28-2020, 03:54 AM   #65
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,049

@Amanda: I will need to dig through some past correspondence with Brian, but I think he recommended splitting first and then mapping, to avoid having all references present in the BAM file, which does cause issues with visualization programs.

If you look at the in-line help for "ambiguous2" you can see what it is doing:
Code:
ambiguous2=<best>    Set behavior only for reads that map ambiguously to multiple different references.
                     Normal 'ambiguous=' controls behavior on all ambiguous reads;
                     Ambiguous2 excludes reads that map ambiguously within a single reference.
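
The distinction in the help text can be illustrated with a small classification sketch. This is not BBSplit's code: the function name classify_ambiguity and the two-column input format (read ID, then a comma-separated list of the reference names of the equally best-scoring sites) are assumptions made up for this example.

```shell
#!/usr/bin/env bash
# Illustration only, NOT BBSplit's implementation. Reads a hypothetical
# tab-separated table on stdin: read ID, then the comma-separated reference
# names of that read's equally best-scoring sites. Prints which kind of
# ambiguity each read shows, following the help text above.
classify_ambiguity() {
    awk -F'\t' '
        {
            n = split($2, sites, ",")   # one entry per equally best site
            split("", seen)             # reset the set of references seen
            distinct = 0
            for (i = 1; i <= n; i++)
                if (!(sites[i] in seen)) { seen[sites[i]] = 1; distinct++ }

            if (n <= 1)            label = "unambiguous"
            else if (distinct > 1) label = "cross-reference (ambiguous2 applies)"
            else                   label = "within-reference (ambiguous applies)"
            print $1 "\t" label
        }
    '
}
```

Under this reading, a read whose best sites are 53,53 would be handled by ambiguous=, while a read whose best sites are 53,17 would be handled by ambiguous2=.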

Tags
aligners, bbsplit, binning, contaminant, metagenome
