SEQanswers

01-22-2020, 08:42 PM   #1
ghd21
Junior Member
 
Location: Australia

Join Date: Jan 2020
Posts: 2
bbsplit not using all reads in library

I have RNA-seq files that I want to split based on mapping to reference sequences. I am using bbsplit to map to the references and write separate mapping files, but I noticed that not all reads in my files are mapped this way. My read file has 9654349 reads, but each time bbsplit uses only 6233783 of them. Is there a way to force all reads to be mapped?

When I use kmer splitting in bbduk against only one of my reference sequences, all of the reads are used, so I am wondering whether there is a flag or something else I am missing that would let me split against multiple reference sequences at once.

Thanks for your help in advance!
01-23-2020, 07:46 AM   #2
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,016

Have you checked the options for handling reads that map to more than one reference? I am going to hazard a guess that you have some of those.
01-23-2020, 04:00 PM   #3
ghd21
Junior Member
 
Location: Australia

Join Date: Jan 2020
Posts: 2

Thanks for your reply! Ambiguous reads are just assigned to the first best site, so I don't think that is the reason. It appears that bbsplit is not attempting to map all of the reads. When I change the ambiguous flag, the number of reads being mapped doesn't change, only where the reads are assigned. Any ideas?
01-24-2020, 04:11 AM   #4
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 7,016

How much memory are you assigning to this job? Have these reads been scanned/trimmed before splitting?
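
If it helps to confirm the input read count independently of bbsplit, a minimal Python sketch along these lines counts the records in a FASTQ file. It assumes strict 4-line records (no wrapped sequence lines), and the function names are just illustrative, not part of BBTools:

```python
import gzip
import io


def count_fastq_records(handle):
    """Count records in a FASTQ text stream, assuming 4 lines per record."""
    lines = sum(1 for _ in handle)
    if lines % 4 != 0:
        # A remainder suggests a truncated file or wrapped sequence lines.
        raise ValueError(f"{lines} lines is not a multiple of 4")
    return lines // 4


def count_fastq_file(path):
    """Open a plain or gzipped FASTQ file (chosen by extension) and count records."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_fastq_records(handle)


# Small in-memory demo with two made-up reads.
demo = io.StringIO("@r1\nACGT\n+\nIIII\n@r2\nGG\n+\nII\n")
print(count_fastq_records(demo))  # prints 2
```

Running count_fastq_file on the exact file passed to bbsplit, and again on the file from before any trimming, would show whether the two read counts in question refer to the same input.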

Have you also looked at the output of these reports?
Code:
    scafstats=<file>    Write statistics on how many reads mapped to which scaffold to this file.
    refstats=<file>     Write statistics on how many reads were assigned to which reference to this file.
                        Unmapped reads whose mate mapped to a reference are considered assigned and will be counted.
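
To compare the refstats report against the input read count, something like the following Python sketch sums one column of a tab-delimited report. It assumes the file has a single header line starting with "#" and a per-reference column of integer read counts; the column name "assignedReads" is my assumption about the header, so check it against your own file before relying on it:

```python
import csv
import io


def sum_column(handle, column="assignedReads"):
    """Sum an integer column of a tab-delimited report with a '#'-prefixed header.

    The default column name is an assumption; pass the name your report uses.
    """
    header = None
    index = None
    total = 0
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        if header is None:
            # First non-blank line is treated as the header.
            header = [name.lstrip("#") for name in row]
            if column not in header:
                raise KeyError(f"column {column!r} not found in {header}")
            index = header.index(column)
            continue
        total += int(row[index])
    return total


# In-memory demo with made-up numbers, not real bbsplit output.
demo = io.StringIO("#name\tassignedReads\nrefA\t10\nrefB\t5\n")
print(sum_column(demo))  # prints 15
```

If the total is close to the 6233783 figure, the remaining reads are probably ones that did not map to any of the references rather than reads that were skipped.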

Tags
bbduk, bbmap, bbsplit, rna-seq

All times are GMT -8.