ghd21 01-22-2020 08:42 PM

bbsplit not using all reads in library
I have RNA-seq files that I want to split based on mapping to reference sequences. I am using bbsplit to map to the sequences and write separate mapping files, but I noticed that not all reads in my files are mapped this way. My read file has 9654349 reads, but each time bbsplit only uses 6233783 reads. Is there a way for me to force all reads to be mapped?

When I use kmer splitting in bbduk against only one of my reference sequences, all of the reads are used, so I am wondering whether there is a flag I am missing that would let me split based on multiple reference sequences at once.

Thanks for your help in advance!

GenoMax 01-23-2020 07:46 AM

Have you checked the options for handling reads that multi-map to more than one reference? I am going to hazard a guess that you have some of those.

ghd21 01-23-2020 04:00 PM

Thanks for your reply! Ambiguous reads are just assigned to the first best site, so I don't think that is the reason. It appears that not all of the reads are even being put through mapping. When I change the ambiguous flag, the number of reads being mapped doesn't change, only where the reads are assigned. Any ideas?

GenoMax 01-24-2020 04:11 AM

How much memory are you assigning to this job? Have these reads been scanned/trimmed before splitting?

Have you also looked at the output of these reports?

    scafstats=<file>    Write statistics on how many reads mapped to which scaffold to this file.
    refstats=<file>    Write statistics on how many reads were assigned to which reference to this file.
                        Unmapped reads whose mate mapped to a reference are considered assigned and will be counted.
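
To rule out a miscount on the input side, it may also help to count the reads independently of BBTools and compare that number with what bbsplit reports. A minimal sketch, assuming standard four-line FASTQ records (no wrapped sequence lines); the function name `count_fastq_reads` is made up for illustration and is not part of BBTools:

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly four lines per read."""
    # Read gzip-compressed files transparently, based on the file extension.
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    # A line count that is not a multiple of 4 suggests a truncated or
    # multi-line FASTQ, which this simple counter does not handle.
    if n_lines % 4 != 0:
        raise ValueError(f"{path}: {n_lines} lines is not a multiple of 4")
    return n_lines // 4
```

Calling `count_fastq_reads("reads.fq")` on your own file (the path here is a placeholder) gives a number you can compare with both the total you expect and the total bbsplit says it processed.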
