02-11-2019, 06:43 AM   #1
PinkTips
Junior Member
 
Location: Athens, GA

Join Date: Feb 2019
Posts: 7
BBSplit assertion error: invalid fasta file

Good morning, BBMappers!

I have been trying to run BBSplit (on my university's computing cluster) to remove host sequences from metatranscriptome data from a gut community.

This is the command I am using:
Code:
/home/hd55218/BBSplit/bbmap/bbsplit.sh in=/home/hd55218/BBSplit/QualTrimmed_bran11.fasta ref=/home/hd55218/BBSplit/p.americana_genome.fasta,/home/hd55218/BBSplit/Blattabacterium_genome.fasta basename=out_%.fasta outu=/home/hd55218/BBSplit/cleaned_bran11.fasta
The error message returned after running on the cluster is:
Code:
Exception in thread "main" java.lang.AssertionError: Invalid input file: '/home/hd55218/BBSplit/QualTrimmed_bran11.fasta'
        at align2.AbstractMapper.preparse0(AbstractMapper.java:821)
        at align2.AbstractMapper.<init>(AbstractMapper.java:53)
        at align2.BBMap.<init>(BBMap.java:43)
        at align2.BBMap.main(BBMap.java:31)
        at align2.BBSplitter.main(BBSplitter.java:47)
The first four sequences in my FASTA file appear as:
Code:
>NB502039:96:HGLYGBGX3:1:11101:19340:2795 1:N:0:AGTTCC
GTCCTCTTCCGGGGTCTGGGTGCCAAGGCCCATCGCCTGCAGACCTTCGTTCAGCGGGGTGTACACGGGGCCTTCGAATGCGCCATCGATGACCACGGTCGTCTTGTCATACTCGTTGCCGAAGTTCGCCATTTCGATCTGCAGCGGCTCCAGATCCAGCGTGGTGTAGTCGATGTCCACACGGCTGGGGGGGGGCACGCCGCCGGTGACGAGCCTGTAGGTCTGGCACTCCCC
>NB502039:96:HGLYGBGX3:1:11101:23904:2797 1:N:0:AGTTCC
CCGCCTTCAACGCCAAGAGCGCGAATTATGCGTATAGATGCACTTCTAAGCATCATGAGTTCTCTATCAGAAAGTGTTTGCGCAGGAGCTGCAACTATACTGTCACCTGTATGAACACCAACAGGGTCAAGGTTTTCCATCGAACAAATCGTAATGCAGTTATCTGCG
>NB502039:96:HGLYGBGX3:1:11101:16907:2810 1:N:0:AGTTCC
GGCACCGAACGCCTTGGCAGCCAAAGCCATAGCCGGCACGAACTGACGGTCGCCGACCGTCTTGCCGCCGCCCGCTCCGGGACGCTGCACCGAGTGGGTACAGTCCATTATCACGCGTGGCGTTATCTGCTTCATATCGGGAATATTGCGGAAATCAACCACCAAGTTATTGTACCCGAAGCTGTTGCCTCGCTCTATCAACCACACGTTTTCGTTACCGCTCTCGCGCACTTTCTGCACGG
>NB502039:96:HGLYGBGX3:1:11101:20216:2823 1:N:0:AGTTCC
TAAAGGCAAATGGCTCTATCATGAAATCCTGGAGCCGGGCGTGTTGGTGCATGTTTCTGAGAGCGGTGCCAAAGTATGGACCGTTCGCTGTGGTTCCCCCCGTCTGGTAACGGTCAATTATGTTCGCG
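The four records above look well formed, but they do not say anything about the rest of the file. A minimal sketch of a whole-file check follows. `check_fasta` is a hypothetical helper name, and it assumes plain nucleotide sequences (A, C, G, T, N only), so blank lines, lines with Windows line endings, or IUPAC ambiguity codes would all be counted as "bad".

```shell
# Hypothetical helper: summarise the structure of a FASTA file.
# Counts header lines, sequence lines, and lines that are neither.
# Assumption: sequences contain only A, C, G, T, or N (either case).
check_fasta() {
    awk '
        /^>/              { headers++; next }   # record header
        /^[ACGTNacgtn]+$/ { seqs++; next }      # clean sequence line
                          { bad++ }             # blank, CR-terminated, or other
        END { printf "headers=%d seqlines=%d bad=%d\n", headers, seqs, bad }
    ' "$1"
}
```

Running `check_fasta /home/hd55218/BBSplit/QualTrimmed_bran11.fasta` on a single-line-per-record file like the one shown should report equal header and sequence counts and `bad=0`, provided the assumptions above hold.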
This FASTA file was converted from a FASTQ file using:
Code:
paste - - - - < Qualtrimmed_bran11.fastq | cut -f 1,2 | sed 's/^@/>/' | tr "\t" "\n" > Qualtrimmed_bran11.fasta
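One thing that may be worth ruling out: the conversion command writes `Qualtrimmed_bran11.fasta` (lowercase "t"), while the BBSplit command reads `QualTrimmed_bran11.fasta` (capital "T"). On a case-sensitive filesystem those are two different names. Whether this is what triggers the assertion is an assumption, not something confirmed here. A rough sketch of a check is below; `check_path_case` is a hypothetical helper name, and `-maxdepth` and `-iname` assume GNU or BSD `find`.

```shell
# Hypothetical helper: report whether a path exists exactly as typed and,
# if not, list files in the same directory whose names differ only in case.
check_path_case() {
    path=$1
    dir=$(dirname "$path")
    name=$(basename "$path")
    if [ -f "$path" ]; then
        echo "exact match: $path"
    else
        echo "no exact match: $path"
        # -iname compares names case-insensitively (GNU and BSD find)
        find "$dir" -maxdepth 1 -type f -iname "$name"
    fi
}
```

For example, `check_path_case /home/hd55218/BBSplit/QualTrimmed_bran11.fasta` would either confirm the file exists under that exact name or print any near-identical names found in the same directory.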
I am stumped as to why my FASTA format is invalid, so any thoughts/help would be greatly appreciated! Thanks!

Last edited by GenoMax; 03-21-2019 at 11:36 AM.