SEQanswers


flobpf 09-26-2012 11:57 AM

BFAST match error
 
Hi,

I'm getting the following error from BFAST with my colorspace SOLiD reads in FASTQ format, and I can't figure out what is causing it.

All my reads are >20bp and I'm using a subset of all reads to test the program here.
Quote:

$ bfast match -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -r ../test.fastq -A 1
************************************************************
Checking input parameters supplied by the user ...
Validating fastaFileName Mus_musculus.GRCm38.68.dna_rm.toplevel.fa.
Validating readsFileName ../test.fastq.
Validating tmpDir path ./.
**** Input arguments look good!
************************************************************
************************************************************
Printing Program Parameters:
programMode: [ExecuteProgram]
fastaFileName: Mus_musculus.GRCm38.68.dna_rm.toplevel.fa
mainIndexes [Auto-recognizing]
secondaryIndexes [Not Using]
readsFileName: ../test.fastq
offsets: [Using All]
loadAllIndexes: [Not Using]
compression: [Not Using]
space: [Color Space]
startReadNum: 1
endReadNum: 2147483647
keySize: [Not Using]
maxKeyMatches: 8
keyMissFraction: 1.000000
maxNumMatches: 384
whichStrand: [Both Strands]
numThreads: 1
queueLength: 250000
tmpDir: ./
timing: [Not Using]
************************************************************
Searching for main indexes...
Found 1 index (4 total files).
Not using secondary indexes.
************************************************************
Reading in reference genome from Mus_musculus.GRCm38.68.dna_rm.toplevel.fa.cs.brg.
In total read 66 contigs for a total of 2730871774 bases
************************************************************
Reading ../test.fastq into a temp file.
Will process 250 reads.
************************************************************
Searching index file 1/4 (index #1, bin #1)...
Reading index from Mus_musculus.GRCm38.68.dna_rm.toplevel.fa.cs.1.1.bif.
bfast: ../bfast/RGIndex.c:2015: RGIndexReadHeader: Assertion `index->length > 0' failed.
Aborted
I thought this might be because I gave BFAST an incomplete dataset. However, if I use the entire dataset (all SOLiD FASTQ reads), I get the following error:
Quote:

*** glibc detected *** bfast: malloc(): memory corruption: 0x000000000220bcd0 **
which seems like BFAST is running out of usable memory, even though I'm allocating 20 GB of memory for a single lane of SOLiD FASTQ reads.

Any ideas on how to solve this problem?

Thanks

nilshomer 09-27-2012 08:57 PM

It looks like your index is corrupt, try rebuilding your indexes.
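
Before rebuilding, a quick check for missing or zero-byte files can help narrow things down. The snippet below is a minimal sketch, not part of BFAST: `check_bfast_files` is a hypothetical helper name, and a non-empty file can still be corrupt, so passing this check does not prove the index is good.

```shell
# Hypothetical helper (not a bfast command): report files that are missing or empty.
# Usage: check_bfast_files FILE...
check_bfast_files() {
    status=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then
            echo "MISSING: $f"
            status=1
        elif [ ! -s "$f" ]; then
            # -s is true only for files larger than zero bytes
            echo "EMPTY: $f"
            status=1
        else
            echo "OK: $f"
        fi
    done
    return "$status"
}
```

With the file names from the log above, it could be called as `check_bfast_files Mus_musculus.GRCm38.68.dna_rm.toplevel.fa.cs.brg Mus_musculus.GRCm38.68.dna_rm.toplevel.fa.cs.1.1.bif`.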

flobpf 10-01-2012 10:50 AM

Quote:

Originally Posted by nilshomer (Post 85148)
It looks like your index is corrupt, try rebuilding your indexes.

Hi nilshomer,

Thanks for the reply. I'm making the index using the bfast fasta2brg function while specifying -A 1. The genome is in base space, however I want to align colorspace reads to it. Am I using -A correctly? My rationale was that you want to align colorspace reads to colorspace genome, so -A 1 is the way to go. But maybe I'm mistaken...

Also, what's the correct way to specify the -m option for bfast index? I specified the following, based on a post I saw on this forum:
Code:

bfast fasta2brg -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -A 1 -t out.tab

bfast index -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -A 1 -d 1 -R -T indexTMP -t bfastindex_out.txt -m 10111111011001100011111000111111 -w 14

Thanks for your help

EDIT: added bfast index command

nilshomer 10-01-2012 06:01 PM

A few things: the "-t" option doesn't take an argument, and you forgot to create a base-space version of the reference as well. See the bfast manual that comes with the distribution for examples (Chapter 7), as well as the command-line options.
It's something like this:
Quote:

bfast fasta2brg -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -A 0 -t
bfast fasta2brg -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -A 1 -t
bfast index -f Mus_musculus.GRCm38.68.dna_rm.toplevel.fa -A 1 -T indexTMP -t -m 10111111011001100011111000111111 -w 14
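
One way to avoid indexing against a half-written reference is to chain the three commands so that a failure in any step stops the rest. This is only a sketch of that idea: `build_colorspace_index` and the `BFAST` variable are hypothetical names introduced here, and the options are copied unchanged from the commands quoted above.

```shell
# BFAST is an assumed variable holding the path to the bfast binary.
BFAST=${BFAST:-bfast}

# Hypothetical wrapper: runs the three quoted commands in order and
# stops at the first one that exits with a non-zero status.
build_colorspace_index() {
    ref=$1
    "$BFAST" fasta2brg -f "$ref" -A 0 -t &&
    "$BFAST" fasta2brg -f "$ref" -A 1 -t &&
    "$BFAST" index -f "$ref" -A 1 -T indexTMP -t \
        -m 10111111011001100011111000111111 -w 14
}
```

It would be called as `build_colorspace_index Mus_musculus.GRCm38.68.dna_rm.toplevel.fa`; the function's exit status is that of the first failing step, or of the final index step if the earlier ones succeed.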

simsalabim 03-13-2013 04:48 AM

Hi,
I realize this thread is older but I get a similar error message while using bfast for colorspace alignment.

My data consists of colorspace reads of length 75, dynamically trimmed to lengths >30 where the sequencing quality is bad. The majority of reads still have length 75.

When I use the 10 masks from the bfast manual to build 10 (primary) indexes, the alignment works fine. But many of the shorter trimmed reads are not aligned. So I used 10 more masks to build indexes for shorter reads, which I want to use as secondary indexes. They should only be used for unaligned, (= mostly trimmed) reads. Right?
Anyway, when I try to run bfast as follows:
Quote:

bfast match -f $reference -i 1,2,3,4,5,6,7,8,9,10 -I 11,12,13,14,15,16,17,18,19,20 -r $infile -w 0 -n $nc -A 1 -z -t
I receive the error:
Quote:

Copying unmatched reads for secondary index search.
Splitting unmatched reads into temp files.
*** glibc detected *** bfast: double free or corruption (!prev): 0x000000000065f290 ***
I rebuilt all the index files, but it didn't have any effect.

Is it not possible to use this many indexes? Everything works fine if I only use the 10 primary ones... Or does this combination of indexes not make sense because bfast is not designed to align reads with variable lengths?

Does anybody have suggestions as to what I did wrong? Thanks a lot in advance...
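
When reads have been trimmed to variable lengths, it can help to see how many reads fall at each length before choosing masks. The following is a minimal sketch using standard awk and sort; `read_length_histogram` is a hypothetical helper name, and it assumes plain 4-line FASTQ records. For colorspace FASTQ the sequence line may include a leading primer base, in which case the reported length is one more than the number of colors.

```shell
# Hypothetical helper: print "length count" for each read length in a FASTQ file.
# Assumes every record is exactly 4 lines (sequence on line 2 of each record).
read_length_histogram() {
    awk 'NR % 4 == 2 { n[length($0)]++ }
         END { for (len in n) print len, n[len] }' "$1" | sort -n
}
```

Usage would be `read_length_histogram reads.fastq`, where `reads.fastq` stands in for the input file.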

flobpf 03-13-2013 07:22 AM

I don't recollect how my problem got solved, but probably the solution was changing the version of BFAST. My other glibc problems have certainly been solved by changing the version of the program in question.

simro 03-13-2013 07:33 AM

Thank you for the reply!
I am running the current version of bfast. Are you using an older one? If so, which one works for you?

