03-04-2015, 05:32 PM   #21
Brian Bushnell

Ah - in this case it's capitalization - should be BBMap, not BBmap. I should have caught that the first time.

03-04-2015, 06:29 PM   #22
everestial

Thanks Brian.

03-05-2015, 08:04 AM   #23
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Hi Brian,

I was able to index the genome into one file. Looking at the summary, it seems to contain chr1 plus the mitochondrial and chloroplast genomes. That's fine for now; I may be able to add sequences from the other chromosomes later.

For now I tried to map RNA-seq reads from one of my samples as a test. I am not worried about splicing right now. But I have been getting errors, and I am not sure what they mean.
I am posting screenshots.

Also, what should I do to allow for splice junctions (which I would expect for RNA-seq reads)? I have been reading the readme file. I understand the concept, but for some reason the whole coding process still doesn't make very good sense to me.

I was able to run a FastQC quality check on iPlant. Is there any parameter in BBMap equivalent to this quality-check app?

Thanks,
03-05-2015, 08:08 AM   #24
GenoMax

Once you have the index files ready, you use them with the path=C:\bbmap\refreads directive (if that path is right). Do not use ref=.

03-05-2015, 08:23 AM   #25
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Quote:
 Originally Posted by everestial Also, to allow splice junction (which I would expect for RNAseq reads) what should I do. I have been reading the readme file. I understand the concept but for some reason whole coding process still doesn't make very good sense to me.
Here is a post to help understand minimum options you could use: http://seqanswers.com/forums/showpos...68&postcount=3

03-05-2015, 08:32 AM   #26
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

It seems like there is some problem with Java, but I am not sure.

First, I deleted the previous index file and created a new one (the cmd prompt window on the left). There is an error message at the bottom saying Exception in thread.........
But a ref folder is created in G: and there are fasta index files.

So I proceeded with mapping one of my RNA-seq samples.
As the reference had already been set in the previous command, I just ran align2.BBmap with in=samplename.fasta and out=testF1.fasta.
But I am still getting the same error.
03-05-2015, 08:39 AM   #27
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
 Originally Posted by GenoMax Here is a post to help understand minimum options you could use: http://seqanswers.com/forums/showpos...68&postcount=3 Read through the thread for several additional pointers that Brian has provided for additional questions.
Well, that command for RNA-seq reads makes perfect sense. I will follow up with this command and see how my results turn out.

Thanks,

03-05-2015, 08:52 AM   #28
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Quote:
 Originally Posted by everestial Seems like there is some problem with java. I am not sure. First, I deleted the previous index file and created a new one. The cmd prompt window on the left. There is an error message at the bottom saying Exception in thread......... But, a ref folder is created in G: and there are fasta index files. So, I proceed with mapping one of my RNAseq sample As, the reference had already been set in the previous command I just run align2.BBmap with in=samplename.fasta and out=testF1.fasta But, still getting the same error.
You probably want to create a separate folder when you make the index files; that way it would be less confusing to use that path in your ref= part.

Can you post the exact commands you are using:

a) to create the index
b) to do mapping

The ref= directive only needs to include the path up to a top-level directory. In that directory there should be a "ref" directory (and additional sub-directories, assuming your index creation worked right).
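
A quick way to verify that layout before mapping, as a rough Python sketch. The function name `list_index_subdirs` is made up for illustration and is not part of BBMap; the only thing it assumes about the layout is the "ref" directory described above.

```python
from pathlib import Path


def list_index_subdirs(top_level):
    """Return the sub-directory names found under <top_level>/ref.

    Returns None when there is no "ref" directory at all, which would
    suggest that index creation did not write anything to this location.
    """
    ref_dir = Path(top_level) / "ref"
    if not ref_dir.is_dir():
        return None
    # Only report directories; stray files directly under ref/ are ignored.
    return sorted(entry.name for entry in ref_dir.iterdir() if entry.is_dir())
```

Calling `list_index_subdirs("G:\\")` on the machine in question would return None if no ref folder exists there, or a list of whatever sub-directories were created.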

03-05-2015, 09:25 AM   #29
everestial

Explanation:
I have my F1 reads in F: (due to space concerns).
I have my bbmap extracted in G:
I have my concatenated index reads in G:\bbmap, named "indexreads.fq".
I run cmd as admin and then move to G: (which I think shouldn't really matter).

To prepare the index file I type:
G:\>java -Xmx1g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\bbmap\indexreads.fq build=1 genScaffoldInfo=true

Message:
Executing align2.BBMap [ref=, build=1, genScaffoldInfo=true]
BBMap version 34.56
Retaining first best site only for ambiguous mappings.
No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [G:\bbmap\indexreads.fq, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stopped=8000, nodisk=false]
Set genScaffoldInfo=true
Writing chunk 1
Set genome to 1
Loaded Reference: 4.224 seconds
Loading index for chunk 1-1, build 1
No index available; generating from reference genome: G:\\ref\index\1\chr1_index_k13_c3_b1.block
Indexing threads started for block 0-1
Exception in thread "Thread-4" java.lang.OutOfMemoryError: Java heap space
at align2.IndexMaker4$BlockMaker$CountThread.run

To map the genome:
G:\>java -Xmx1g -e -cp G:\bbmap\current\ align2.BBMap in=F:\F1_extracted\R1_001.fastq out=F:\mapped_test01.sam maxindel=100000 xstag=firststrand intronlen=10 ambig=random

The program executes, but I get the exact same message at the bottom, starting from "Loaded Reference" (except for the time).

03-05-2015, 09:34 AM   #30
GenoMax
Senior Member

Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975

Quote:
Originally Posted by everestial
G:\>java -Xmx1g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\bbmap\indexreads.fq build=1 genScaffoldInfo=true
Message:
Executing align2.BBMap [ref=, build=1, genScaffoldInfo=true]
BBMap version 34.56
Retaining first best site only for ambiguous mappings.
No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [G:\bbmap\indexreads.fq, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stopped=8000, nodisk=false]
Set genScaffoldInfo=true
Writing chunk 1
Set genome to 1
Loaded Reference: 4.224 seconds
Loading index for chunk 1-1, build 1
No index available; generating from reference genome: G:\\ref\index\1\chr1_index_k13_c3_b1.block
Indexing threads started for block 0-1
Exception in thread "Thread-4" java.lang.OutOfMemoryError: Java heap space
at align2.IndexMaker4$BlockMaker$CountThread.run
You appear to be running out of memory. Increase -Xmx1g to -Xmx4g in the first command. Until this completes without errors do not run the second command.

03-05-2015, 09:38 AM   #31
GenoMax

Put your reference sequence in a new directory and then make the index.

Code:
G:\>java -Xmx4g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\MAKE_NEW\indexreads.fq build=1 genScaffoldInfo=true

03-05-2015, 09:47 AM   #32
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
 Originally Posted by GenoMax You appear to be running out of memory. Increase -Xmx1g to -Xmx4g in the first command. Until this completes without errors do not run the second command. How big is your reference?
The reference is a 200 MB file.

03-05-2015, 09:48 AM   #33
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
Originally Posted by GenoMax
Put your reference sequence in a new directory and then make the index.
Code:
G:\>java -Xmx4g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\MAKE_NEW\indexreads.fq build=1 genScaffoldInfo=true
My RAM is 4 GB, so I am going to use 3 GB instead, and if it fails I will run with 4 GB. I will let you know what happens.

Thanks,

03-05-2015, 09:49 AM   #34
Brian Bushnell
Super Moderator

Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

Quote:
 Originally Posted by everestial Also, to allow splice junction (which I would expect for RNAseq reads) what should I do. I have been reading the readme file. I understand the concept but for some reason whole coding process still doesn't make very good sense to me. I was able to do fastqc quality check on iplant. Is there any parameter in BBMap equivalent to this quality check app. Thanks,
The default settings of BBMap work fine for plant introns. As for quality... the bhist, qhist, qahist, aqhist, mhist, and ehist flags will allow you to print histograms indicating read quality. You can plot them in Excel or whatever, but BBMap does not directly provide graphical output like FastQC.

As for your error messages, as GenoMax noted, you are running out of memory. You can determine how to set the -Xmx flag by running AssemblyStats like this:

Code:
java -Xmx1g -ea -cp G:\bbmap\current\ jgi.AssemblyStats2 in=reference.fa k=13
At the bottom, it will tell you how much memory is needed.

Also, neither BBMap nor AssemblyStats supports a reference in fastq format (.fq); it has to be in fasta (.fa) format. I'm not entirely sure what "indexreads.fq" is, but the "ref=" argument needs to point to the fasta file of the genome.
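
If it is unclear which format a file is in, the first record header usually settles it: FASTA headers start with ">" and FASTQ headers start with "@". A small Python sketch of that check follows. The helper names are hypothetical, this is not a BBMap feature, and it does not handle gzipped files.

```python
def sniff_sequence_format(first_line):
    """Guess the format from the first non-empty line of a sequence file."""
    if first_line.startswith(">"):
        return "fasta"
    if first_line.startswith("@"):
        return "fastq"
    return "unknown"


def sniff_file(path):
    """Return 'fasta', 'fastq' or 'unknown' for the plain-text file at path."""
    with open(path, "r", encoding="ascii", errors="replace") as handle:
        for line in handle:
            stripped = line.strip()
            if stripped:
                # The first non-blank line is the first record header.
                return sniff_sequence_format(stripped)
    # Empty file, or only blank lines.
    return "unknown"
```

A file that reports "fastq" here would need to be replaced by the genome's fasta file before being passed to ref=.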

Edit:

A 200Mbp genome should run fine with 2g, but not with 1g.

Last edited by Brian Bushnell; 03-05-2015 at 09:52 AM.

03-05-2015, 09:54 AM   #35
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
 Originally Posted by everestial My ram is 4gb so I am going to use 3gb instead and if it fails I will run with 4gb. I will let you know what happens. Thanks,
With -Xmx3g and -Xmx4g I get the message that the space could not be reserved.
Then I set it to -Xmx1500m.
Still the same message.

03-05-2015, 09:55 AM   #36
GenoMax

Are you using 32-bit Windows on this machine (and 32-bit Java)?

03-05-2015, 10:03 AM   #37
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
Originally Posted by Brian Bushnell
The default settings of BBMap work fine for plant introns. As for quality... the bhist, qhist, qahist, aqhist, mhist, and ehist flags will allow you to print histograms indicating read quality. You can plot them in Excel or whatever, but BBMap does not directly provide graphical output like FastQC.
As for your error messages, as GenoMax noted, you are running out of memory. You can determine how to set the -Xmx flag by running AssemblyStats like this:
Code:
java -Xmx1g -ea -cp G:\bbmap\current\ jgi.AssemblyStats2 in=reference.fa k=13
At the bottom, it will tell you how much memory is needed.
Also, neither BBMap nor AssemblyStats supports a reference in fastq format (.fq); it has to be in fasta (.fa) format. I'm not entirely sure what "indexreads.fq" is, but the "ref=" argument needs to point to the fasta file of the genome.
Edit:
A 200Mbp genome should run fine with 2g, but not with 1g.
I am going to run the command incorporating all your advice. Hope it will work.
Thanks,

03-05-2015, 10:28 AM   #38
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
 Originally Posted by everestial I am going to run the command incorporating all you advise. Hope it will work. Thanks,
Alright, overall it seems like a memory issue. I corrected the files to fa (fasta) format by a new concatenation. The RAM of my computer is 4 GB, but I cannot allocate more than 1400 MB of the available memory. I get the error message: could not reserve enough space for 1536000KB object heap.

AssemblyStats isn't helping either.
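
For what it's worth, the number in that error message is just the requested heap expressed in KB, which suggests the JVM refused the -Xmx1500m request itself. A small Python sketch of the arithmetic follows. The helper name is hypothetical, and it only handles the k/m/g suffix forms used in this thread.

```python
import re

# Multipliers to convert each -Xmx suffix into KB (1 MB = 1024 KB).
_UNITS_KB = {"k": 1, "m": 1024, "g": 1024 * 1024}


def xmx_to_kb(flag):
    """Convert a flag such as '-Xmx1500m' into the heap size in KB."""
    match = re.fullmatch(r"-Xmx(\d+)([kKmMgG])", flag)
    if match is None:
        raise ValueError("unsupported -Xmx flag: " + flag)
    value, unit = match.groups()
    return int(value) * _UNITS_KB[unit.lower()]
```

Here `xmx_to_kb("-Xmx1500m")` gives 1536000, the same figure that appears in the error message.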

Last edited by everestial; 03-05-2015 at 10:33 AM.

03-05-2015, 10:31 AM   #39
Brian Bushnell

Sounds like you have a 32-bit version of Windows and/or Java, which may be enough in this case, we'll see. What was the output of AssemblyStats?

03-05-2015, 10:32 AM   #40
everestial
Member

Location: North Carolina

Join Date: Feb 2015
Posts: 31

Quote:
Originally Posted by everestial Alright, overall it seems like a memory issue. I corrected the files to fa (fasta) format by a new concatenation. The RAM of my computer is 4 GB, but I cannot allocate more than 1400 MB of the available memory. I get the error message: could not reserve enough space for 1536000KB object heap. AssemblyStats isn't helping either.
Well, AssemblyStats worked now:
BBMap minimum memory estimate at k=13: -Xmx2770m <at least 3080 MB physical RAM>

My RAM is 4 GB. What would you suggest in this case? Is there a way to increase the memory that can be used?
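
To put those numbers side by side: the estimate asks for a 2770 MB heap, while the largest heap that could be reserved earlier was about 1400 MB. A minimal Python sketch that pulls the two figures out of the estimate line and compares them with a given heap limit is below. The function names are hypothetical, and the regular expression assumes the line looks exactly like the one quoted above.

```python
import re

# The estimate line exactly as reported in this thread.
ESTIMATE_LINE = (
    "BBMap minimum memory estimate at k=13: "
    "-Xmx2770m <at least 3080 MB physical RAM>"
)


def parse_estimate(line):
    """Return (heap_mb, physical_ram_mb) taken from the estimate line."""
    match = re.search(r"-Xmx(\d+)m\D+?(\d+) MB physical RAM", line)
    if match is None:
        raise ValueError("line does not look like a memory estimate")
    return int(match.group(1)), int(match.group(2))


def heap_is_enough(line, max_heap_mb):
    """True when a heap of max_heap_mb MB covers the estimated minimum."""
    needed_heap_mb, _ = parse_estimate(line)
    return max_heap_mb >= needed_heap_mb
```

With the line from this thread, `heap_is_enough(ESTIMATE_LINE, 1400)` is False, so the limit on how much heap Java can reserve is the thing to solve, not the amount of installed RAM.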

