SEQanswers

Old 03-04-2015, 05:32 PM   #21
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Ah - in this case it's capitalization - should be BBMap, not BBmap. I should have caught that the first time.
Old 03-04-2015, 06:29 PM   #22
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Thanks Brian.
Old 03-05-2015, 08:04 AM   #23
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Hi Brian,

I was able to index the genome into one file. Judging from the summary, it contains chr1 plus the mitochondrial and chloroplast genomes. That's fine for now; I may be able to add sequences from the other chromosomes later.

For now, I tried to map the RNAseq reads from one of my samples as a test. I am not worried about splicing right now. But I have been getting errors, and I am not sure what they mean.
I am posting screenshots.

Also, what should I do to allow splice junctions (which I would expect for RNAseq reads)? I have been reading the readme file. I understand the concept, but for some reason the whole coding process still doesn't make much sense to me.

I was able to run a FastQC quality check on iPlant. Is there any parameter in BBMap equivalent to this quality-check app?

Thanks,
Attached Images
File Type: png Screenshot 2015-03-05 11.43.45.png (134.7 KB, 3 views)
Old 03-05-2015, 08:08 AM   #24
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Once you have the index files ready you use them with path=C:\bbmap\refreads directive (if that path is right). Do not use ref=.
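A minimal sketch of the mapping call described above, written in Python for illustration. It only assembles the argument list and does not run anything. The helper name `mapping_command`, the default heap size, and the example paths are hypothetical; `path=`, `in=`, `out=`, `-cp` and `align2.BBMap` are the ones used in this thread.

```python
def mapping_command(bbmap_current, index_dir, reads, out_sam, heap="2g"):
    """Build the java argument list for mapping against an existing index.

    index_dir is the folder that contains the "ref" directory. It is passed
    as path= and no ref= is given, so the index is reused, not rebuilt.
    The heap default is an assumption; adjust it to your machine.
    """
    return [
        "java", f"-Xmx{heap}", "-ea",
        "-cp", bbmap_current,
        "align2.BBMap",
        f"path={index_dir}",
        f"in={reads}",
        f"out={out_sam}",
    ]


if __name__ == "__main__":
    # Example paths only; substitute your own locations.
    cmd = mapping_command(
        r"G:\bbmap\current",
        r"G:\bbmap",
        r"F:\F1_extracted\R1_001.fastq",
        r"F:\mapped_test01.sam",
    )
    print(" ".join(cmd))
```

The list can be handed to `subprocess.run(cmd)` once the index has been built without errors.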
Old 03-05-2015, 08:23 AM   #25
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Quote:
Originally Posted by everestial View Post
Also, what should I do to allow splice junctions (which I would expect for RNAseq reads)? I have been reading the readme file. I understand the concept, but for some reason the whole coding process still doesn't make much sense to me.
Here is a post to help understand minimum options you could use: http://seqanswers.com/forums/showpos...68&postcount=3

Read through the thread for several additional pointers that Brian has provided for additional questions.
Old 03-05-2015, 08:32 AM   #26
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Seems like there is some problem with Java. I am not sure.

First, I deleted the previous index file and created a new one. The cmd prompt window is on the left of the screenshot. There is an error message at the bottom saying
Exception in thread.........
But a ref folder is created in G: and there are fasta index files.

So I proceeded with mapping one of my RNAseq samples.
As the reference had already been set in the previous command, I just ran align2.BBMap with in=samplename.fasta and out=testF1.fasta
But I am still getting the same error.
Attached Images
File Type: png Screenshot 2015-03-05 12.29.40.png (78.0 KB, 2 views)
Old 03-05-2015, 08:39 AM   #27
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by GenoMax View Post
Here is a post to help understand minimum options you could use: http://seqanswers.com/forums/showpos...68&postcount=3

Read through the thread for several additional pointers that Brian has provided for additional questions.
Well, that command for RNAseq reads makes perfect sense. I will follow up with this command and see how my results turn out.

Thanks,
Old 03-05-2015, 08:52 AM   #28
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Quote:
Originally Posted by everestial View Post
Seems like there is some problem with Java. I am not sure.

First, I deleted the previous index file and created a new one. The cmd prompt window is on the left of the screenshot. There is an error message at the bottom saying
Exception in thread.........
But a ref folder is created in G: and there are fasta index files.

So I proceeded with mapping one of my RNAseq samples.
As the reference had already been set in the previous command, I just ran align2.BBMap with in=samplename.fasta and out=testF1.fasta
But I am still getting the same error.
You probably want to create a separate folder when you make the index files. That way it would be less confusing to use that folder in your path= part.

Can you post the exact commands you are using for:

a) to create the index
b) to do mapping

Screenshots are hard to read.

The path= directive only needs to include the path up to a top-level directory. In that directory there should be a "ref" directory (and additional sub-directories, assuming your index creation worked right).
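A minimal sketch, in Python, of a quick check that the top-level directory described above really holds an index before you start mapping. The `ref/index/<build>` layout is inferred from the log line in this thread (`G:\\ref\index\1\chr1_index_k13_c3_b1.block`); the function name `index_looks_built` is hypothetical, and the check does not validate the index files themselves.

```python
from pathlib import Path


def index_looks_built(top_dir, build=1):
    """Return True if top_dir/ref/index/<build> exists and holds at least one file."""
    index_dir = Path(top_dir) / "ref" / "index" / str(build)
    if not index_dir.is_dir():
        return False
    return any(p.is_file() for p in index_dir.iterdir())


if __name__ == "__main__":
    # Example location only; point this at the directory that contains "ref".
    print(index_looks_built(r"G:\MAKE_NEW"))
```

A partially written index from a crashed run could still pass this check, so it is only a first sanity test.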
Old 03-05-2015, 09:25 AM   #29
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Explanation:
I have my F1 reads in F: (due to space concern)
I have my bbmap extracted in G:
I have my concatenated index reads in G:\bbmap named "indexreads.fq"

I run cmd as admin and then move to G: (which I think shouldn't really matter)

To prepare the index file I type
G:\>java -Xmx1g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\bbmap\indexreads.fq build=1 genScaffoldInfo=true

Message:
Executing align2.BBMap [ref=<path to the indexreads.fq>, build=1, genScaffoldInfo=true]

BBMap version 34.56
Retaining first best site only for ambiguous mappings.
No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [G:\bbmap\indexreads.fq, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stopped=8000, nodisk=false]

Set genScaffoldInfo=true
Writing chunk 1
Set genome to 1

Loaded Reference: 4.224 seconds
Loading index for chunk 1-1, build 1
No index available; generating from reference genome: G:\\ref\index\1\chr1_index_k13_c3_b1.block
Indexing threads started for block 0-1
Exception in thread "Thread-4" java.lang.OutOfMemoryError: Java heap space at align2.IndexMaker4$BlockMaker$CountThread.run<IndexMaker4.java:280>


To map the reads:
G:\>java -Xmx1g -ea -cp G:\bbmap\current\ align2.BBMap in=F:\F1_extracted\R1_001.fastq out=F:\mapped_test01.sam maxindel=100000 xstag=firststrand intronlen=10 ambig=random

The program executes, but I get exactly the same message at the bottom, starting from "Loaded Reference" (except for the time).
Old 03-05-2015, 09:34 AM   #30
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Quote:
Originally Posted by everestial View Post
G:\>java -Xmx1g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\bbmap\indexreads.fq build=1 genScaffoldInfo=true

Message:
Executing align2.BBMap [ref=<path to the indexreads.fq>, build=1, genScaffoldInfo=true]

BBMap version 34.56
Retaining first best site only for ambiguous mappings.
No output file.
Writing reference.
Executing dna.FastaToChromArrays2 [G:\bbmap\indexreads.fq, 1, writeinthread=false, genscaffoldinfo=true, retain, waitforwriting=false, gz=true, maxlen=536670912, writechroms=true, minscaf=1, midpad=300, startpad=8000, stopped=8000, nodisk=false]

Set genScaffoldInfo=true
Writing chunk 1
Set genome to 1

Loaded Reference: 4.224 seconds
Loading index for chunk 1-1, build 1
No index available; generating from reference genome: G:\\ref\index\1\chr1_index_k13_c3_b1.block
Indexing threads started for block 0-1
Exception in thread "Thread-4" java.lang.OutOfMemoryError: Java heap space at align2.IndexMaker4$BlockMaker$CountThread.run<IndexMaker4.java:280>
You appear to be running out of memory. Increase -Xmx1g to -Xmx4g in the first command. Until this completes without errors do not run the second command.

How big is your reference?
Old 03-05-2015, 09:38 AM   #31
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Put your reference sequence in a new directory and then make the index.

Code:
G:\>java -Xmx4g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\MAKE_NEW\indexreads.fq build=1 genScaffoldInfo=true

Last edited by GenoMax; 03-05-2015 at 09:41 AM.
Old 03-05-2015, 09:47 AM   #32
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by GenoMax View Post
You appear to be running out of memory. Increase -Xmx1g to -Xmx4g in the first command. Until this completes without errors do not run the second command.

How big is your reference?
The reference is a 200 MB file.
Old 03-05-2015, 09:48 AM   #33
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by GenoMax View Post
Put your reference sequence in a new directory and then make the index.

Code:
G:\>java -Xmx4g -ea -cp G:\bbmap\current\ align2.BBMap ref=G:\MAKE_NEW\indexreads.fq build=1 genScaffoldInfo=true
My ram is 4gb so I am going to use 3gb instead and if it fails I will run with 4gb. I will let you know what happens.

Thanks,
Old 03-05-2015, 09:49 AM   #34
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Quote:
Originally Posted by everestial View Post
Also, what should I do to allow splice junctions (which I would expect for RNAseq reads)? I have been reading the readme file. I understand the concept, but for some reason the whole coding process still doesn't make much sense to me.

I was able to run a FastQC quality check on iPlant. Is there any parameter in BBMap equivalent to this quality-check app?

Thanks,
The default settings of BBMap work fine for plant introns. As for quality... the bhist, qhist, qahist, aqhist, mhist, and ehist flags will allow you to print histograms indicating read quality. You can plot them in Excel or whatever; but BBMap does not directly provide graphical output like Fastqc.

As for your error messages, as GenoMax noted, you are running out of memory. You can determine how to set the -Xmx flag by running AssemblyStats like this:

Code:
java -Xmx1g -ea -cp G:\bbmap\current\ jgi.AssemblyStats2 in=reference.fa k=13
At the bottom, it will tell you how much memory is needed.

Also, neither BBMap nor AssemblyStats supports a reference in fastq format (.fq); it has to be in fasta (.fa) format. I'm not entirely sure what "indexreads.fq" is, but the "ref=" argument needs to point to the fasta file of the genome.

Edit:

A 200Mbp genome should run fine with 2g, but not with 1g.
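A minimal sketch, in Python, of a pre-flight check for the file passed to `ref=`. It relies only on the standard convention that FASTA records start with `>` and FASTQ records start with `@`; the function name `sequence_format` is hypothetical, and the check looks at the first non-blank line only, so it will not catch a file that mixes formats further down.

```python
def sequence_format(path):
    """Guess 'fasta' or 'fastq' from the first non-blank line of a text file."""
    with open(path, "r") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                return "fasta"
            if line.startswith("@"):
                return "fastq"
            return "unknown"
    return "unknown"  # empty file


if __name__ == "__main__":
    import sys

    if len(sys.argv) > 1:
        print(sequence_format(sys.argv[1]))
```

Running it on the reference before indexing shows whether the file needs converting; renaming a `.fq` file to `.fa` does not change its contents.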

Last edited by Brian Bushnell; 03-05-2015 at 09:52 AM.
Old 03-05-2015, 09:54 AM   #35
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by everestial View Post
My ram is 4gb so I am going to use 3gb instead and if it fails I will run with 4gb. I will let you know what happens.

Thanks,
With -Xmx3g and -Xmx4g
I get the message that the space could not be reserved.
Then I set it to -Xmx1500m
and still got the same message.
Old 03-05-2015, 09:55 AM   #36
GenoMax
Senior Member
 
Location: East Coast USA

Join Date: Feb 2008
Posts: 6,975
Default

Are you using 32-bit windows on this machine (and 32-bit java)?
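One way to answer this is to look at the banner printed by `java -version`. The sketch below, in Python, assumes a HotSpot/OpenJDK-style banner, where 64-bit builds include the text "64-Bit" in the VM line; other JVMs may word it differently. The function names are hypothetical.

```python
import subprocess


def is_64bit_jvm(version_text):
    """True if the `java -version` banner mentions a 64-bit VM."""
    return "64-bit" in version_text.lower()


def java_version_text(java="java"):
    """Run `java -version` and return its banner as text."""
    # The banner is normally written to stderr, so both streams are collected.
    result = subprocess.run([java, "-version"], capture_output=True, text=True)
    return result.stderr + result.stdout


if __name__ == "__main__":
    try:
        print("64-bit JVM:", is_64bit_jvm(java_version_text()))
    except FileNotFoundError:
        print("java was not found on PATH")
```

A 32-bit JVM cannot reserve a large heap regardless of how much RAM is installed, which would be consistent with the "could not be reserved" message reported above.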
Old 03-05-2015, 10:03 AM   #37
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by Brian Bushnell View Post
The default settings of BBMap work fine for plant introns. As for quality... the bhist, qhist, qahist, aqhist, mhist, and ehist flags will allow you to print histograms indicating read quality. You can plot them in Excel or whatever; but BBMap does not directly provide graphical output like Fastqc.

As for your error messages, as GenoMax noted, you are running out of memory. You can determine how to set the -Xmx flag by running AssemblyStats like this:

Code:
java -Xmx1g -ea -cp G:\bbmap\current\ jgi.AssemblyStats2 in=reference.fa k=13
At the bottom, it will tell you how much memory is needed.

Also, neither BBMap nor AssemblyStats supports a reference in fastq format (.fq); it has to be in fasta (.fa) format. I'm not entirely sure what "indexreads.fq" is, but the "ref=" argument needs to point to the fasta file of the genome.

Edit:

A 200Mbp genome should run fine with 2g, but not with 1g.
I am going to run the command incorporating all your advice. I hope it will work.
Thanks,
Old 03-05-2015, 10:28 AM   #38
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by everestial View Post
I am going to run the command incorporating all your advice. I hope it will work.
Thanks,
Alright, overall it seems like a memory issue. I corrected the files to .fa (fasta) format with a new concatenation. My computer has 4 GB of RAM, but I cannot allocate more than 1400 MB of the available memory. I get the error message: could not reserve enough space for 1536000KB object heap.

AssemblyStats isn't helping either.

Last edited by everestial; 03-05-2015 at 10:33 AM.
Old 03-05-2015, 10:31 AM   #39
Brian Bushnell
Super Moderator
 
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707
Default

Sounds like you have a 32-bit version of Windows and/or Java, which may be enough in this case, we'll see. What was the output of AssemblyStats?
Old 03-05-2015, 10:32 AM   #40
everestial
Member
 
Location: North Carolina

Join Date: Feb 2015
Posts: 31
Default

Quote:
Originally Posted by everestial View Post
Alright, overall it seems like a memory issue. I corrected the files to .fa (fasta) format with a new concatenation. My computer has 4 GB of RAM, but I cannot allocate more than 1400 MB of the available memory. I get the error message: could not reserve enough space for 1536000KB object heap.

AssemblyStats isn't helping either.
Well, AssemblyStats worked now:
BBMap minimum memory estimate at k=13: -Xmx2770m <at least 3080 MB physical RAM>

My RAM is 4gb. What would you suggest in this case? Is there a way to boost up the memory usage?
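The arithmetic behind the problem can be sketched in a few lines of Python: pull the `-Xmx` figure out of the AssemblyStats line and compare it with the largest heap the JVM has actually accepted. The function names are hypothetical, the estimate line is the one quoted above, and the 1400 MB ceiling is the value reported earlier in this thread, not a general constant.

```python
import re


def parse_xmx_mb(text):
    """Extract the first -Xmx value in text and return it in megabytes, or None."""
    match = re.search(r"-Xmx(\d+)([mMgG])", text)
    if match is None:
        return None
    value = int(match.group(1))
    unit = match.group(2).lower()
    return value * 1024 if unit == "g" else value


def heap_fits(required_mb, max_heap_mb):
    """True if the required heap is within what the JVM can reserve."""
    return required_mb <= max_heap_mb


if __name__ == "__main__":
    line = "BBMap minimum memory estimate at k=13: -Xmx2770m <at least 3080 MB physical RAM>"
    required = parse_xmx_mb(line)
    # 1400 MB is the largest heap that could be reserved on this machine.
    print(required, heap_fits(required, 1400))
```

With these numbers the required heap is about twice what the JVM will reserve, so the gap is between the estimate and the heap the JVM accepts, not between the estimate and the installed 4 GB.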
Tags
adapter contamination, fastqc, rnaseq alignment, rnaseq data
