SEQanswers

Old 12-20-2013, 10:06 AM   #1
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72
Diary: Assembly in SMRT Portal 2.1.1 with HGAP+CA 8.1

Description of genome and assembly desires
SMRT Portal 2.1.1 was released within the last couple of months (packaged with Celera Assembler 7), and Celera Assembler 8.1 was released within the last week. This thread will serve as a log of my successes and failures at installing and using these two packages together to assemble a single-chromosome, circular bacterial genome. The data are a 5K PB library sequenced on a single SMRT cell, a 1/4 plate of 454 from its own library prep, and various Sanger-derived sequences. The genome in question has long repeats (12 loci and 15 copies of one transposable element; 2 copies of another).

System: single laptop running Ubuntu 12.04 x64; 4 CPUs; dash shell


Installation
Main package versions
http://files.pacb.com/software/smrta...untu-12.04.run
http://iweb.dl.sourceforge.net/proje...-amd64.tar.bz2

leading notes: I'm glad the "SEYMOURHOME" convention is being phased out in favor of SMRT_ROOT; Ubuntu doesn't have an /opt directory by default.
I followed the suggestion of adding a smrtanalysis user to my system; this had several complications:
  1. My standard user account didn't have access to the dropbox directory, nor to other program-related entries. At least initially, the smrtanalysis user has the same administrative permissions as my standard user account, which at least partially defeats the purpose of compartmentalization.
  2. The smrtanalysis user didn't have the same access to mysql as my main user.
commands:
sudo adduser smrtanalysis #note: this adds the user to the main ubuntu login screen.
sudo apt-get install krb5-user #(unlisted depend - I believe without this you cannot create the administrator user in SMRTportal)
sudo usermod -a -G www-data smrtanalysis #(otherwise it cannot interface with a localized apache server)
sudo usermod -a -G smrtanalysis pag #(so my main user will have access to look at the directories - at least it SHOULD after I reboot)
sudo visudo #or sudo usermod -a -G admin smrtanalysis #to add installation privileges to smrtanalysis user
mysql
CREATE USER 'smrtanalysis'@'%';
GRANT CREATE ON *.* to 'smrtanalysis'@'%';
GRANT CREATE ON *.* to 'smrtanalysis'@'localhost';
GRANT ALL PRIVILEGES on *.* to 'smrtanalysis'@'localhost' WITH GRANT OPTION; #some scripts fail without this
GRANT ALL PRIVILEGES on *.* to 'smrtanalysis'@'%' WITH GRANT OPTION;
FLUSH PRIVILEGES;
exit
su smrtanalysis
SMRT_ROOT=/opt/smrtanalysis #as smrtanalysis
sudo mkdir -p $SMRT_ROOT
sudo mkdir -p /tmp/smrtanalysis #without this directory the install script will not run! unsure if sudo needed
sudo chown -R smrtanalysis:smrtanalysis $SMRT_ROOT #otherwise owned by root on my system
sudo chown -R smrtanalysis:smrtanalysis /tmp/smrtanalysis
sudo mv smrtanalysis-2.1.* /opt
cd /opt
bash smrtanalysis-2.1.1-ubuntu-12.04.run --rootdir $SMRT_ROOT # --no-extract #after the first use
(edited some setup scripts to use smrtanalysis or smrtportal user in mysql as well)
cd $SMRT_ROOT
sudo chmod -R ug+rX current/smrtanalysis-2.1.1.128549/ #so my standard user can at least get in (x) and see listings (r) of the subdirectories
sudo chmod ug+x /opt/smrtanalysis/current/etc/setup.sh
cd /opt/smrtanalysis/admin
./tomcatd start #this works under either user
./kodosd start #this would not run under non-smrtanalysis account

#installing current version of wgs...
sudo mv $DOWNLOADPATH/wgs-8.1-Linux-amd64.tar.bz2 $SMRT_ROOT/install/smrtanalysis-2.1.1.128549/analysis/bin #DOWNLOADPATH = wherever the tarball was saved
cd $SMRT_ROOT/install/smrtanalysis-2.1.1.128549/analysis/bin
sudo tar xjf wgs-8.1-Linux-amd64.tar.bz2 #creates wgs-8.1 directory and fills it
mv wgs-7.0/ wgs-7.0.0 #or wgs.old
sudo ln -s wgs wgs-7.0 #points files that depend on wgs7.0 in the path to a generic "wgs" which will be the current version. hopefully pacbio will eventually switch all files to point to a generic/current wgs directory
sudo ln -s wgs-8.1/ wgs #points the generic to the most up-to-date
sudo chown -h smrtanalysis:smrtanalysis wgs* #changes links' ownership
sudo chown -R smrtanalysis:smrtanalysis wgs-8.1/
sudo chmod -R ug=rX,o-rwx wgs-8.1/
(remove the tar.bz2 if desired)
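
The link arrangement above (versioned directories, a generic "wgs" link, and a legacy "wgs-7.0" link that points at the generic one) can be sketched in Python. This is only an illustration under assumptions: the helper name repoint and the scratch-directory layout are hypothetical, and the demo runs in a temporary directory rather than the real install tree.

```python
import os
import tempfile


def repoint(link_path, target):
    """Point the symlink at link_path to target, replacing any existing link."""
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    # Renaming the temporary link over the old one swaps it in a single step.
    os.replace(tmp_link, link_path)


def demo():
    """Recreate the wgs link layout in a scratch directory and return its path."""
    base = tempfile.mkdtemp(prefix="wgs-demo-")
    os.mkdir(os.path.join(base, "wgs-7.0.0"))  # the renamed old version
    os.mkdir(os.path.join(base, "wgs-8.1"))    # the freshly unpacked version
    repoint(os.path.join(base, "wgs"), "wgs-8.1")  # generic -> current
    repoint(os.path.join(base, "wgs-7.0"), "wgs")  # legacy name -> generic
    return base
```

Rolling back is then a single call that repoints "wgs" at "wgs-7.0.0"; nothing that refers to "wgs-7.0" has to change.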
More installation notes:
Documentation on how to auto-launch tomcatd and kodosd at system startup would be appreciated

Creating a desktop shortcut to SMRTView sometimes produces one with a pretty icon and sometimes a .jnlp file, depending on the method of instantiation. It is unclear what happens if this link ALREADY exists in the target location. I had to override the Java version used by this application to use Oracle 7 rather than my system default. As I don't always have an internet connection enabled when launching SMRTView, I needed to open the Oracle Java 7 Control Panel, go to Advanced, and change "Check for Certificates Using" to CRLs. You will probably need an internet connection the first time you use SMRTView. I am unclear how to get the individual .jnlp files for a dataset to work when created in SMRTPortal (I believe it tried launching with the wrong Java version).

More to follow

Usage review/notes/annoyances and bugs
SMRTView tooltips disappear after a couple of seconds and information is not selectable from these tooltips, nor duplicated elsewhere in the interface. Prologue information displayed for all or most BridgeMapper reads in tooltips appears to be the same as the Epilogue information (split read view for 2-piece subreads). I would prefer subreads to be "expanded" such that a single line in the SMRTView interface is for a single BridgeMapper read (and multiple subread reads be grouped and "boxed"). Currently, the interface is compressed - which means that if the view is zoomed out, multiple reads may be split across a single line, may be overlapping, and overall give a confusing view.

BridgeMapper documentation appears to be, as yet, incomplete. It is definitely a great feature - in fact the primary reason I decided to install SMRT analysis tools (rather than just portions of SMRT pipe) and the latest SMRTView. For example, the split_reads.bridgemapper file is not explained in online docs, nor is much of the information available in SMRTView. Thanks for the link rhall - still not QUITE resolving my questions, but a lot better than the documentation that I previously located.

Bridgemapper questions:
  1. Are the Score columns the BLASR score of mapping the entire subread to the indicated location (contiguously with pro, main, epi) or just with the indicated portion?
  2. What does percent similarity refer to (e.g. for the whole-read mapped to one location, or just that one portion internally; essentially the same question as above)
  3. Is there a way to display how poorly it would map if it were left as one contiguous chunk if that's not the default above? - my genome has a lot of transposable elements, as I mentioned, so slightly better mappings elsewhere may be insignificant.
  4. Does the file have all reads listed? there are some that appear to have neither prologue nor epilogue, so I've filtered those out in loffice
  5. P.S. where can I find a BLASR score equation/description? I have many "negative" scores - is it like golf or freezers where low=good?

Last edited by pag; 12-20-2013 at 06:29 PM.
Old 12-20-2013, 10:56 AM   #2
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 322

Bridgemapper spec: https://github.com/PacificBioscience...i/Bridgemapper
I like the idea of a diary, hopefully people can help out with the issues you will undoubtedly face.

Last edited by rhall; 12-20-2013 at 10:58 AM.
Old 12-20-2013, 11:25 AM   #3
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

Thanks rhall (richard?),

It was suggested to me during a previous stage of my assembly that I create a diary here on SeqAnswers (probably from the MIRA mailing list?). I didn't do so immediately, but always seem to have 195 issues installing and beginning to use various packages. Unfortunately I'm starting to log things after having already waded through installing SMRTportal, so I might be leaving out some steps I needed. Like you said, maybe this will help someone else, and it's quite likely to help me

Last edited by pag; 12-20-2013 at 06:35 PM.
Old 12-20-2013, 06:37 PM   #4
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

installation of wgs-8.1 added; pretty straightforward to use the precompiled package. building from source is a bit more painful (if I recall).
Old 12-21-2013, 04:25 PM   #5
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

unfortunately using a custom spec for hybrid assembly (RS_HGAP_ASSEMBLY.2) is not straightforward. I uploaded a spec file that had definitions of all EXCEPT my PB data, which would be generated earlier in the pipeline. I had assumed that the SMRTanalysis program would append the line for the .frg file (etc) to the tail end of my spec file. I did not see a template .spec file provided in the SMRT pipeline, so did not know what file(s) to specify in my spec.

Do I want common/jobs/xxx/xxxyyy/data/corrected.frg? Are there any other files that are generated that I should be aware of?
Old 12-22-2013, 01:43 PM   #6
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

[INFO] 2013-12-21 18:49:45,188 [smrtpipe.status refreshTargets 419] Starting task://016443/P_CeleraAssembler/runCaHgap
[ERROR] 2013-12-22 13:03:29,981 [smrtpipe.status refreshTargets 413] *** Failed task task://016443/P_CeleraAssembler/runCaHgap
[INFO] 2013-12-22 13:03:30,059 [smrtpipe.status execute 627] task runCaHgap FAILED
[ERROR] 2013-12-22 13:03:31,851 [pbpy.smrtpipe.SmrtPipeMain run 581] SmrtExit task://016443/P_CeleraAssembler/runCaHgap Failed
Traceback (most recent call last):
File "/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 544, in run
self._runTasks(pModules)
File "/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 276, in _runTasks
workflow.execute()
File "/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 635, in execute
raise SmrtExit(str(e))
SmrtExit: SmrtExit task://016443/P_CeleraAssembler/runCaHgap Failed
[ERROR] 2013-12-22 13:03:31,874 [pbpy.smrtpipe.SmrtPipeMain exit 667] SmrtExit task://016443/P_CeleraAssembler/runCaHgap Failed
[INFO] 2013-12-22 13:03:31,875 [pbpy.smrtpipe.SmrtPipeMain exit 669] Beginning exit process
[INFO] 2013-12-22 13:03:31,892 [pbpy.smrtpipe.SmrtPipeMain shutdown 690] beginning Shutdown Process
[INFO] 2013-12-22 13:03:31,893 [pbpy.smrtpipe.SmrtPipeMain _generate_final_report 620] Writing final TOC page with error message : SmrtExit task://016443/P_CeleraAssembler/runCaHgap Failed
[INFO] 2013-12-22 13:03:35,535 [pbpy.smrtpipe.SmrtPipeMain shutdown 725] Completed shutdown gracefully
[CRITICAL] 2013-12-22 13:03:37,710 [smrtpipe.status exit 679] hard exiting smrtpipe v1.79.127046 with returncode -1 in 69009.17 sec (1150.15 min).
-----------------
(tail of) runCaHgap.log

----------------------------------------END CONCURRENT Sun Dec 22 13:03:28 2013 (10619 seconds)
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/common/jobs/016/016443/data/5-consensus/celera-assembler_003 failed -- no .success.
================================================================================

runCA failed.

----------------------------------------
Stack trace:

at /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/runCA line 1501
main::caFailure('1 unitig consensus jobs failed; remove /opt/smrtanalysis/inst...', undef) called at /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/runCA line 4859
main::postUnitiggerConsensus() called at /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/runCA line 6315

----------------------------------------
Failure message:

1 unitig consensus jobs failed; remove /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/common/jobs/016/016443/data/5-consensus/consensus.sh to try again
---
runCA-log most recent log file
CA version CA 8.1 ($Id: utgcnsfix.C 4442 2013-10-04 14:33:50Z brianwalenz $).

Error Rates:
AS_OVL_ERROR_RATE 0.250000
AS_CNS_ERROR_RATE 0.250000
AS_CGW_ERROR_RATE 0.250000
AS_MAX_ERROR_RATE 0.400000

Current Working Directory:
/tmp/smrtanalysis/tmpD06np5

Command:
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcnsfix \
-g /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/common/jobs/016/016443/data/celera-assembler.gkpStore \
-t /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/common/jobs/016/016443/data/celera-assembler.tigStore 2 004 \
-o /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/common/jobs/016/016443/data/5-consensus/celera-assembler_004.fixes

(tail of) celera-assembler_003.cns.err
Working on unitig 15047 (0 unitigs and 1293 fragments)
unitig 15047 detected 1283 contains (55.91x, 94.36%) 10 dovetail (3.34x, 5.64%)
Working on unitig 15048 (0 unitigs and 443 fragments)
unitig 15048 detected 441 contains (38.85x, 96.12%) 2 dovetail (1.57x, 3.88%)
utgcns: MultiAlignUnitig.C:517: int unitigConsensus::computePositionFromParent(bool): Assertion `cnspos[tiid].bgn < cnspos[tiid].end' failed.

Failed with 'Aborted'

Backtrace (mangled):

/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns(_Z17AS_UTL_catchCrashiP7siginfoPv+0x27)[0x41e057]
/lib/x86_64-linux-gnu/libpthread.so.0(+0xfcb0)[0x7fb69b8e3cb0]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x35)[0x7fb69b54a425]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x17b)[0x7fb69b54db8b]
/lib/x86_64-linux-gnu/libc.so.6(+0x2f0ee)[0x7fb69b5430ee]
/lib/x86_64-linux-gnu/libc.so.6(+0x2f192)[0x7fb69b543192]
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns[0x43b5b6]
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns(_Z16MultiAlignUnitigP11MultiAlignTP7gkStoreP11CNS_OptionsPi+0xf8)[0x43e368]
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns(main+0x9c2)[0x40c942]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xed)[0x7fb69b53576d]
/opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns[0x40a6b9]

Backtrace (demangled):

[0] /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns::AS_UTL_catchCrash(int, siginfo*, void*) + 0x27 [0x41e057]
[1] /lib/x86_64-linux-gnu/libpthread.so.0::(null) + 0xfcb0 [0x7fb69b8e3cb0]
[2] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x35 [0x7fb69b54a425]
[3] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x17b [0x7fb69b54db8b]
[4] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f0ee [0x7fb69b5430ee]
[5] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0x2f192 [0x7fb69b543192]
[6] /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns() [0x43b5b6]
[7] /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns::MultiAlignUnitig(MultiAlignT*, gkStore*, CNS_Options*, int*) + 0xf8 [0x43e368]
[8] /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns::(null) + 0x9c2 [0x40c942]
[9] /lib/x86_64-linux-gnu/libc.so.6::(null) + 0xed [0x7fb69b53576d]
[10] /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/analysis/bin/wgs-8.1/Linux-amd64/bin/utgcns() [0x40a6b9]

GDB:


Aborted (core dumped)
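
For anyone triaging a similar failure, the smrtpipe log lines above follow a regular "[LEVEL] timestamp [source] message" layout, so the failing task can be pulled out mechanically. The following is a minimal sketch that assumes only the layout visible in this excerpt; the function name failed_tasks is hypothetical and is not part of SMRT Pipe.

```python
import re

# Layout assumed from the excerpt: [LEVEL] YYYY-MM-DD HH:MM:SS,mmm [source] message
LOG_LINE = re.compile(
    r"^\[(?P<level>[A-Z]+)\]\s+"
    r"(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"\[(?P<source>[^\]]+)\]\s+"
    r"(?P<message>.*)$"
)
TASK_URI = re.compile(r"task://\S+")


def failed_tasks(lines):
    """Return task URIs mentioned on ERROR or CRITICAL lines, in first-seen order."""
    seen = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match or match.group("level") not in ("ERROR", "CRITICAL"):
            continue  # skip INFO/DEBUG lines, traceback lines, and separators
        for uri in TASK_URI.findall(match.group("message")):
            if uri not in seen:
                seen.append(uri)
    return seen
```

The task name it reports (here runCaHgap) is the one whose own log, such as runCaHgap.log, is worth reading next.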

Last edited by pag; 12-23-2013 at 09:28 AM.
Old 12-23-2013, 12:13 AM   #7
mjhsieh
Member
 
Location: USA

Join Date: Jan 2013
Posts: 10

For troubleshooting purposes, try looking in common/jobs/016/016443/log/P_CeleraAssembler/ for more log files.
Old 12-23-2013, 08:41 AM   #8
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

I was able to run it with the default settings after clearing the custom spec file field.
Then I looked into the 016444 directory (the most recent run), pulled out the spec file, changed the line that read sgeName = 016444 to sgeName = 016445 (as that would be the next job run), appended a line to also use 454full.frg (my 454 reads, from fastqToCA), and resubmitted the job with the custom spec file.
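
The two manual edits described above (bumping sgeName to the next job number and appending the 454 fragment file) can be sketched as a small text transformation. This is a sketch under assumptions: the function name prepare_spec is hypothetical, and it assumes the spec accepts an extra fragment file as a bare path on its own line, which matches what was done here but should be checked against the Celera Assembler documentation.

```python
import re


def prepare_spec(spec_text, next_job_id, extra_frg):
    """Return spec_text with sgeName set to next_job_id and extra_frg appended."""
    updated, count = re.subn(
        r"^([ \t]*sgeName[ \t]*=[ \t]*)\S+[ \t]*$",
        # A function replacement avoids digits being read as group references.
        lambda match: match.group(1) + next_job_id,
        spec_text,
        flags=re.MULTILINE,
    )
    if count == 0:
        raise ValueError("no sgeName line found in spec")
    if not updated.endswith("\n"):
        updated += "\n"
    # Assumption: an additional .frg input is listed as a bare path on its own line.
    return updated + extra_frg + "\n"
```

Keeping the job number as a string preserves the leading zero in values like 016445.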

This ran to completion without error/failure (so there was probably something wrong with my old CA spec file). However, there is no evidence of 454 data in SMRT View. I'll check logs to see if the 454full.frg file was not readable by the smrtanalysis user...
Old 12-23-2013, 09:04 AM   #9
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

it looks like it didn't keep the setting for the custom spec file itself. I had specified a path in the jobs_dropbox directory, but it seems that the SMRTPortal doesn't like that, and clears the form. Copying the spec file to /home/smrtanalysis allows me to use this path. The smrtanalysis user has access to see (in the shell) the frg file, as does my primary user.
Old 12-23-2013, 09:31 AM   #10
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

Quote:
Originally Posted by mjhsieh
For troubleshooting purposes, try looking in common/jobs/016/016443/log/P_CeleraAssembler/ for more log files.

Edited post above to contain more specific errors/info found in log files. However, using the SMRTportal/analysis default spec file ran without error. As indicated in other posts, I'm still experiencing other issues. Currently attempting to re-run.
Old 01-02-2014, 10:48 AM   #11
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

Based on assembly time, it seems that it used the 454 data, but SMRTView doesn't appear to display non-PB data. Is there a recommended viewer for the mapping of all data (PB+454 in this case)? My Sanger data was excluded from this assembly.

I'm currently attempting to use bridgemapper manually on the generated assembly, as RS_HGAP_ASSEMBLY.2 generated multiple contigs but did not automatically process them through BM. The import of the "reference" appears to be taking quite a while compared to the single-contig consensus that I used for my non-assembly mapping.

Last edited by pag; 01-02-2014 at 10:52 AM.
Old 01-02-2014, 12:24 PM   #12
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

currently hitting a problem with displaying the BridgeMapper job that I did Dec 18th. I don't know if this is an issue with the permissions of the smrtanalysis user, my standard user, or something else.

I cannot seem to display both the bridgemapper data and the actual base information of the reads. If I file-open data from server and select my old BM run (jobs/016/016437) and zoom in to the sequence data level (e.g. 50bp window) it displays just dots for the sequence data (.........) and doesn't highlight base quality info (or indel/SNPs). This is true whether I have "display bridgemapper only" selected or not in my preferences. QV annotations are on as well.

I cannot seem to browse my whole file system for files to load into SMRTView. If I create a symlink eg
ln -s /data/genome/quiver/ /opt/smrtanalysis/userdata/quiver_link
I can load the non-BM mapping info (cmp6.cmp.h5 plus reference.xml) and zooming in I can see the bases. If I try to overlay the BM track via server navigation (jobs/016/016437/data/split_reads.bridgemapper.gz) and toggle view bridgemapper reads only (either on or off) I can no longer get base info when I'm zoomed in. If I try to overlay the BM track via Add Tracks (rather than via server) I cannot select the gz'd split_reads file. I can select alignment_summary.gff, but attempting to add that track results in a "permission denied" error.

tomcat/kodos launched with
su -l -c '/opt/smrtanalysis/current/admin/bin/tomcatd start' smrtanalysis
su -l -c '/opt/smrtanalysis/current/admin/bin/kodosd start' smrtanalysis

running as
ps -U smrtanalysis -u smrtanalysis u
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
1001 16170 0.8 3.5 10713572 283644 pts/0 Sl 11:16 1:11 /opt/smrtanalysis/install/smrtanalysis-2.1.1.128549/redist/java/bin/java -Djava
1001 16223 0.0 0.0 17008 392 ? Ss 11:17 0:00 jsvc.exec -Xms256m -Xmx256m -user smrtanalysis -pidfile /tmp/kodosd.pid -home /
1001 16224 0.2 1.5 2202672 126740 ? Sl 11:17 0:17 jsvc.exec -Xms256m -Xmx256m -user smrtanalysis -pidfile /tmp/kodosd.pid -home /

(tomcat doesn't appear to be there?)

Being zoomed in completely and toggling "show bridgemapper reads only" (either on OR off) briefly displays sequence info, but this is rapidly overlaid by ...... for the reads. If I go to file->remove tracks it tells me that there are no tracks to remove.

From "top"
Cpu(s): 2.4%us, 1.0%sy, 0.1%ni, 96.4%id, 0.2%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 7970724k total, 5835592k used, 2135132k free, 350828k buffers
Swap: 43745276k total, 2256k used, 43743020k free, 1751964k cached

Last edited by pag; 01-02-2014 at 12:56 PM.
Old 01-02-2014, 02:30 PM   #13
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 322

Trying to catch up with the thread after being offline for the Holidays.
Last things first: adding the 454 data to the CA step will not include the 454 data in any of the other SMRT Pipe tasks. The 454 data cannot be used in resequencing or BridgeMapper via SMRT Analysis, and is therefore not viewable in SMRT View. You can map all the data, PacBio and 454, to the reference assembly using blasr with the -sam option to generate a SAM file that can be viewed in Tablet or IGV.
BridgeMapper is not a part of the HGAP pipeline, so it does have to be run separately after uploading the assembly as a reference. How many contigs does the assembly have? Reference upload can be slow for files with many contigs.
5-consensus errors are generally memory problems; this is the most memory-hungry step, and as such the errors are not always reproducible.
Bridgemapper questions:
The portions are aligned independently, the score and identity are over only the aligned fragment.
Quote:
Is there a way to display how poorly it would map if it were left as one contiguous chunk if that's not the default above? - my genome has a lot of transposable elements, as I mentioned, so slightly better mappings elsewhere may be insignificant.
Blasr does not force an alignment of the entire read; if BridgeMapper is not being run, only the main alignment would be generated / reported.
Quote:
Does the file have all reads listed? there are some that appear to have neither prologue nor epilogue, so I've filtered those out in loffice
Yes. In these cases either the entire read is accounted for by a single alignment, or the portions of the read that are not in the alignment (epilogue / prologue) do not map elsewhere in the reference.

Blasr reference: http://www.biomedcentral.com/1471-2105/13/238/abstract
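
The blasr suggestion above can be sketched as a small command builder. This is a hedged illustration: only the -sam option is named in this thread, while -out and -nproc are recalled from blasr's command-line help of that era and should be confirmed with blasr -h on your install. The function names are hypothetical, and the sketch builds the command without running it.

```python
import shlex


def blasr_sam_command(reads, reference, out_sam, nproc=4):
    """Build (but do not run) a blasr command line that writes SAM output."""
    return [
        "blasr", reads, reference,
        "-sam",                # from the post above: emit SAM for Tablet or IGV
        "-out", out_sam,       # assumed option name; check `blasr -h`
        "-nproc", str(nproc),  # assumed option name; check `blasr -h`
    ]


def as_shell(cmd):
    """Render the argument list as a copy-pasteable, safely quoted shell line."""
    return " ".join(shlex.quote(part) for part in cmd)
```

Running it once per read set (PacBio subreads, then 454 reads) against the same assembly FASTA would give one SAM file per technology to load side by side.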
Old 01-03-2014, 10:18 AM   #14
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

A few minor, minor issues: the linux_configure script in smrtview-2.1.0/bin does not work when run a second time. For example, @HTTPPORT has already been replaced with the value of $S_PORT from the first run, so it cannot be updated to a new value. Having a server_xml.default file that gets copied to server.xml at the start of the script would be one way of addressing this. There was also an issue with attempting to write to a file of the same name as a directory within log (@smrt_home/common/log/smrtview) on reinstallation, but this may have something to do with conflicting tomcat installations (see below).
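
The server_xml.default idea can be sketched as follows: always regenerate server.xml from an untouched template, so the placeholders are still there on the next run. This is a minimal sketch, not the actual linux_configure logic; render_config, the demo file contents, and the port numbers are hypothetical, and only the @HTTPPORT placeholder name comes from the script.

```python
import os
import tempfile


def render_config(template_path, output_path, values):
    """Write output_path from the pristine template, filling in @KEY placeholders."""
    with open(template_path) as handle:
        text = handle.read()
    # Longest keys first, so a short key never clobbers the start of a longer one.
    for key in sorted(values, key=len, reverse=True):
        text = text.replace("@" + key, str(values[key]))
    with open(output_path, "w") as handle:
        handle.write(text)
    return text


def demo():
    """Configure twice with different ports; the template itself is never modified."""
    workdir = tempfile.mkdtemp(prefix="configure-demo-")
    template = os.path.join(workdir, "server_xml.default")
    output = os.path.join(workdir, "server.xml")
    with open(template, "w") as handle:
        handle.write('<Connector port="@HTTPPORT"/>\n')
    first = render_config(template, output, {"HTTPPORT": 8080})
    second = render_config(template, output, {"HTTPPORT": 9090})
    return first, second
```

Because the substitution reads from the template every time, the second run picks up the new port instead of finding the placeholder already gone.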

There is also no text indicating what should be done if you disagree with the configuration presented when running linux_configure. It says to hit control-C, but not which line in which file should be edited (the script has no prompts for replacing values). It's a bash script, but without a .bash or .sh extension, so it's not immediately obvious that it's a script rather than a binary.

Also, you may have this in the documentation somewhere already, but you cannot (easily) run a tomcat instance for both smrtportal and the stand-alone smrtview - smrtanalysis start will overwrite/override the active tomcat instance in memory.
Old 01-03-2014, 10:36 AM   #15
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

Quote:
Originally Posted by rhall
Trying to catch up with the thread after being offline for the Holidays.
I understand completely. Thanks for getting back to me at all!

Quote:
Last things first: adding the 454 data to the CA step will not include the 454 data in any of the other SMRT Pipe tasks. The 454 data cannot be used in resequencing or BridgeMapper via SMRT Analysis, and is therefore not viewable in SMRT View. You can map all the data, PacBio and 454, to the reference assembly using blasr with the -sam option to generate a SAM file that can be viewed in Tablet or IGV.
Thanks for the BLASR tip. I'll probably look into that soon.

Quote:
BridgeMapper is not a part of the HGAP pipeline, so it does have to be run separately after uploading the assembly as a reference. How many contigs does the assembly have? Reference upload can be slow for files with many contigs.
It looks like it generated 47 contigs. The reference uploader job does not appear in the web-based jobs monitor - should I look for a particular process in top or ps? I did several restarts of kodosd and tomcatd when I was attempting to troubleshoot other things which likely resulted in a termination of the process.

Quote:
5-consensus errors are generally memory problems; this is the most memory-hungry step, and as such the errors are not always reproducible.
Bridgemapper questions:
The portions are aligned independently, the score and identity are over only the aligned fragment.
Thanks
Quote:
Blasr does not force an alignment of the entire read; if BridgeMapper is not being run, only the main alignment would be generated / reported.
So (to summarize/confirm this as well as the help file) the score only comes from the positive-contributing region in a non-BM read, and in BM, non-mapping regions from the first BLASR pass are submitted only if they are over a certain threshold in length.
Quote:
Yes. In these cases either the entire read is accounted for by a single alignment, or the portions of the read that are not in the alignment (epilogue / prologue) do not map elsewhere in the reference.
okay. I had not considered where the epilogue and prologue do not map elsewhere. This might indicate that my assembly (the contigs I'm mapping to) is incomplete or that these particular reads were low-quality?
thanks again for everything!
Old 01-03-2014, 10:55 AM   #16
rhall
Senior Member
 
Location: San Francisco

Join Date: Aug 2012
Posts: 322

Quote:
The reference uploader job does not appear in the web-based jobs monitor
It will not appear as a job in Portal, but is a job in the sense that it has a directory in the jobs folder. Finding the job folder for the upload (the jobs in portal will skip a number) and checking the logs should tell you why the upload failed.
When mapping back to an assembly it is always worth considering the unmapped data. Epilogue and prologue sequence that is of significant length and does not map could be the result of the HQ region of the trace not being very accurate, and will likely occur in overloaded data. If a ZMW initially has two polymerases loaded, the initial part of the trace can include cross talk from the two sequences. It is possible for one of the polymerases to stop sequencing, after which the trace will start to produce good sequence; if this point is not found accurately, a read can have 'junk' non-mapping sequence in the epilogue or prologue. While you should be aware of this, it is generally not a problem, and it is accounted for in the HGAP assembly process. The bigger issue when aligning back to an assembly is the % of non-mapping subreads, which can give an indication of whether the assembly accounts for all the underlying data. You should be looking for a mapping % of ~80-90%. If this value is much lower, I would take a look at the unmapped_subreads file and see if the unmapped data can be accounted for by the degenerate contigs, from CA, that are not included in the resequencing.
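
As a worked example of the mapping-percentage check, here is a minimal sketch. The function name mapping_summary and the example counts are hypothetical; the 80% cutoff is simply the lower end of the ~80-90% range mentioned above, not an official threshold.

```python
def mapping_summary(mapped_subreads, total_subreads, low=80.0):
    """Return (percent mapped, True if the percentage falls below `low`)."""
    if total_subreads <= 0:
        raise ValueError("total_subreads must be positive")
    percent = 100.0 * mapped_subreads / total_subreads
    return percent, percent < low
```

For instance, 850 mapped out of 1000 subreads gives 85% and no warning, while 600 out of 1000 gives 60% and would be a cue to inspect the unmapped subreads and the degenerate contigs.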
Old 01-03-2014, 11:08 AM   #17
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

looks like some of my problems were partially due to the overzealousness of Ubuntu's garbage cleaning.

[DEBUG] 2014-01-03 11:37:34,125 [pbpy.smrtpipe.SmrtPipeContext createTemporaryDir 593] Attempting to create temp directory with dir prefix '/opt/smrtanalysis/tmpdir'
[ERROR] 2014-01-03 11:37:34,137 [pbpy.smrtpipe.SmrtPipeMain run 581] [Errno 2] No such file or directory: '/opt/smrtanalysis/tmpdir/tmpDpLq3C'

/opt/smrtanalysis/tmpdir -> /tmp/smrtanalysis (broken link)

The smrtanalysis directory is wiped out upon reboot of my computer, as the entire contents of /tmp are cleaned out (I cannot recall if this is during system shutdown or startup). Perhaps the start script (smrtportal's kodosd + smrtview's smrtanalysis?) should recreate this directory? For now, I'll create an /opt/tmp which will be a temporary directory that will be more persistent
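
The "start script should recreate this directory" suggestion could look something like the sketch below. It is an illustration only: ensure_tmpdir is a hypothetical helper, not part of kodosd or any SMRT Analysis script, and the demo works in a scratch directory instead of touching /opt or /tmp.

```python
import os
import tempfile


def ensure_tmpdir(link_path):
    """If link_path is a symlink whose target is missing, recreate the target directory.

    Returns True when a directory had to be created, False otherwise.
    """
    # islink() is True for a dangling link; exists() follows the link and is False.
    if os.path.islink(link_path) and not os.path.exists(link_path):
        target = os.readlink(link_path)
        if not os.path.isabs(target):
            target = os.path.join(os.path.dirname(link_path), target)
        os.makedirs(target)
        return True
    return False


def demo():
    """Simulate the post-reboot state: a tmpdir link whose target was wiped."""
    base = tempfile.mkdtemp(prefix="tmpdir-demo-")
    target = os.path.join(base, "tmp", "smrtanalysis")
    link = os.path.join(base, "tmpdir")
    os.symlink(target, link)  # dangling: the target was never created
    repaired = ensure_tmpdir(link)
    return link, repaired
```

Called at service start, a check like this is a no-op when the directory is intact and repairs the broken link target when /tmp has been cleaned.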

Edit:
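as smrtanalysis
sudo mkdir -p /opt/tmp #needed first if /opt/tmp does not already exist
sudo chmod 777 /opt/tmp
sudo chmod o+t /opt/tmp
mkdir /opt/tmp/smrtanalysis
chmod o+w /opt/tmp/smrtanalysis/
sudo ln -sfn /opt/tmp/ /opt/smrtanalysis/tmpdir #-f replaces the old broken link; -n keeps ln from following it
sudo chown -h smrtanalysis:smrtanalysis /opt/smrtanalysis/tmpdir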
ls -la /
Quote:
...
drwxrwxrwt 22 www-data www-data 12288 Jan 3 12:17 tmp
...
ls -la /opt/
Quote:
drwxrwxrwt 2 root root 4096 Jan 3 12:09 tmp
ls -la /opt/tmp
Quote:
drwxrwxrwx 2 smrtanalysis smrtanalysis 4096 Jan 3 12:14 smrtanalysis
ls -la /opt/smrtanalysis
Quote:
...
lrwxrwxrwx 1 smrtanalysis smrtanalysis 9 Jan 3 12:24 tmpdir -> /opt/tmp/
...
my issues with not being able to see individual read data with bridgemapper data loaded are still persisting; however, reference uploader now works and I've started a BM job with apparent success

Last edited by pag; 01-05-2014 at 07:53 AM.
Old 01-06-2014, 04:13 PM   #18
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

5 screenshots of the bug - had to tar/gz them and do 2 per file in order to meet file size limits here.
Attached Files
File Type: gz tarOfBug1of3.tar.gz (322.7 KB, 1 views)
File Type: gz tarOfBug2of3.tar.gz (361.7 KB, 1 views)
File Type: gz tarOfBug3of3.tar.gz (145.5 KB, 1 views)
Old 01-29-2014, 05:44 PM   #19
pag
Member
 
Location: CA, USA

Join Date: May 2012
Posts: 72

Wishlist

SMRTView
  • Go to: keep zoom level (or add "jump to")
    I have a coordinate that I want to go to directly with a centered view. I can copy-paste my coord* into the details panel twice, as both start and finish (and click the arrow icon, not "close" or enter). This centers the view at the coord. However, this defaults to a view that is quite zoomed out. To get to maximum zoom-in, I have to click the "A" magnifier and then click the + magnifier a few times. It would be nice if there was a way to either keep the zoom level when jumping to a single coordinate or add two more zoom icons: maximum zoom and minimum/no zoom. If there was a "jump to" button that just had a single coord for input (in addition to the current "go to" option), that would be great.
  • View insertions: allow panning, or show more positions (with insertions) at once
    I'd like to see indels on BOTH sides of a homopolymer run often (to see if my scaffold/consensus sequence is correct) or sometimes I would like to see indels that are in close proximity to others. Initially a cmp.h5 file involved in a quiver-based (SMRT pipeline 2.0.1) comparison to my 454-generated sequences would place additional bases congruent (agreeing) with the homopolymer run on a single side or the other (may have had to do with read direction). Using a bridgemapper (SMRT portal 2.1.1) generated comparison, some congruent bases may even be "inserted" into the middle of a homopolymer run (without other inserted bases of a non-congruent type) or be present on both "ends" of the homopolymer.
  • Make GUI obey system file manager style
    Currently it uses MS Windows-positioned minimize/maximize/restore/close buttons (right side) rather than my system style (left side). It also does not populate the global (File, etc.) menu.
  • Use better-contrasting colors for insertions and substitutions.
    T against the insertion background or A against the substitution background can render the base unreadable depending on the quality score of the site in question and/or your monitor's contrast level. Use of a black or white text color for currently same-colored bases or the use of a novel color for the background (e.g. cyan, pink, purple) instead of green/blue/red/yellow may be warranted.

* Copying a cell in oocalc/MS Office also copies a carriage return. This means that control-V doesn't quite work visually within the box, although it seems to function properly. A simple middle click doesn't work, but that's the office suite's issue (neither highlighting a cell nor control-C copies to the X clipboard). I have a shortcut on my system to paste minus the CR using control+middle click. I believe this was achieved with the xbindkeys package together with xvkbd and xclip.
.xbindkeysrc file:
Quote:
"sh -c 'xclip -o -selection clipboard | xvkbd -xsendevent -file -'"
control + b:2
This is a non-critical issue related to SMRTView - it IS a problem in consed, where control-V doesn't work
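The quoted command does not show where the carriage return gets dropped, so the following is only a sketch of one way to do it explicitly, assuming the goal is to remove every CR and LF from the clipboard text before it is typed. `strip_eol` is a hypothetical helper name; the commented pipeline reuses the `xclip` and `xvkbd` options already shown above and needs a running X session.

```shell
#!/bin/sh
# strip_eol: hypothetical filter that deletes all carriage returns and
# newlines from standard input and writes the rest to standard output.
strip_eol() {
    tr -d '\r\n'
}

# Possible use in the binding's pipeline (not run here; needs xclip and xvkbd):
# xclip -o -selection clipboard | strip_eol | xvkbd -xsendevent -file -
```

Because the filter removes all line breaks, it only suits single-cell values such as one coordinate, not multi-line clipboard contents.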

Last edited by pag; 02-05-2014 at 11:25 AM.
Old 02-05-2014, 11:46 AM   #20
pag

I added more items to the feature-request list above.

Bugs (or other unexpected things)
SMRTView
  • Change of color for base quality annotation type
    Under Variants annotation, any change of color results in the box registering as gray (rather than the selected color). Clicking a second annotation type's color swatch does not let you select and modify that color (without going to another tab first). On the QV annotations tab, the qualities get scrambled, in that the 50 quality level appears to take its color from the 3 (or 0?) quality level. Before a different color is selected, high quality is on a "light" background and low quality on a "dark" background; after a color change on the Variants tab, HQ is "dark", moderate is "mid tone", and LQ is also "dark" for all variant types (not just the one specifically altered). Additionally, the LQ color is not changed from its previous value (so replacing the green with cyan still retains green for LQ). These issues seem to be fixed by saving the settings and then reloading the application (the new color selection is then used appropriately).
  • Impossible to unload a bridgemapper track
    Loading the Bridgemapper track makes it impossible to view the individual sequence reads, and the track cannot be unloaded to restore the non-Bridgemapper view. It is necessary to either exit the application or, at the very least, redo "load data from server".
  • Sometimes data does not load and display
    This often seems to occur on the first attempt to load something in SMRTView. I don't know whether I'm launching too soon after tomcatd and kodosd start, but the status bar suggests that all data is loaded, yet no sequences show in the details window and you cannot select a subrange in the "region" window. The close/minimize/restore buttons are also missing from the window (and I don't believe the menus work either). I typically have to close it from the system-wide window manager and relaunch.
  • A right-click on the consensus sequence in the details panel with lateral mouse movement causes the sequence to scroll unexpectedly
    This often accidentally shifts my sequence "window" when the lateral mouse movement was unintentional and slight. I had intended to open the context menu that lets me see insertions at the position.
  • At various times the "Go to location" dialogue doesn't pop up when the button is pushed
    I can't reproduce the circumstances reliably, but I think this occurs when the dialogue is already open and the application is minimized. This may be related to my "sloppy focus" mouse settings (see http://askubuntu.com/questions/64605...-follows-mouse; settings: "focus_mode" sloppy, "focus_new_windows" smart, "raise_on_click" checked, "auto_raise" not checked; note also "button_layout" value "close,minimize,maximize:" re: one of my feature suggestions)
Forum moderators: feel free to split the last two posts here to a new topic "PacBio SMRTView Bugs and Feature Suggestions" (I cannot find the option myself)

Last edited by pag; 02-14-2014 at 12:12 PM.

Tags
celera assembler, pacbio, smrtview
