
  • #16
    The quality scores are phred-style. Insertion QVs give the likelihood that the given base is itself an insertion. Deletion QVs refer to the immediately preceding base, and the most likely deleted base is stored as the DeletionTag in the HDF file. In PacBio data, insertions are the most common error, so the QV in the FASTQ is dominated by the insertion QV.
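
As a rough illustration of what "phred-style" means, a quality value Q corresponds to an error probability of 10^(-Q/10). The sketch below is only an illustration: the function names are made up here, and the ASCII offset of 33 is the usual Sanger FASTQ convention rather than anything read from the PacBio HDF file.

```python
def phred_to_error_prob(q):
    """Convert a phred-style quality value to an error probability."""
    return 10 ** (-q / 10.0)


def fastq_char_to_phred(ch, offset=33):
    """Decode one FASTQ quality character (assumes Sanger / Phred+33 encoding)."""
    return ord(ch) - offset


if __name__ == "__main__":
    for q in (5, 10, 20, 30):
        print("QV %2d -> error probability %.4f" % (q, phred_to_error_prob(q)))
```

A lower QV therefore means a higher chance that the base call (or, for the insertion QV, the presence of the base itself) is wrong.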

    Comment


    • #17
      Replying to SillyPoint-

      We were interested in employing the pacBioToCA script, but the pacBioToCA pipeline expects PacBio RS sequences in FASTQ format (with Sanger (Phred+33) quality values). We were only given an assembled.fasta and a filtered_subreads.fasta.

      Comment


      • #18
        I am the developer of pacBioToCA, happy to see interest in the pipeline. If you have only fasta files, the pacBioToCA wiki page includes a section on inputting PacBio RS sequences: http://sourceforge.net/apps/mediawik...o_RS_Sequences

        We provide a java utility to convert the fasta data to fastq with uniform quality values (http://www.cbcb.umd.edu/~sergek/PacB...ToFastq.tar.gz). The instructions for using it are at the above link.
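
For readers who just want to see what "fastq with uniform quality values" looks like, here is a minimal Python sketch of the idea. It is not the Java utility linked above and makes no claim to match its options or output; the function names and the default quality value of 20 are assumptions chosen for illustration.

```python
def _write_record(out, name, seq, qual_char):
    """Write one FASTQ record with the same quality character for every base."""
    out.write("@%s\n%s\n+\n%s\n" % (name, seq, qual_char * len(seq)))


def fasta_to_fastq(fasta_handle, fastq_handle, qv=20, offset=33):
    """Convert FASTA to FASTQ, giving every base the same quality value.

    qv=20 is an arbitrary placeholder; offset=33 is the Sanger encoding.
    """
    qual_char = chr(qv + offset)
    name, chunks = None, []
    for line in fasta_handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                _write_record(fastq_handle, name, "".join(chunks), qual_char)
            name, chunks = line[1:], []
        else:
            chunks.append(line)
    if name is not None:
        _write_record(fastq_handle, name, "".join(chunks), qual_char)
```

Usage would be along the lines of opening filtered_subreads.fasta for reading and a new .fastq file for writing, then passing both handles to `fasta_to_fastq`.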

        Comment


        • #19
          Thanks for the information, sergek. I will try and take a look at this when I get to the lab in the morning.

          Comment


          • #20
            Thanks for all your replies. I want to understand the SMRT Pipe workflow for running an assembly. I understand that BLASR has to be run to align the long reads and CCS reads, and that a consensus is then made with make-consensus from AMOS. However, the input to AMOS make-consensus is a TIG file. How do we go about generating that from the BLASR output? Could somebody please help me with the pipeline?

            Comment


            • #21
              Installing SMRTanalysis

              Hi All,

              We are trying to install the SMRTanalysis software from PacBio. We are getting the error below:

              File "./smrtpipe.py", line 4, in <module>
              import pkg_resources
              File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 2707, in <module>
              working_set.require(__requires__)
              File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 686, in require
              needed = self.resolve(parse_requirements(requirements))
              File "/usr/local/lib64/python2.6/site-packages/distribute-0.6.24-py2.6.egg/pkg_resources.py", line 584, in resolve
              raise DistributionNotFound(req)
              pkg_resources.DistributionNotFound: pbpy==0.1

              It seems like something is wrong with Python. We updated to Python version 2.7 and included the correct path in our .bashrc files, but we still get the same error.

              Any suggestions from previous experience?

              Thanks

              Comment


              • #22
                I think SMRTanalysis comes bundled with its own python. The error above seems to be referring to a system python2.6 directory though, so are you using the system python?

                For clarity you should have started a new thread instead of posting under this one.
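
One quick way to answer the "are you using the system python?" question is to ask the interpreter where it lives. The sketch below is a generic check, not part of SMRTanalysis; the install root `/opt/smrtanalysis` is a hypothetical example, and the helper is written for a current Python 3 (`os.path.commonpath` does not exist in the Python 2.6/2.7 discussed here).

```python
import os
import sys


def is_inside(path, root):
    """Return True if path is located at or below the directory root."""
    path = os.path.realpath(path)
    root = os.path.realpath(root)
    return os.path.commonpath([path, root]) == root


if __name__ == "__main__":
    install_root = "/opt/smrtanalysis"  # hypothetical install location
    print("interpreter:", sys.executable)
    print("inside install root:", is_inside(sys.executable, install_root))
```

If the interpreter or the site-packages directory in the traceback is outside the install root, the bundled Python is probably not the one being run.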

                Comment


                • #23
                  Originally posted by GenoMax View Post
                  I think SMRTanalysis comes bundled with its own python. The error above seems to be referring to a system python2.6 directory though, so are you using the system python?

                  For clarity you should have started a new thread instead of posting under this one.
                  Hi,

                  Thanks for the quick reply. I have created a new thread here:
                  Single-molecule real-time observation of DNA polymerase using zero-mode waveguide (ZMW) optical confinement nanostructures


                  and updated the query with more details. Please reply.

                  Thanks

                  Comment


                  • #24
                    Hello

                    I don't understand the error message generated during a de novo assembly using PacBio data.
                    Running with the sample data (E. coli and lambda) is OK.

                    Then I ran the command 'smrtpipe.py --params=settings.xml xml:input.xml &>smrtpipe.err' and got the error log below:

                    [INFO] 2013-02-20 15:57:08,552 [pbpy.smrtpipe.SmrtDataService writeTo 424] Writing 6 items to DataStore in {'smrt.data.xmlparam': <pbpy.io.MetaAnalysisXml.InputDataUrl object at 0x4295a50>, 'smrt.output.log': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/log', 'smrt.data.cmdline': <pbpy.smrtpipe.InputData.CompositeInputData object at 0x42959d0>, 'smrt.output.root': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir', 'smrt.output.results': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/results', 'smrt.output.data': '/sto4data-2/zebu4/data/06022013_smrtpipe/teste1_gir/data'}
                    [INFO] 2013-02-20 15:57:08,555 [pbpy.smrtpipe.SmrtPipeMain _runTasks 267] Skipping PreWorkflow as it contains zero tasks
                    [INFO] 2013-02-20 15:57:08,558 [pbpy.smrtpipe.SmrtPipeMain _runTasks 270] Loading 10 tasks into Workflow
                    [INFO] 2013-02-20 15:57:09,275 [pbpy.smrtpipe.SmrtPipeMain _runTasks 279] Executing workflow Workflow
                    [INFO] 2013-02-20 15:57:09,649 [pbpy.smrtpipe.engine.SmrtPipeTasks run 622] Running task://Anonymous/P_Fetch/toFofn
                    [ERROR] 2013-02-20 15:57:14,702 [pbpy.smrtpipe.SmrtPipeMain run 648] time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y' Traceback (most recent call last):
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 608, in run self._runTasks(pModules)
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/SmrtPipeMain.py", line 281, in _runTasks workflow.execute()
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 607, in execute self._update(0)
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 574, in _update self._writeWorkflow()
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeWorkflow.py", line 554, in _writeWorkflow self._graph.toFile(path, format)
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 258, in toFile out.write(format2func[format](self))
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 255, in <lambda> 'RDF': lambda g: g.toRDF().serialize(),
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 208, in toRDF for s, p, o in node.toRDF():
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtDAG.py", line 81, in toRDF Literal(str(self.obj.computeTime))))
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeTasks.py", line 834, in computeTime self._extractComputeTime(regexp)
                    File "/opt/smrtanalysis-1.4.0/analysis/lib/python2.7/pbpy-0.1-py2.7.egg/pbpy/smrtpipe/engine/SmrtPipeTasks.py", line 821, in _extractComputeTime self._cachedExecTimes[regexp] = datetime.datetime.strptime(match.group(1), LOG_TIME_FORMAT)
                    File "/opt/smrtanalysis-1.4.0/redist/python2.7/lib/python2.7/_strptime.py", line 325, in _strptime (data_string, format))
                    ValueError: time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y'
                    [ERROR] 2013-02-20 15:57:14,704 [pbpy.smrtpipe.SmrtPipeMain exit 760] time data 'Qua Fev 20 15:57:09 CST 2013' does not match format '%a %b %d %H:%M:%S %Z %Y'


                    I need help =)

                    Comment


                    • #25
                      juassis,
                      Unfortunately it is a bug related to the system locale. For a fix, add the following two lines to $SEYMOUR_HOME/etc/setup.sh:
                      Code:
                      export LANG=en_US.UTF-8
                      export LANG=en_US.UTF-8
                      These set some environment variables used by Python. PacBio is aware of the bug and it should be fixed in the next release.

                      Comment


                      • #26
                        Sorry, second line is:
                        Code:
                        export LC_ALL=en_US.UTF-8
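
To see why the locale matters, note that the log timestamp in the traceback uses Portuguese abbreviations ('Qua Fev'), while `strptime` with `%a %b` expects day and month names in the locale that Python itself is using. The sketch below reproduces the mismatch in isolation. It is not the pbpy code: the format string drops the `%Z` token from the original (parsing timezone names depends on the local timezone), and the helper name is made up for illustration.

```python
from datetime import datetime

# Same layout as in the traceback, minus the timezone token (an assumption
# made here so the example does not depend on the machine's timezone).
FORMAT_NO_TZ = "%a %b %d %H:%M:%S %Y"


def parses(stamp, fmt=FORMAT_NO_TZ):
    """Return True if stamp can be parsed with fmt in the current locale."""
    try:
        datetime.strptime(stamp, fmt)
        return True
    except ValueError:
        return False


if __name__ == "__main__":
    # Under an English (or default C) time locale the first stamp parses
    # and the second raises the same kind of ValueError as in the log.
    print(parses("Wed Feb 20 15:57:09 2013"))
    print(parses("Qua Fev 20 15:57:09 2013"))
```

Forcing LANG and LC_ALL to en_US.UTF-8 makes the timestamps written to the log come out in English, so they match what the parser expects.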

                        Comment


                        • #27
                          Hello!
                          Thanks for the information!
                          Worked properly! =)

                          Just one more question,
                          It worked properly in the first analysis; however, when I ran new data from another breed, this error came up again. Will I have to apply the fix every time I want to run an analysis?

                          Comment


                          • #28
                            If the lines are added to the $SEYMOUR_HOME/etc/setup.sh file then SMRT pipe should function correctly after the setup.sh file is sourced.
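
If the error comes back in a new shell, it can help to confirm that the two variables are actually set in the environment that launches smrtpipe.py. A small, hypothetical check (the function name and the expected value constant are illustrative, not part of SMRT Pipe):

```python
import os

EXPECTED = "en_US.UTF-8"


def locale_env_problems(env=None):
    """List the locale variables that are missing or differ from EXPECTED."""
    env = os.environ if env is None else env
    problems = []
    for var in ("LANG", "LC_ALL"):
        value = env.get(var)
        if value != EXPECTED:
            problems.append("%s=%r (expected %r)" % (var, value, EXPECTED))
    return problems


if __name__ == "__main__":
    issues = locale_env_problems()
    print("locale environment looks fine" if not issues else "\n".join(issues))
```

An empty result after sourcing setup.sh suggests the fix is in effect for that shell; a non-empty one suggests setup.sh was not sourced or was overridden later.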

                            Comment


                            • #29
                              Hello,
                              thank you very much for your help and comments. It was possible to correct several samples.
                              I have run into some problems again. I was able to run the smrtpipe.py command without any errors. However, when I tried to run SMRT Pipe again, this error message appeared:

                              . /opt/smrtanalysis/etc/setup.sh
                              $ smrtpipe.py --params=gir_params.xml xml:gir_input.xml

                              Bus error (core dumped)

                              --
                              I did the memory test, and everything is ok.

                              ulimit -a
                              core file size (blocks, -c) 0
                              data seg size (kbytes, -d) unlimited
                              scheduling priority (-e) 0
                              file size (blocks, -f) unlimited
                              pending signals (-i) 4133745
                              max locked memory (kbytes, -l) 64
                              max memory size (kbytes, -m) unlimited
                              open files (-n) 1024
                              pipe size (512 bytes, -p) 8
                              POSIX message queues (bytes, -q) 819200
                              real-time priority (-r) 0
                              stack size (kbytes, -s) 10240
                              cpu time (seconds, -t) unlimited
                              max user processes (-u) 1024
                              virtual memory (kbytes, -v) unlimited
                              file locks (-x) unlimited


                              Filesystem Size Used Avail Use% Mounted on
                              /dev/sdg 12T 900G 11T 9% /


                              Many thanks for your help. =)

                              Comment


                              • #30
                                Do you get any output at all? Anything in the ./log/ directory?

                                What distribution are you running on? This could possibly be the result of a mismatch between the system it is running on and the build system (either Ubuntu 10.04 or CentOS 5.6).

                                Comment
