
  • starting smrtpipe with .cmp.h5

    Dear PacBio Gods,
    I'd like to know how (or whether it is possible) to start a smrtpipe job with .cmp.h5 instead of .bas.h5 files. Say, for instance, I successfully ran a realignment job with smrtpipe, which generates several files, including a .cmp.h5. I'd then like to run the RS_Modification_and_Motif_Analysis.1.xml protocol, but since the .cmp.h5 has already been generated, I thought it would be nice to start directly from it.

    Also, I know I could just add the proper motif detection protocols to my realignment XML file so that everything runs in one step, but for various reasons I want to keep them separate.
    Cheers,

  • #2
    Unfortunately cmp.h5 files cannot be used as input to SMRT Pipe. It is possible to run the Modification and Motif analysis manually:
    Code:
    ipdSummary.py -h
    Code:
    motifMaker.sh
    But the full SMRT Pipe generated report will be missing. If you want to run the complete workflow, I would suggest running one complete job starting from the bas.h5 through SMRT Pipe, then checking the workflow graph:
    <job_dir>/workflow/Workflow.summary.html to see exactly what is run. You can then find the actual command lines in:
    <job_dir>/workflow/P_ModificationDetection
    <job_dir>/workflow/P_MotifFinder
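To pull those command lines out of a finished job, something like the following Python sketch may help. It is only an illustration: the thread does not say what kinds of files sit under the workflow directories, so the code simply scans every readable file for the two tool names mentioned above. The function names `matching_lines` and `scan_directory` are hypothetical helpers, not part of SMRT Pipe.

```python
import os

# Tool names taken from the reply above; adjust as needed.
TOOLS = ("ipdSummary.py", "motifMaker.sh")


def matching_lines(lines, tools=TOOLS):
    """Return (line_number, text) for each line that mentions one of the tools."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if any(tool in line for tool in tools):
            hits.append((number, line.rstrip("\n")))
    return hits


def scan_directory(root, tools=TOOLS):
    """Walk root and yield (path, line_number, text) for every matching line.

    Assumption: the command lines are stored in plain-text files somewhere
    below root; the exact file layout is not stated in the thread.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="replace") as handle:
                    for number, text in matching_lines(handle, tools):
                        yield path, number, text
            except OSError:
                continue  # unreadable file: skip it
```

Calling `scan_directory` on the job's workflow directory and printing each hit should list candidate command lines, which can then be copied and adapted to start from an existing .cmp.h5.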

    • #3
      Thanks for the suggestion, rhall. That's what I ended up doing, and I'm now running each step separately based on the log files. Cheers,

      • #4
        Dear PacBio God Almighty,
        Please let me continue this thread. I'm trying to run MotifMaker; at the beginning it looks good:
        Code:
        (smrtanalysis-2.3.0) [resources]$ motifMaker.sh 
        Usage: MotifMaker [options] [command] [command options]
          Options:
            -h, --help   
                         Default: false
          Commands:
        ....
        but when I supply any inputs:

        Code:
        motifMaker.sh find -f Geobacter_metallireducens_gDNA.fasta -g modifications.gff.gz -o output.csv
        I get:

        Code:
        Exception in thread "main" java.lang.reflect.InvocationTargetException
        	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        	at java.lang.reflect.Method.invoke(Method.java:601)
        	at com.simontuffs.onejar.Boot.run(Boot.java:340)
        	at com.simontuffs.onejar.Boot.main(Boot.java:166)
        Caused by: java.util.NoSuchElementException: key not found: ref000001
        	at scala.collection.MapLike$class.default(MapLike.scala:225)
        	at scala.collection.immutable.Map$Map1.default(Map.scala:107)
        	at scala.collection.MapLike$class.apply(MapLike.scala:135)
        	at scala.collection.immutable.Map$Map1.apply(Map.scala:107)
        	at com.pacbio.basemods.Reader$.parseLine$1(MotifMixture.scala:263)
        	at com.pacbio.basemods.Reader$$anonfun$loadModificationsFromGff$1.apply(MotifMixture.scala:292)
        	at com.pacbio.basemods.Reader$$anonfun$loadModificationsFromGff$1.apply(MotifMixture.scala:292)
        	at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        	at scala.collection.Iterator$$anon$19.next(Iterator.scala:401)
        	at scala.collection.TraversableOnce$FlattenOps$$anon$1.hasNext(TraversableOnce.scala:391)
        	at scala.collection.Iterator$$anon$22.hasNext(Iterator.scala:457)
        	at scala.collection.Iterator$class.foreach(Iterator.scala:772)
        	at scala.collection.Iterator$$anon$22.foreach(Iterator.scala:451)
        	at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        	at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:102)
        	at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:250)
        	at scala.collection.Iterator$$anon$22.toBuffer(Iterator.scala:451)
        	at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:237)
        	at scala.collection.Iterator$$anon$22.toArray(Iterator.scala:451)
        	at com.pacbio.basemods.Program$.loadModificationsGff(Program.scala:602)
        	at com.pacbio.basemods.Program$.runMotifReport(Program.scala:796)
        	at com.pacbio.basemods.Program$.main(Program.scala:929)
        	at com.pacbio.basemods.Program.main(Program.scala)
        	... 6 more
        Any idea of what I'm doing wrong would be appreciated. Thanks.

        P.S. I'll sacrifice my bacterial strains in your honor.

        • #5
          Looks like a mismatch between the sequence names used in the gff file and the header names in the fasta file. Is the fasta file from the imported reference that was used to generate the gff?
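A quick way to test for this kind of mismatch is to compare the sequence IDs in column 1 of the gff against the header lines of the fasta. The Python sketch below does that, assuming the lookup is an exact string match on the whole header (which the "key not found: ref000001" error suggests, but which is not confirmed here). The helper names `open_text`, `fasta_ids`, `gff_seqids` and `missing_seqids` are hypothetical.

```python
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def fasta_ids(lines):
    """Return the set of full header strings (without '>') from FASTA lines."""
    return {line[1:].strip() for line in lines if line.startswith(">")}


def gff_seqids(lines):
    """Return the set of column-1 values from non-comment, non-blank GFF lines."""
    ids = set()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        ids.add(line.split("\t", 1)[0].strip())
    return ids


def missing_seqids(fasta_lines, gff_lines):
    """Return GFF sequence IDs that have no exactly matching FASTA header.

    Assumption: headers are compared as whole strings, so a header such as
    'ref000001|name' does not match the GFF ID 'ref000001'.
    """
    return sorted(gff_seqids(gff_lines) - fasta_ids(fasta_lines))
```

For real files, pass the handles returned by `open_text` for the fasta and the gff (gzip-compressed or not) to `missing_seqids`. An empty list means every gff sequence ID has a matching fasta header under the exact-match assumption.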

          • #6
            It doesn't look like that. I picked the files from the MotifMaker-master repository on GitHub to use as a test:
            Code:
            motifMaker.sh -f src/test/resources/Geobacter_metallireducens_gDNA.fasta -g src/test/resources/modifications.gff.gz -o src/test/resources/output.csv
            The first lines of each file:
            Geobacter_metallireducens_gDNA.fasta
            Code:
            >ref000001|G.metallireducens_gDNA
            modifications.gff.gz
            Code:
            ##gff-version 3
            ##source kineticModificationDetector 0.1
            ##source-commandline /mnt/secondary/Smrtanalysis/opt/smrtanalysis/analysis/bin/ipdSummary.py --summary_h5 /mnt/secondary/Smrtanalysis/opt/smrtanalysis/common/jobs/040/040017/data/temp_kinetics.h5 --gff /scratch/tmp6fBZys.gff --csv /scratch/tmpzLX0jG.csv --reference /mnt/secondary/Smrtanalysis/opt/smrtanalysis/common/references/Geobacter_metallireducens_gDNA /mnt/secondary/Smrtanalysis/opt/smrtanalysis/common/jobs/040/040017/data/aligned_reads.cmp.h5
            ##sequence-header ref000001 G.metallireducens_gDNA
            ref000001	kinModCall	modified_base	42	42	36	-	.	IPDRatio=1.89;context=CTTGACAGACAATGGTTGCTGTGATTAAAGATACTCTCTTT;coverage=106
            I removed
            Code:
            |G.metallireducens_gDNA
            from the fasta header, but I still get an exception, though a slightly different one this time:
            Code:
            (smrtanalysis-2.3.0) [frodriguez@minnie MotifMaker-master]$ motifMaker.sh -f src/test/resources/Geobacter_metallireducens_gDNA.fasta -g src/test/resources/modifications.gff.gz -o src/test/resources/output.csv
            Exception in thread "main" java.lang.reflect.InvocationTargetException
            	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            	at java.lang.reflect.Method.invoke(Method.java:601)
            	at com.simontuffs.onejar.Boot.run(Boot.java:340)
            	at com.simontuffs.onejar.Boot.main(Boot.java:166)
            Caused by: com.beust.jcommander.ParameterException: Unknown option: -f
            	at com.beust.jcommander.JCommander.parseValues(JCommander.java:723)
            	at com.beust.jcommander.JCommander.parse(JCommander.java:275)
            	at com.beust.jcommander.JCommander.parse(JCommander.java:258)
            	at com.pacbio.basemods.Program$.main(Program.scala:915)
            	at com.pacbio.basemods.Program.main(Program.scala)
            	... 6 more

            • #7
              Looks like an error related to the passing of variables to the .sh rather than running the jar directly. Try:
              Code:
              motifMaker.sh find --gff modifications.gff.gz --fasta reference.fasta --output motif_summary.csv

              • #8
                Yep, it works! Thanks

                • #9
                  Sorry, but one more question regarding MotifMaker analysis. It was created to identify motifs associated with DNA modification in prokaryotic genomes, but it's not clear to me whether the algorithm uses any (prokaryotic) motif database to find them.
                  Let me rephrase the question: would it be appropriate to use MotifMaker with eukaryotic genomes?
                  Thanks
