  • Python error in cuffdiff script

    I ran this on the command line:

    cuffdiff -o diff_out4 -b ../genome/ce10.fa -p 2 -L larval,early -u merged_asm/merged.gtf ../tophat/em/SRR493359_60_61_thout/accepted_hits.bam ../tophat/em/SRR493363_64_65_thout/accepted_hits.bam

    Here SRR493359_60_61 is one group and SRR493363_64_65 is the other group. This worked for me.

    Now I want to run the same thing from a Python script, so I wrote the call like this:

    do.call([cfg.tool_cmd("cuffdiff"), "-p", str(cfg.project["analysis"]["threads"]), "-b", str(cfg.project["genome"]["fasta"]), "-u", cfg.project["samples"][0]["files"]["merging_gtf"], "-L", str(cfg.project["phenotype"][0]), str(cfg.project["phenotype"][1]), "-o", output_folder] + [cfg.project["samples"][0]["files"]["bam"] cfg.project["samples"][1]["files"]["bam"]], cfg.project["analysis"]["log_file"])

    Here
    str(cfg.project["phenotype"][0]) is larval
    str(cfg.project["phenotype"][1]) is early

    I get an error: invalid syntax. Can anyone please help with this? Thanks in advance.

  • #2
    It looks to me like you're perhaps missing a comma after cfg.project["samples"][0]["files"]["bam"], inside the second list. As a general debugging strategy, when you have a massive block of code like that, it's often easier to break it up into component parts so that each one can be debugged individually. So something like:

    Code:
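    # Options for cuffdiff, built the same way as in the original call
    base_options = [cfg.tool_cmd("cuffdiff"),
                    "-p", str(cfg.project["analysis"]["threads"]),
                    # ... the remaining options ("-b", "-u", "-L", "-o") go here
                    ]

    # One BAM path per group; each list element becomes its own argument
    files = [
        cfg.project["samples"][0]["files"]["bam"],
        cfg.project["samples"][1]["files"]["bam"],
    ]

    # The log file stays a separate second argument, as in the original call
    do.call(base_options + files, cfg.project["analysis"]["log_file"])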
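To make the "break it up" idea concrete, here is a minimal, self-contained sketch. It assumes a plain dictionary standing in for cfg.project (the keys follow the original post, the values are taken from the working command line), and the helper name build_cuffdiff_args is made up for illustration. The two labels are joined with a comma because the working command line passes them as the single token larval,early.

```python
# Stand-in for cfg.project (assumption): keys follow the post,
# values come from the command line that already worked.
project = {
    "analysis": {"threads": 2},
    "genome": {"fasta": "../genome/ce10.fa"},
    "phenotype": ["larval", "early"],
    "samples": [
        {"files": {
            "merging_gtf": "merged_asm/merged.gtf",
            "bam": "../tophat/em/SRR493359_60_61_thout/accepted_hits.bam",
        }},
        {"files": {
            "bam": "../tophat/em/SRR493363_64_65_thout/accepted_hits.bam",
        }},
    ],
}


def build_cuffdiff_args(project, output_folder):
    """Hypothetical helper: build the argument list piece by piece."""
    options = [
        "cuffdiff",
        "-o", output_folder,
        "-b", project["genome"]["fasta"],
        "-p", str(project["analysis"]["threads"]),
        # One token "larval,early", as on the working command line
        "-L", ",".join(project["phenotype"]),
        "-u",
    ]
    gtf = [project["samples"][0]["files"]["merging_gtf"]]
    # One list element per group; the comma in Python is the space in the shell
    bams = [sample["files"]["bam"] for sample in project["samples"]]
    return options + gtf + bams


args = build_cuffdiff_args(project, "diff_out4")
print(" ".join(args))
```

Each of the three pieces (options, gtf, bams) can be printed and checked on its own before the full list is handed to whatever runs the command.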


    • #3
      Originally posted by rflrob (post #2 above)
      Thanks for the reply. You may be right in one case, but I want to run this for groups.

      cfg.project["samples"][0]["files"]["bam"] is group1
      cfg.project["samples"][1]["files"]["bam"] is group2

      The comparison should be group1 vs group2, so the BAM files should be separated by a space. But if I write the call with a space between them, it gives an error. Please tell me how to run this with a space between the BAM files.



      • #4
        I'm not familiar with this "do" module you seem to be using, but I'm pretty sure that each element of the list you pass becomes one argument, so the commas between list elements end up as the spaces on the command line. Notice that the original cuffdiff call isn't "cuffdiff, -o, diff_out4, ...": the syntax in Python isn't the same as the command that gets generated. So you really do want that comma there.
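As a rough illustration of that point, the standard library's shlex.join (Python 3.8+) turns a list of arguments into the equivalent shell command line. This only shows the mapping from list elements to space-separated arguments; how the "do" module runs the command is not known here. The names group1.bam and group2.bam are placeholders, not real files from the thread.

```python
import shlex

# Each list element is one argument; the commas are Python syntax only.
args = [
    "cuffdiff", "-o", "diff_out4", "-p", "2", "-L", "larval,early",
    "merged_asm/merged.gtf",
    "group1.bam",  # placeholder for the first group's BAM file
    "group2.bam",  # placeholder for the second group's BAM file
]

command_line = shlex.join(args)
print(command_line)
```

The printed line has a space between group1.bam and group2.bam even though the Python list has a comma there. The standard subprocess.run function accepts the same kind of list.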

        More generally, Python has no rule for combining two expressions that sit next to each other with no comma or operator between them. Depending on the types (which the parser doesn't know in advance), you might want to concatenate them (if they're strings), multiply them (if they're numbers), or do something else entirely. Recall the Zen of Python: "In the face of ambiguity, refuse the temptation to guess." So rather than guess, Python raises a SyntaxError. (The one exception is two adjacent string literals, such as "a" "b", which are joined into a single string.)
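This can be checked without running cuffdiff at all. The sketch below uses the built-in compile function, which only parses the source, so the names a and b don't need to exist. The helper name is_valid is made up for illustration.

```python
def is_valid(source):
    """Return True if `source` parses as a Python expression."""
    try:
        compile(source, "<example>", "eval")
        return True
    except SyntaxError:
        return False


without_comma = '[a["bam"] b["bam"]]'  # same shape as the failing list
with_comma = '[a["bam"], b["bam"]]'    # comma added between the elements
adjacent_literals = '["x" "y"]'        # string literals are a special case

print(is_valid(without_comma))
print(is_valid(with_comma))
print(eval(adjacent_literals))
```

The first expression is rejected by the parser before any value is looked at, which is why the error in the original call is reported as invalid syntax and not as a problem with the BAM paths.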
