
  • SRMA problem with NUM_THREADS

    Hello,

    I ran SRMA on a file without any errors, but it took 74 hours, so I would like to use the NUM_THREADS option. When I tried it, I got this message:

    [Fri Mar 30 16:08:54 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 NUM_THREADS=8 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    ** Warning: option NUM_THREADS currently may not increase performance significantly. **
    ** Try running multiple processes with RANGE if the speed does not increase. **
    Allele coverage cutoffs:
    coverage: 1 minimum allele coverage: 0
    coverage: 2 minimum allele coverage: 0
    coverage: 3 minimum allele coverage: 0
    coverage: 4 minimum allele coverage: 1
    coverage: 5 minimum allele coverage: 1
    coverage: 6 minimum allele coverage: 1
    coverage: 7 minimum allele coverage: 2
    coverage: 8 minimum allele coverage: 2
    coverage: 9 minimum allele coverage: 3
    coverage: >9 minimum allele coverage: 3
    Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
    at srma.Graph.addSAMRecord(Graph.java:54)
    at srma.SRMA$GraphThread.run(SRMA.java:708)

    Since it worked well on the same file without the NUM_THREADS option, I wonder whether the file needs to be modified before running SRMA with NUM_THREADS, or whether there is an explanation for this message.

  • #2
    See the warning message for a better solution.



    • #3
      Thank you for answering so soon.
      I've seen the warning, but my problem is no longer about run time. I would like to understand why SRMA stops working when I add an option.

      I ran another test with other options and got an error again:

      [Mon Apr 02 16:37:57 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false NUM_THREADS=1 TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
      Allele coverage cutoffs:
      coverage: 1 minimum allele coverage: 0
      coverage: 2 minimum allele coverage: 0
      coverage: 3 minimum allele coverage: 0
      coverage: 4 minimum allele coverage: 1
      coverage: 5 minimum allele coverage: 1
      coverage: 6 minimum allele coverage: 1
      coverage: 7 minimum allele coverage: 2
      coverage: 8 minimum allele coverage: 2
      coverage: 9 minimum allele coverage: 3
      coverage: >9 minimum allele coverage: 3
      Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
      at srma.Graph.addSAMRecord(Graph.java:54)
      at srma.SRMA$GraphThread.run(SRMA.java:708)
      Please report bugs to [email protected]

      What bothers me is why it behaves so differently with the same input file. I would understand some differences in run time, but I don't see why it reports "SAMRecord contig does not match the current reference sequence contig" when the input file and the reference file stay the same.

      I will run it again with RANGE to see whether that solves the problem.
      Last edited by oliviajm; 04-02-2012, 11:26 PM.



      • #4
        Hello again,

        I noticed that I get an error every time I run several SRMA jobs on the same input file at the same time, each with different options and a different output file.
        Is it possible that the error comes from several SRMA processes working on the same input file at once? I assumed this would not be a problem, because SRMA only reads the input file and does not modify it, but now I'm wondering whether the error is related.
        Last edited by oliviajm; 04-10-2012, 11:48 PM.



        • #5
          What version are you using? Can you give me a small test case (just a few SAM records) that reproduces the error? Can you try just running it on one chromosome at a time?



          • #6
            I'm using srma-0.1.15.jar.

            You can download a file containing the chr1 lines of the SAM file I'm using here: http://dl.free.fr/vcBPNsKpn
            SRMA crashed at this point on my last try (Records processsed: 1152262 (last chr1:115260225-115260274)).

            I have tried running SRMA on one chromosome at a time, and it worked. But I find this approach more complicated: it needs more steps, and therefore more time.
            Thanks for spending time on this issue.



            • #7
              Why not write a Perl or shell/grep script to split your file by chromosome and run SRMA on each part?
              I don't know how easy it is to recombine the output, though.
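
A minimal Python sketch of the per-chromosome approach suggested above. It only builds the command lines and does not run anything. Assumptions that are not from this thread: the reference has a samtools `.fai` index (contig name in column 1, length in column 2), the RANGE value takes the form `contig:start-end`, and the per-contig output names are hypothetical. The `samtools merge` step is one possible way to recombine the outputs, not something confirmed here.

```python
import shlex


def read_contigs(fai_text):
    """Parse samtools .fai text into a list of (contig name, length) tuples."""
    contigs = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        name, length = line.split("\t")[:2]
        contigs.append((name, int(length)))
    return contigs


def srma_commands(contigs, jar, bam, reference, extra=()):
    """Build one SRMA command per contig and return (commands, output names).

    The RANGE=contig:start-end format is an assumption; check the SRMA
    documentation for the exact syntax your version expects.
    """
    commands, outputs = [], []
    for name, length in contigs:
        out = f"{name}.srma.bam"  # hypothetical output naming scheme
        outputs.append(out)
        commands.append([
            "java", "-jar", jar,
            f"INPUT={bam}",
            f"OUTPUT={out}",
            f"REFERENCE={reference}",
            f"RANGE={name}:1-{length}",
            *extra,
        ])
    return commands, outputs


def merge_command(merged, parts):
    """Build a samtools merge command that recombines the per-contig BAMs."""
    return ["samtools", "merge", merged, *parts]


if __name__ == "__main__":
    # Illustrative .fai content; real lengths come from `samtools faidx`.
    fai = "chr1\t1000\t6\t60\t61\nchr2\t800\t1030\t60\t61\n"
    cmds, outs = srma_commands(
        read_contigs(fai), "srma-0.1.15.jar", "input.bam",
        "hg19-ordre-valide.fa", extra=["OFFSET=100", "MIN_MAPQ=10"],
    )
    for cmd in cmds:
        print(shlex.join(cmd))
    print(shlex.join(merge_command("merged.srma.bam", outs)))
```

Each printed line can be submitted as a separate process, which is also what the SRMA warning about RANGE recommends over NUM_THREADS.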



              • #8
                Hi,

                I have run some other tests, with and without the option MINIMUM_ALLELE_PROBABILITY=1, and I'm not sure what it does.

                When the minimum allele probability is 1 instead of the default value of 0.1, does that mean more bases are considered, because it includes those whose probability is less than 1? Or should MINIMUM_ALLELE_PROBABILITY be seen as a threshold, so that when I increase it, fewer bases are considered?



                • #9
                  Basically, this is trying to determine: if the coverage is X, what is the minimum number of times you have to see the variant allele? See the AlleleCoverageCutoffs class for the exact computation. Also check out the "minimum edge probability" in the paper: http://dx.doi.org/10.1186/gb-2010-11-10-r99.
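
To make the cutoff idea concrete, here is a small Python sketch. It is not taken from the SRMA source. It assumes the cutoff for coverage n is the smallest k such that P(X <= k) >= MINIMUM_ALLELE_PROBABILITY for X ~ Binomial(n, 0.5), capped at MINIMUM_ALLELE_COVERAGE. With the values used in this thread (0.1 and 3), that assumption reproduces the "Allele coverage cutoffs" table printed in the logs above, but the AlleleCoverageCutoffs class remains the authority for the exact computation.

```python
from math import comb


def min_allele_coverage(coverage, min_prob=0.1, cap=3):
    """Return the assumed minimum allele coverage for a given total coverage.

    Assumption (not confirmed against the SRMA source): the cutoff is the
    smallest k with P(X <= k) >= min_prob, where X ~ Binomial(coverage, 0.5),
    and it never exceeds `cap` (the MINIMUM_ALLELE_COVERAGE option).
    """
    cumulative = 0.0
    for k in range(coverage + 1):
        # Probability of seeing the allele exactly k times out of `coverage`.
        cumulative += comb(coverage, k) * 0.5 ** coverage
        if cumulative >= min_prob:
            return min(k, cap)
    # Only reached if rounding keeps the sum just below min_prob.
    return min(coverage, cap)


if __name__ == "__main__":
    for n in range(1, 10):
        print(f"coverage: {n} minimum allele coverage: {min_allele_coverage(n)}")
```

Under this assumed model the option acts as a threshold: raising MINIMUM_ALLELE_PROBABILITY can only raise the cutoff (up to the cap), so an allele has to be seen more often before it is kept.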

