  • SRMA problem with NUM_THREADS

    Hello,

    I ran SRMA on a file without any error, but it took 74 hours, so I would like to use the NUM_THREADS option. When I tried, I got this message:

    [Fri Mar 30 16:08:54 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 NUM_THREADS=8 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    ** Warning: option NUM_THREADS currently may not increase performance significantly. **
    ** Try running multiple processes with RANGE if the speed does not increase. **
    Allele coverage cutoffs:
    coverage: 1 minimum allele coverage: 0
    coverage: 2 minimum allele coverage: 0
    coverage: 3 minimum allele coverage: 0
    coverage: 4 minimum allele coverage: 1
    coverage: 5 minimum allele coverage: 1
    coverage: 6 minimum allele coverage: 1
    coverage: 7 minimum allele coverage: 2
    coverage: 8 minimum allele coverage: 2
    coverage: 9 minimum allele coverage: 3
    coverage: >9 minimum allele coverage: 3
    Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
    at srma.Graph.addSAMRecord(Graph.java:54)
    at srma.SRMA$GraphThread.run(SRMA.java:708)

    Since it worked without the NUM_THREADS option on the same file, I wonder whether the file needs to be modified before running SRMA with NUM_THREADS, or whether there is another explanation for this message.

  • #2
    See the warning message for a better solution.

    • #3
      Thank you for answering so quickly.
      I've seen the warning, but my problem is no longer about run time. I would like to understand why it stops working when I add an option.

      I ran another test with other options and got an error again:

      [Mon Apr 02 16:37:57 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false NUM_THREADS=1 TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
      Allele coverage cutoffs:
      coverage: 1 minimum allele coverage: 0
      coverage: 2 minimum allele coverage: 0
      coverage: 3 minimum allele coverage: 0
      coverage: 4 minimum allele coverage: 1
      coverage: 5 minimum allele coverage: 1
      coverage: 6 minimum allele coverage: 1
      coverage: 7 minimum allele coverage: 2
      coverage: 8 minimum allele coverage: 2
      coverage: 9 minimum allele coverage: 3
      coverage: >9 minimum allele coverage: 3
      Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
      at srma.Graph.addSAMRecord(Graph.java:54)
      at srma.SRMA$GraphThread.run(SRMA.java:708)
      Please report bugs to [email protected]

      What bothers me is why it behaves so differently with the same input file. I would understand some differences in run time, but I don't understand why it reports "SAMRecord contig does not match the current reference sequence contig" when the input file and the reference file stay the same.

      I will run it again with RANGE to see if that solves the problem.
      Last edited by oliviajm; 04-02-2012, 11:26 PM.


      • #4
        Hello again,

        I noticed that I get an error every time I run several SRMA processes at the same time on the same input file, with different options and different output files.
        Is it possible that the error comes from several SRMA processes working on the same input file at once? I thought this should not be a problem, because SRMA only reads the input file and does not modify it, but now I wonder whether the error is related.
        Last edited by oliviajm; 04-10-2012, 11:48 PM.


        • #5
          What version are you using? Can you give me a small test case (just a few SAM records) that reproduces the error? Can you try just running it on one chromosome at a time?
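A small test case like the one requested here could be cut out of a text SAM file with a short filter. The following is only a sketch: it assumes an uncompressed SAM input (not BAM), and the function name and the window bounds are hypothetical choices, with the window placed around the last position printed in the log (chr1:115260225).

```python
def sam_window(lines, contig, start, end):
    """Yield header lines plus alignments on `contig` whose POS is in [start, end].

    Assumes tab-separated SAM text: column 3 is RNAME, column 4 is the
    1-based leftmost position (POS).
    """
    for line in lines:
        if line.startswith("@"):
            # Header lines are kept so the subset is still a valid SAM file.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            # Not a complete alignment record; skip it.
            continue
        if fields[2] == contig and start <= int(fields[3]) <= end:
            yield line


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python sam_window.py < input.sam > small_test.sam
    for record in sam_window(sys.stdin, "chr1", 115260000, 115260500):
        sys.stdout.write(record)
```

Only reads that start inside the window are kept, so a read starting just before it is dropped even if it overlaps the region.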


          • #6
            I'm using srma-0.1.15.jar.

            You can download a file containing the chr1 lines of the SAM file I'm using here: http://dl.free.fr/vcBPNsKpn
            On my last try, SRMA crashed at this point (Records processsed: 1152262 (last chr1:115260225-115260274)).

            I have tried running SRMA on one chromosome at a time, and it worked. But I find this approach more complicated; it needs more steps and therefore more time.
            Thanks for spending time on this issue.


            • #7
              Why not write a Perl or shell/grep script to split your file by chromosome and run SRMA on each one?
              I don't know how easy it is to recombine the output, though.
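As an alternative to physically splitting the file, the warning in the log suggests running several processes with RANGE. A minimal sketch of generating one command per contig from a FASTA index (.fai) is below. Several things here are assumptions and should be checked against the SRMA documentation: the `RANGE=contig:start-end` value format, the `java -jar` invocation, and the per-contig output naming, which is purely hypothetical. The sketch only builds the commands; it does not run them or recombine the outputs.

```python
def read_fai(fai_text):
    """Parse FASTA index text into (contig, length) pairs.

    In a .fai file the first two tab-separated columns are the sequence
    name and its length.
    """
    contigs = []
    for line in fai_text.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        contigs.append((fields[0], int(fields[1])))
    return contigs


def srma_commands(contigs, bam, reference, jar="srma-0.1.15.jar"):
    """Build one SRMA argument list per contig (nothing is executed)."""
    commands = []
    for name, length in contigs:
        commands.append([
            "java", "-jar", jar,
            f"INPUT={bam}",
            f"OUTPUT={name}.srma.bam",    # hypothetical output naming
            f"REFERENCE={reference}",
            f"RANGE={name}:1-{length}",   # ASSUMED value format for RANGE
        ])
    return commands


if __name__ == "__main__":
    # Illustrative index lines; a real run would read the reference's .fai file.
    example_fai = "chr1\t249250621\t6\t50\t51\nchr2\t243199373\t254235646\t50\t51\n"
    for cmd in srma_commands(read_fai(example_fai), "input.bam", "hg19-ordre-valide.fa"):
        print(" ".join(cmd))
```

Each printed command could then be started as its own process, which is what the warning recommends instead of NUM_THREADS.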


              • #8
                Hi,

                I have run some other tests, with and without the option MINIMUM_ALLELE_PROBABILITY=1, and I'm not sure what it does.

                When the minimum allele probability is 1 instead of the default 0.1, does that mean more bases will be considered, because those with a probability below 1 are included? Or should MINIMUM_ALLELE_PROBABILITY be seen as a threshold, so that when I increase it, fewer bases are considered?


                • #9
                  Basically, this determines, for a coverage of X, the minimum number of times you have to see the variant allele. See the AlleleCoverageCutoffs class for the exact computation. Also check out the "minimum edge probability" in the paper: http://dx.doi.org/10.1186/gb-2010-11-10-r99.
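To make the idea concrete, here is a sketch of one rule that reproduces the "Allele coverage cutoffs" table printed in the logs above (MINIMUM_ALLELE_PROBABILITY=0.1, MINIMUM_ALLELE_COVERAGE=3). The rule itself is an assumption, not taken from the AlleleCoverageCutoffs source: for coverage n, take the smallest k such that P(X <= k) >= the minimum allele probability for X ~ Binomial(n, 0.5), and cap the result at the minimum allele coverage. The function and parameter names are hypothetical.

```python
from fractions import Fraction
from math import comb


def min_allele_coverage(coverage, min_allele_prob, max_cutoff):
    """Smallest k with P(X <= k) >= min_allele_prob, X ~ Binomial(coverage, 0.5),
    capped at max_cutoff.

    ASSUMED model: it matches the table in the log, but the real
    computation lives in SRMA's AlleleCoverageCutoffs class.
    """
    # Exact arithmetic avoids rounding trouble when min_allele_prob is 1.
    threshold = Fraction(str(min_allele_prob)) * 2 ** coverage
    cumulative = 0
    for k in range(coverage + 1):
        if k >= max_cutoff:
            return max_cutoff
        cumulative += comb(coverage, k)
        if cumulative >= threshold:
            return k
    return min(coverage, max_cutoff)


if __name__ == "__main__":
    print("coverage  cutoff(p=0.1)  cutoff(p=1)")
    for n in range(1, 11):
        print(n, min_allele_coverage(n, 0.1, 3), min_allele_coverage(n, 1.0, 3))
```

Under this assumed model, raising the probability from 0.1 to 1 raises the cutoff at low coverage (for example from 0 to 1 at coverage 1), so it acts as a threshold: a higher value means fewer variant alleles are kept, not more.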
