
  • SRMA problem with NUM_THREADS

    Hello,

    I've run SRMA on a file without any error, but it took 74 hours, so I would like to use the NUM_THREADS option. But when I tried, I got this message:

    [Fri Mar 30 16:08:54 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 NUM_THREADS=8 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
    ** Warning: option NUM_THREADS currently may not increase performance significantly. **
    ** Try running multiple processes with RANGE if the speed does not increase. **
    Allele coverage cutoffs:
    coverage: 1 minimum allele coverage: 0
    coverage: 2 minimum allele coverage: 0
    coverage: 3 minimum allele coverage: 0
    coverage: 4 minimum allele coverage: 1
    coverage: 5 minimum allele coverage: 1
    coverage: 6 minimum allele coverage: 1
    coverage: 7 minimum allele coverage: 2
    coverage: 8 minimum allele coverage: 2
    coverage: 9 minimum allele coverage: 3
    coverage: >9 minimum allele coverage: 3
    Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
    at srma.Graph.addSAMRecord(Graph.java:54)
    at srma.SRMA$GraphThread.run(SRMA.java:708)

    Since it worked well on the same file without the NUM_THREADS option, I wonder whether the file needs some modification before running SRMA with NUM_THREADS, or whether there is an explanation for this message?

  • #2
    See the warning message for a better solution.



    • #3
      Thank you for answering so soon.
      I've seen the warning, but my problem is not about the run time now. I would like to understand why it no longer works when I add an option.

      I made another test. I ran SRMA with other options and got an error again:

      [Mon Apr 02 16:37:57 CEST 2012] srma.SRMA INPUT=[blabla.bfast.allBest.sort.bam.onTarget.bam] OUTPUT=[blabla_SRMArealigned_MHS100000_O100_MTC10000_MMQ10.bam] REFERENCE=hg19-ordre-valide.fa OFFSET=100 MIN_MAPQ=10 MAXIMUM_TOTAL_COVERAGE=10000 MAX_HEAP_SIZE=100000 MAX_QUEUE_SIZE=32768 MINIMUM_ALLELE_PROBABILITY=0.1 MINIMUM_ALLELE_COVERAGE=3 CORRECT_BASES=false USE_SEQUENCE_QUALITIES=true QUIET_STDERR=false GRAPH_PRUNING=false NUM_THREADS=1 TMP_DIR=/tmp/olivia VERBOSITY=INFO QUIET=false VALIDATION_STRINGENCY=STRICT COMPRESSION_LEVEL=5 MAX_RECORDS_IN_RAM=500000 CREATE_INDEX=false CREATE_MD5_FILE=false
      Allele coverage cutoffs:
      coverage: 1 minimum allele coverage: 0
      coverage: 2 minimum allele coverage: 0
      coverage: 3 minimum allele coverage: 0
      coverage: 4 minimum allele coverage: 1
      coverage: 5 minimum allele coverage: 1
      coverage: 6 minimum allele coverage: 1
      coverage: 7 minimum allele coverage: 2
      coverage: 8 minimum allele coverage: 2
      coverage: 9 minimum allele coverage: 3
      coverage: >9 minimum allele coverage: 3
      Records processsed: 1151528 (last chr1:115260225-115260274)java.lang.Exception: SAMRecord contig does not match the current reference sequence contig
      at srma.Graph.addSAMRecord(Graph.java:54)
      at srma.SRMA$GraphThread.run(SRMA.java:708)
      Please report bugs to [email protected]

      What bothers me is why it behaves so differently with the same input file. I would understand some differences in run time, but I don't get why it finds that "SAMRecord contig does not match the current reference sequence contig" when the input file and the reference file stay the same.

      I will run it again with RANGE to see if that solves the problem.
      Last edited by oliviajm; 04-02-2012, 11:26 PM.



      • #4
        Hello again,

        I noticed that I get the error every time I run several SRMA processes on the same input file at the same time, with different options and different output files.
        Is it possible that the error comes from several SRMA processes working on the same input file at once? I thought it would not be a problem because SRMA only reads the input file and does not modify it, but now I'm wondering whether the error is related.
        Last edited by oliviajm; 04-10-2012, 11:48 PM.



        • #5
          What version are you using? Can you give me a small test case (just a few SAM records) that reproduces the error? Can you try just running it on one chromosome at a time?



          • #6
            I'm using srma-0.1.15.jar.

            You can download a file containing the chr1 lines of the SAM file I'm using here: http://dl.free.fr/vcBPNsKpn
            SRMA crashed at this point on my last try (Records processsed: 1152262 (last chr1:115260225-115260274)).

            I have tried running SRMA on one chromosome at a time, and it worked. But I find this approach more complicated: it needs more steps, and so more time.
            Thanks for spending time on this issue.



            • #7
              Why not write a Perl or shell/grep script to divide your file up by chromosome and run SRMA on each part?
              I don't know how easy it is to recombine the output, though.
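
A minimal sketch of that splitting step, in Python rather than Perl or shell. It assumes the input is plain SAM text, where header lines start with "@" and the third tab-separated column is the reference (chromosome) name. The function name `split_sam_by_contig` is made up for illustration and is not part of SRMA.

```python
from collections import defaultdict


def split_sam_by_contig(lines):
    """Group SAM lines by reference name (column 3).

    Returns a dict mapping each contig name to a list of lines:
    all header lines first, then that contig's alignment records
    in their original order.
    """
    header = []
    records = defaultdict(list)
    for line in lines:
        if line.startswith("@"):
            # Header lines (@HD, @SQ, @RG, ...) are copied to every group.
            header.append(line)
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            # Skip blank or malformed lines rather than guessing.
            continue
        records[fields[2]].append(line)
    return {contig: header + recs for contig, recs in records.items()}


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python split_sam.py input.sam
    # Writes one file per contig, e.g. input.sam.chr1.sam
    path = sys.argv[1]
    with open(path) as handle:
        groups = split_sam_by_contig(handle)
    for contig, group in groups.items():
        with open(f"{path}.{contig}.sam", "w") as out:
            out.writelines(group)
```

Unmapped reads (reference name "*") end up in their own group. For BAM input, extracting a region from an indexed file with samtools view and recombining the per-chromosome results with samtools merge are the usual routes, though I have not checked how SRMA output behaves when merged.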



              • #8
                Hi,

                I have run some other tests, with and without the option MINIMUM_ALLELE_PROBABILITY=1, and I'm not sure what it does.

                When the minimum allele probability is 1 instead of the default value of 0.1, does that mean more bases will be considered, because it will include those for which the probability is less than 1? Or should MINIMUM_ALLELE_PROBABILITY be seen as a threshold, so that when I increase it, fewer bases will be considered?



                • #9
                  Basically, this determines, for a coverage of X, the minimum number of times you have to see the variant allele. See the AlleleCoverageCutoffs class for the exact computation. Also check out the "minimum edge probability" in the paper: http://dx.doi.org/10.1186/gb-2010-11-10-r99.
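
For what it's worth, the "Allele coverage cutoffs" table printed in the logs earlier in this thread can be reproduced with a simple binomial model. This is only a sketch under assumptions, not the actual AlleleCoverageCutoffs code. It assumes the cutoff for coverage n is the smallest k whose lower-tail probability P(X <= k) under Binomial(n, 0.5) reaches MINIMUM_ALLELE_PROBABILITY, capped at MINIMUM_ALLELE_COVERAGE. The function name `min_allele_coverage` and its parameter names are made up for illustration.

```python
from math import comb


def min_allele_coverage(coverage, min_prob=0.1, max_cutoff=3):
    """Smallest k with P(X <= k) >= min_prob for X ~ Binomial(coverage, 0.5),
    capped at max_cutoff.

    ASSUMPTION: this model was chosen because it matches the logged table;
    it has not been checked against the SRMA source.
    """
    # P(X <= k) = (sum of C(n, i) for i <= k) / 2**n, so compare the
    # integer running count against min_prob * 2**n.
    threshold = min_prob * 2 ** coverage
    cumulative = 0
    for k in range(coverage + 1):
        cumulative += comb(coverage, k)
        if cumulative >= threshold:
            return min(k, max_cutoff)
    # Only reached if min_prob > 1; fall back to the strictest value.
    return min(coverage, max_cutoff)


if __name__ == "__main__":
    for n in range(1, 10):
        print(f"coverage: {n} minimum allele coverage: {min_allele_coverage(n)}")
```

Under this assumed model, raising MINIMUM_ALLELE_PROBABILITY can only raise the cutoffs (up to the MINIMUM_ALLELE_COVERAGE cap), so it behaves as a threshold: a higher value means fewer alleles pass. Treat that as a reading of the logged numbers and confirm it against the AlleleCoverageCutoffs class.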
