  • Strange CRAMtools 2.1 issue

    Hi all,

    We have accumulated quite a lot of BAM files over the years and we would like (need) to archive the old(er) ones. We have tested CRAMtools 2.1 and we are very happy with the results. Since we are talking about thousands of BAM files, we cannot go without automation. But...

    The problem is as follows:
    - when I run CRAMtools in a job file, it terminates immediately (throwing an exception).
    - when I copy the command from the job file and execute it on the head node or on the node it previously failed on, I have no issues whatsoever.

    My job file looks like this:
    Code:
    #PBS -N DNAXXXXXX
    #PBS -j oe
    #PBS -o /data/results/CRAM_archive/2013/october//DNAXXXXXX.log
    #PBS -r y
    #PBS -q bs_secondary
    #PBS -m ea
    #PBS -M [email protected]
    #PBS -l nodes=1:ppn=12
    cp /data/results/BGI/DNAXXXXXX/DNAXXXXXX.bam /scratch/
    cp /data/results/BGI/DNAXXXXXX/DNAXXXXXX.bam.bai /scratch/
    java -jar /share/apps/cramtools-2.1.jar cram -I /data/results/BGI/DNAXXXXXX/DNAXXXXXX.bam -O /scratch/DNAXXXXXX.cram -R /data/references/GrCh37/GrCh37_reference.fa --preserve-read-names --capture-all-tags -L *8
    java -cp /share/apps/cramtools-2.1.jar net.sf.cram.ValidateCramFile -I /scratch/DNAXXXXXX.cram -R /data/references/GrCh37/GrCh37_reference.fa
    mv /scratch/DNAXXXXXX.cram /data/results/CRAM_archive/2013/october//
    rm -f /scratch/DNAXXXXXX.bam
    rm -f /scratch/DNAXXXXXX.bam.bai
    The exception I get when submitting the job file to Torque:
    Code:
    Exception in thread "main" java.lang.reflect.InvocationTargetException
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
            at java.lang.reflect.Method.invoke(Method.java:606)
            at net.sf.cram.CramTools.invoke(CramTools.java:93)
            at net.sf.cram.CramTools.main(CramTools.java:123)
    Caused by: java.lang.RuntimeException: Uknown read or base category: 6
            at net.sf.cram.lossy.QualityScorePreservation.parseSinglePolicy(QualityScorePreservation.java:155)
            at net.sf.cram.lossy.QualityScorePreservation.parsePolicies(QualityScorePreservation.java:79)
            at net.sf.cram.lossy.QualityScorePreservation.<init>(QualityScorePreservation.java:39)
            at net.sf.cram.Bam2Cram.main(Bam2Cram.java:273)
            ... 6 more
    I have tried to include the full path to java to make sure the same version is used in all cases, but the problem remains.

    Does anyone have an idea what could cause this problem and how I can fix it?

    Many thanks,

    Rick

  • #2
    Have you tried just using samtools to make the CRAM files? That should work reasonably well and allow you to avoid whatever is causing this.


    • #3
      Thanks for the suggestion! We process around 5,000 diagnostic WES samples (Illumina) each year and, even though we are only archiving BAM files that have passed the mandatory storage term, we apply strict quality measurements to our data. We have validated the use of CRAMtools thoroughly, and switching to samtools would probably mean we have to redo the entire validation process. If possible, I would like to avoid this.

      But this problem only occurs with our Illumina data (BWA). When I use the same job file for older BAM files (SOLiD, lifescope), I have no problems at all.


      • #4
        Your best bet is to create an issue report on GitHub. According to the stack trace, the exception is thrown in QualityScorePreservation.parseSinglePolicy, which appears to be parsing an option that says what to do with quality scores.


        • #5
          Many thanks for your suggestion! I have posted the issue on GitHub and we have come to a solution:

          The code below used to work on our main server and it still works on another server when run through a job file. It still works on both servers when running from the command line directly on the node or on the head.

          java -jar /share/apps/cramtools-2.1.jar cram -I /data/results/BGI/DNAXXXXXX/DNAXXXXXX.bam -O /scratch/DNAXXXXXX.cram -R /data/references/GrCh37/GrCh37_reference.fa --preserve-read-names --capture-all-tags -L *8

          It stopped working on our main server when running the command from a job file. The solution was very simple: change -L *8 to -L '*8'
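
A likely explanation, not confirmed in the thread, is ordinary shell pathname expansion: an unquoted `*8` is replaced by the names of any entries in the current working directory that end in 8, and is only passed through literally when nothing matches (the default behaviour of sh and bash). The sketch below demonstrates this in a scratch directory. The helper `show_args` and the file name `run_2018` are made up for the illustration and have nothing to do with CRAMtools.

```shell
#!/bin/sh
# Illustrative only: show_args and run_2018 are hypothetical names.
# show_args prints its arguments the way a called program would receive them.
show_args() { printf '%s\n' "$*"; }

demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Empty directory: nothing matches *8, so the pattern is passed through as-is.
unquoted_empty=$(show_args -L *8)

# Once an entry ending in 8 exists, the unquoted pattern is replaced by its name.
touch run_2018
unquoted_match=$(show_args -L *8)

# Quoting suppresses the expansion, so the program always sees the literal *8.
quoted=$(show_args -L '*8')

printf 'unquoted, no match: %s\n' "$unquoted_empty"
printf 'unquoted, match:    %s\n' "$unquoted_match"
printf 'quoted:             %s\n' "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$demo_dir"
```

If this is what happened, the job and the interactive shell simply ran in different directories, and only one of them contained an entry ending in 8. Quoting the argument makes the command independent of the directory contents.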

          It seems like this issue is caused by a system/environment setting. I did have a look at this before, but I wasn't able to pinpoint why it works on one machine but not on the other.
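
One way to narrow down such a difference, offered as a sketch rather than a verified diagnosis, is to have the job report where it runs and what an unquoted `*8` would expand to there. Torque jobs normally start in the user's home directory rather than the directory the job was submitted from, which is one candidate for the difference between machines. The function name `list_star8_matches` is hypothetical; `PBS_O_WORKDIR` is the variable Torque sets to the submission directory.

```shell
#!/bin/sh
# Hypothetical helper: print every entry in directory $1 whose name ends in 8,
# i.e. what an unquoted *8 would expand to if the command ran in that directory.
list_star8_matches() {
    (
        cd "$1" 2>/dev/null || exit 1
        for f in *8; do
            # When nothing matches, the loop sees the literal pattern; skip it.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
        exit 0
    )
}

# Lines like these could be placed at the top of a job file for comparison
# with the same output from an interactive shell.
echo "working directory: $(pwd)"
echo "submission directory: ${PBS_O_WORKDIR:-unset}"
echo "entries matching *8 in the working directory:"
list_star8_matches "$(pwd)"
```

An empty list under the last heading means the unquoted pattern would reach the program unchanged from that directory.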
          Last edited by hreuver; 01-26-2015, 08:57 AM.
