  • kga1978
    Senior Member
    • Nov 2010
    • 100

    GATK IndelRealigner error

    Hi All,
    I am trying to perform a local realignment of a BAM file generated with Novoalign. I run the following commands:

    Command 1:
    novoalign -f reads.fastq.gz -c 2 -d Mosaik/reference -o SAM 2> reads.novoalign_logS0.txt | samtools view -S -b -q 1 - | samtools sort - reads

    Command 2:
    java -Xmx2g -jar /usr/local/bin/picard/SortSam.jar I=reads.bam O=readssorted.bam SO=coordinate

    Command 3:
    java -Xmx2g -jar /usr/local/bin/gatk/GenomeAnalysisTK.jar -I readssorted.bam -R Mosaik/reference.fasta -T RealignerTargetCreator -o forIndelRealigner.intervals

    Command 4:
    java -Xmx2g -jar /usr/local/bin/gatk/GenomeAnalysisTK.jar -I readssorted.bam -R Mosaik/reference.fasta -T IndelRealigner --targetIntervals forIndelRealigner.intervals -o realignedBam.bam

    The RealignerTargetCreator (3) finishes successfully and creates the required file:

    LASV-reference:52-53
    LASV-reference:305-339
    LASV-reference:439-519

    However, when I run the last command (4) - IndelRealigner, I get the following error:

    ##### ERROR MESSAGE: File associated with name forIndelRealigner.intervals is malformed: Interval file could not be parsed in any supported format. caused by Failed to parse Genome Location string: LASV-reference:52-53

    Any idea what might be the problem here? I have tried various things, but I always fail here.

    My reference.dict looks like this:
    @HD VN:1.0 SO:unsorted
    @SQ SN:LASV-reference LN:3402 UR:file:/Users/kga/Desktop/Mosaik/reference.fasta M5:8a4c76005c28bef3f2775dbf6ffa2062

    reference.fasta.fai:
    LASV-reference 3402 89 3402 3403


    Thanks very much,
    Kristian
    Last edited by kga1978; 11-15-2011, 08:29 PM.
  • Heisman
    Senior Member
    • Dec 2010
    • 534

    #2
    Could you maybe just post the first couple dozen lines of the various files used (or if a bunch of lines look similar just a representative line)?


    • kga1978
      Senior Member
      • Nov 2010
      • 100

      #3
      Sure thing:

      Reference:
      >LASV-reference
      GCGCACAGTGGATCCTAGGCATTTTTGGTTGCGCAATTCAAGTGTCCTATTTAAAATGGGACAAATAGTGACATTCTTCCAGGAAGTGCCTCATGTAATAGAAGAGGTGATGAACATTGTTCTCATTGCACTGTCTGTACTAGCAGTGCTGAAAGGTCTGTACAATTTTGCAACGTGTGGCCTTGTTGGTTTGGTCACTTTCCTCCTGTTGTGTGGTAGGTCTTGCACAACCAGTCTTTATAAAGGGGTTTATGAGCTTCAGACTCTGGAACTAAACATGGAGACACTCAATATGACCATGCCTCTCTCCTGCACAAAGAACAACAGTCATCATTATATAATGGTGGGCAATGAGACAGGACTAGAACTGACCTTGACCAACACGAGCATTATTAATCACAAATTTTGCAATCTGTCTGATGCCCACAAAAAGAACCTCTATGACCACGCTCTTATGAGCATAATCTCAACTTTCCACTTGTCCATCCCCAACTTCAATCAGTATGAGGCAATGAGCTGCGATTTTAATGGGGGAAA

      Sorted BAM file:
      ILLUMINA_0142:3:1108:12467:139455#TGACCA/1 0 LASV-reference 2892 20 1S51M * 0 0 GTCTTTGGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACT Z_^cc`ce^aeegedghe_gggcfdhdhhaX^dfghfhhhdhdedg_dfdgh RG:Z:ZGO3HPVJRLW NM:i:2 MD:Z:24T1G24 ZA:Z:<@;0;0;;1;;>
      ILLUMINA_0142:3:1104:8199:92212#TGACCA/1 0 LASV-reference 2893 19 52M * 0 0 CTTTGGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACTCA ^__cc``Yaa^b`beefhehhddf]dfgfhhRabcdbg`fffbcffghhfhf RG:Z:ZGO3HPVJRLW NM:i:3 MD:Z:23T1G24T1 ZA:Z:<@;0;0;;1;;>
      ILLUMINA_0142:3:1108:20971:8153#TGACCA/1 16 LASV-reference 2893 19 52M * 0 0 CTTTGGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACTCA caQRccbeefcecb^PI[dXeb[X`hefdXbSSgbSd_`Qb[eecSc``__^ RG:Z:ZGO3HPVJRLW NM:i:3 MD:Z:23T1G24T1 ZA:Z:<@;0;0;;1;;>
      ILLUMINA_0142:3:2102:12125:81885#TGACCA/1 16 LASV-reference 2894 18 51M1S * 0 0 TTTGGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACTCAG Z_f^fd_abhgbd`cfbdbbYJ`JRe\gebXec`e`Yb[cbabba\`cc__\ RG:Z:ZGO3HPVJRLW NM:i:3 MD:Z:22T1G24T1 ZA:Z:<@;0;0;;1;;>
      ILLUMINA_0142:3:1208:5666:190436#TGACCA/1 16 LASV-reference 2895 20 52M * 0 0 TTGGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACTCAAT c]dhee^ee^^Hehfe_deebdeZeehebgd_gafabQJJeeeeccccc___ RG:Z:ZGO3HPVJRLW NM:i:3 MD:Z:21T1G24T3 ZA:Z:<@;0;0;;1;;>
      ILLUMINA_0142:3:1204:18832:77734#TGACCA/1 0 LASV-reference 2897 19 49M * 0 0 GGTCAAGTTGCTGTGAGCTCAAGTTGCCCATATAGACACCTGCACTCAA abaeeeecggfggghhgfhihiiggiiiiiiiiiihiiiiiiiihiiih RG:Z:ZGO3HPVJRLW NM:i:3 MD:Z:19T1G24T2 ZA:Z:<@;0;0;;1;;>

      The .intervals, .fai and .dict files are exactly as described above - no further text in those.

      Thanks very much
      Last edited by kga1978; 11-16-2011, 04:56 PM. Reason: typo


      • kga1978
        Senior Member
        • Nov 2010
        • 100

        #4
        Does anybody have any thoughts? This is driving me nuts, and SRMA doesn't appear to be working either (separate post).

        Thanks in advance.


        • swbarnes2
          Senior Member
          • May 2008
          • 910

          #5
          Does your fasta file really say "reference", and not "LASV-reference"?
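A name mismatch like the one asked about here can be checked mechanically. The following is a minimal sketch, not part of GATK: `check_interval_contigs` is a hypothetical helper that assumes GATK-style `contig:start-end` lines and a standard `.fai` index whose first column is the contig name. It treats everything before the last `:` as the contig, so names containing `-` (such as `LASV-reference`) are compared whole.

```shell
# check_interval_contigs INTERVALS FAI   (hypothetical helper, not a GATK tool)
# Prints every interval line whose contig name is missing from the .fai index
# and returns a non-zero status if any such line is found.
check_interval_contigs() {
    awk '
        NR == FNR { known[$1] = 1; next }   # first file (.fai): column 1 is the contig name
        /^[[:space:]]*$/ { next }           # ignore blank lines in the interval file
        {
            contig = $0
            sub(/:[^:]*$/, "", contig)      # drop the trailing ":start-end" part
            if (!(contig in known)) {
                print "unknown contig: " $0
                bad = 1
            }
        }
        END { exit (bad ? 1 : 0) }
    ' "$2" "$1"
}

# Example call with the file names used in this thread:
#   check_interval_contigs forIndelRealigner.intervals Mosaik/reference.fasta.fai
```

If this prints nothing, the names agree and the parse failure has some other cause.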


          • kga1978
            Senior Member
            • Nov 2010
            • 100

            #6
            Sorry, my mistake. I had tried making another reference named just 'reference', but the one I have been using is correctly named 'LASV-reference'. I have corrected the typo.


            • Bukowski
              Senior Member
              • Jan 2010
              • 388

              #7
              You do know you can get in touch directly with the GATK team here:



              They're very responsive to questions.


              • JLand52
                Junior Member
                • Jul 2009
                • 1

                #8
                GATK is picky about the file name. Try changing the extension to ".interval_list"
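The suggestion above could be tried along these lines. This is only a sketch: `to_interval_list` is a hypothetical helper that copies the file under the new extension and leaves the original in place, and the thread does not confirm whether the rename alone resolves the error.

```shell
# to_interval_list FILE   (hypothetical helper)
# Copies FILE to the same path with its extension replaced by .interval_list
# and prints the new path. The original file is not modified.
to_interval_list() {
    src=$1
    dest="${src%.*}.interval_list"
    cp -- "$src" "$dest" && printf '%s\n' "$dest"
}

# Example use with command 4 from the first post:
#   renamed=$(to_interval_list forIndelRealigner.intervals)
#   java -Xmx2g -jar /usr/local/bin/gatk/GenomeAnalysisTK.jar -I readssorted.bam \
#       -R Mosaik/reference.fasta -T IndelRealigner \
#       --targetIntervals "$renamed" -o realignedBam.bam
```

Copying rather than moving keeps the output of RealignerTargetCreator available if the rename makes no difference.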
