  • GATK interval_list file header format and errors

    Hi All,
    I am trying to use GATK UnifiedGenotyper with the -L option. The command works fine without the option but fails with it.
    ##################
    @HD VN:1.0 SO:unsorted
    @SQ SN:chr1 LN:249250621 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1b22b98cdeb4a9304cb5d48026a85128 SP:Homo Sapien
    @SQ SN:chr2 LN:243199373 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a0d9851da00400dec1098a9255ac712e SP:Homo Sapien
    @SQ SN:chr3 LN:198022430 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:641e4338fa8d52a5b781bd2a2c08d3c3 SP:Homo Sapien
    @SQ SN:chr4 LN:191154276 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:23dccd106897542ad87d2765d28a19a1 SP:Homo Sapien
    @SQ SN:chr5 LN:180915260 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0740173db9ffd264d728f32784845cd7 SP:Homo Sapien
    @SQ SN:chr6 LN:171115067 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1d3a93a248d92a729ee764823acbbc6b SP:Homo Sapien
    @SQ SN:chr7 LN:159138663 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:618366e953d6aaad97dbe4777c29375e SP:Homo Sapien
    @SQ SN:chr8 LN:146364022 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:96f514a9929e410c6651697bded59aec SP:Homo Sapien
    @SQ SN:chr9 LN:141213431 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:3e273117f15e0a400f01055d9f393768 SP:Homo Sapien
    @SQ SN:chr10 LN:135534747 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:988c28e000e84c26d552359af1ea2e1d SP:Homo Sapien
    @SQ SN:chr11 LN:135006516 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98c59049a2df285c76ffb1c6db8f8b96 SP:Homo Sapien
    @SQ SN:chr12 LN:133851895 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:51851ac0e1a115847ad36449b0015864 SP:Homo Sapien
    @SQ SN:chr13 LN:115169878 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:283f8d7892baa81b510a015719ca7b0b SP:Homo Sapien
    @SQ SN:chr14 LN:107349540 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98f3cae32b2a2e9524bc19813927542e SP:Homo Sapien
    @SQ SN:chr15 LN:102531392 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:e5645a794a8238215b2cd77acb95a078 SP:Homo Sapien
    @SQ SN:chr16 LN:90354753 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:fc9b1a7b42b97a864f56b348b06095e6 SP:Homo Sapien
    @SQ SN:chr17 LN:81195210 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:351f64d4f4f9ddd45b35336ad97aa6de SP:Homo Sapien
    @SQ SN:chr18 LN:78077248 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c SP:Homo Sapien
    @SQ SN:chr19 LN:59128983 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1aacd71f30db8e561810913e0b72636d SP:Homo Sapien
    @SQ SN:chr20 LN:63025520 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0dec9660ec1efaaf33281c0d5ea2560f SP:Homo Sapien
    @SQ SN:chr21 LN:48129895 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:2979a6085bfe28e3ad6f552f361ed74d SP:Homo Sapien
    @SQ SN:chr22 LN:51304566 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a718acaa6135fdca8357d5bfe94211dd SP:Homo Sapien
    @SQ SN:chrX LN:155270560 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:7e0e2e580297b7764e31dbc80c2540dd SP:Homo Sapien
    @SQ SN:chrY LN:59373566 AS:hg19 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1e86411d73e6f00a10590f976be01623 SP:Homo Sapien
    chr1 564490 564532 + target_1
    chr1 564533 564534 + target_2
    chr1 564672 564718 + target_3
    chr1 564720 564721 + target_4
    ##########################################
    This is the second header I tried:
    @HD VN:1.0 SO:unsorted
    @SQ SN:chr1 LN:249250621 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr2 LN:243199373 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr3 LN:198022430 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr4 LN:191154276 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr5 LN:180915260 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr6 LN:171115067 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr7 LN:159138663 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr8 LN:146364022 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr9 LN:141213431 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr10 LN:135534747 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr11 LN:135006516 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr12 LN:133851895 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr13 LN:115169878 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr14 LN:107349540 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr15 LN:102531392 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr16 LN:90354753 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr17 LN:81195210 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr18 LN:78077248 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr19 LN:59128983 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr20 LN:63025520 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr21 LN:48129895 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chr22 LN:51304566 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chrX LN:155270560 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    @SQ SN:chrY LN:59373566 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa
    chr1 564490 564532 + target_1
    chr1 564533 564534 + target_2
    chr1 564672 564718 + target_3
    chr1 564720 564721 + target_4
    #####################
    I do not understand the exact header format. I googled a lot but could not find out what the header should look like.
    #############
    This is the error I am getting:

    INFO 17:05:07,240 ProgressMeter - Location processed.sites runtime per.1M.sites completed total.runtime remaining
    INFO 17:05:30,725 ProgressMeter - done 1.47e+07 23.0 s 1.0 s 100.0% 23.0 s 0.0 s
    INFO 17:05:30,726 ProgressMeter - Total runtime 23.49 secs, 0.39 min, 0.01 hours
    INFO 17:05:30,728 MicroScheduler - 0 reads were filtered out during the traversal out of approximately 332130 total reads (0.00%)
    INFO 17:05:30,728 MicroScheduler - -> 0 reads (0.00% of total) failing BadMateFilter
    INFO 17:05:30,728 MicroScheduler - -> 0 reads (0.00% of total) failing DuplicateReadFilter
    INFO 17:05:30,729 MicroScheduler - -> 0 reads (0.00% of total) failing FailsVendorQualityCheckFilter
    INFO 17:05:30,730 MicroScheduler - -> 0 reads (0.00% of total) failing MalformedReadFilter
    INFO 17:05:30,730 MicroScheduler - -> 0 reads (0.00% of total) failing MappingQualityUnavailableFilter
    INFO 17:05:30,730 MicroScheduler - -> 0 reads (0.00% of total) failing NotPrimaryAlignmentFilter
    INFO 17:05:30,731 MicroScheduler - -> 0 reads (0.00% of total) failing UnmappedReadFilter
    INFO 17:05:31,962 GATKRunReport - Uploaded run statistics report to AWS S3
    ###########################

  • #2
    I tried this header too
    @HD VN:1.0 SO:unsorted
    @SQ SN:chr1 LN:249250621 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1b22b98cdeb4a9304cb5d48026a85128
    @SQ SN:chr2 LN:243199373 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a0d9851da00400dec1098a9255ac712e
    @SQ SN:chr3 LN:198022430 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:641e4338fa8d52a5b781bd2a2c08d3c3
    @SQ SN:chr4 LN:191154276 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:23dccd106897542ad87d2765d28a19a1
    @SQ SN:chr5 LN:180915260 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0740173db9ffd264d728f32784845cd7
    @SQ SN:chr6 LN:171115067 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1d3a93a248d92a729ee764823acbbc6b
    @SQ SN:chr7 LN:159138663 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:618366e953d6aaad97dbe4777c29375e
    @SQ SN:chr8 LN:146364022 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:96f514a9929e410c6651697bded59aec
    @SQ SN:chr9 LN:141213431 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:3e273117f15e0a400f01055d9f393768
    @SQ SN:chr10 LN:135534747 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:988c28e000e84c26d552359af1ea2e1d
    @SQ SN:chr11 LN:135006516 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98c59049a2df285c76ffb1c6db8f8b96
    @SQ SN:chr12 LN:133851895 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:51851ac0e1a115847ad36449b0015864
    @SQ SN:chr13 LN:115169878 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:283f8d7892baa81b510a015719ca7b0b
    @SQ SN:chr14 LN:107349540 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98f3cae32b2a2e9524bc19813927542e
    @SQ SN:chr15 LN:102531392 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:e5645a794a8238215b2cd77acb95a078
    @SQ SN:chr16 LN:90354753 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:fc9b1a7b42b97a864f56b348b06095e6
    @SQ SN:chr17 LN:81195210 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:351f64d4f4f9ddd45b35336ad97aa6de
    @SQ SN:chr18 LN:78077248 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c
    @SQ SN:chr19 LN:59128983 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1aacd71f30db8e561810913e0b72636d
    @SQ SN:chr20 LN:63025520 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0dec9660ec1efaaf33281c0d5ea2560f
    @SQ SN:chr21 LN:48129895 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:2979a6085bfe28e3ad6f552f361ed74d
    @SQ SN:chr22 LN:51304566 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a718acaa6135fdca8357d5bfe94211dd
    @SQ SN:chrX LN:155270560 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:7e0e2e580297b7764e31dbc80c2540dd
    @SQ SN:chrY LN:59373566 /mnt/idash/Genomics/data_ressources/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1e86411d73e6f00a10590f976be01623
    chr1 564490 564532 + target_1
    chr1 564533 564534 + target_2
    chr1 564672 564718 + target_3
    chr1 564720 564721 + target_4
    chr1 564722 564724 + target_5


    • #3
      GATK -L interval files do not need a header. They should be this format:

      chr1:1000-1200
      chr1:2004-2507
      chr2:457290-457400

      etc...
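A minimal sketch, not from the thread, of converting column-style lines such as those in the first post into the `chr:start-end` form shown above. The function names are hypothetical, and it assumes the input coordinates are already 1-based and that anything after the third column (strand, target name) can be dropped.

```python
def to_gatk_interval(line):
    """Turn 'chr start end [extra columns]' into 'chr:start-end'."""
    fields = line.split()  # splits on any run of spaces or tabs
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got: {line!r}")
    contig, start, end = fields[0], int(fields[1]), int(fields[2])
    return f"{contig}:{start}-{end}"


def convert_lines(lines):
    """Convert every data line; blank lines and '@' header lines are skipped."""
    return [
        to_gatk_interval(line)
        for line in lines
        if line.strip() and not line.startswith("@")
    ]


if __name__ == "__main__":
    example = [
        "@HD VN:1.0 SO:unsorted",
        "chr1 564490 564532 + target_1",
        "chr1 564533 564534 + target_2",
    ]
    print("\n".join(convert_lines(example)))
```

Dropping the extra columns loses the target names, so keep the original file if you need them later.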


      • #4
        Can also be tab separated:
        Code:
        CHR   POS1   POS2


        • #5
          It is not working. I did it like this:
          chr1:564490-564532 + target_1
          chr1:564533-564534 + target_2
          chr1:564672-564718 + target_3
          chr1:564720-564721 + target_4
          chr1:564722-564724 + target_5
          chr1:564739-564741 + target_6
          chr1:564771-564774 + target_7
          chr1:564776-564807 + target_8
          chr1:564898-564956 + target_9
          chr1:564965-564966 + target_10
          ########################
          ##### ERROR MESSAGE: File associated with name /mnt/oncogxA/anusha/DGFDATA_Tissue_DATA/intersectalltissues/hg19_lite_copy_HR5.interval_list is malformed: Interval file could not be parsed in any supported format. caused by Failed to parse Genome Location string: chr1:564490-564532 + target_1
          ##### ERROR --


          • #6
            chr1 564490 564532
            chr1 564533 564534
            chr1 564672 564718
            chr1 564720 564721
            chr1 564722 564724
            chr1 564739 564741
            chr1 564771 564774
            chr1 564776 564807
            ##### ERROR MESSAGE: Badly formed genome loc: Contig 'chr1 564490 564532' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?
            ##### ERROR -----
            I have used the same reference FASTA file many times and it worked fine.


            • #7
              Are your chromosome names "chr1", "chr2", "chr3" etc?

              They seem to be "chr2 LN:243199373" or something from previous posts.

              You have to use the exact same chromosome naming format that is in the reference fasta. Otherwise the software cannot find the chromosome!

              When I first encountered this error, I made a FASTA with chromosomes named "1", "2", "3" etc. for ease of use. Might be an idea.
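One way to check the naming point above is to compare the contigs in the interval file against the `SN:` values of the reference's sequence dictionary. This is only a sketch with hypothetical helper names. It assumes the `.dict` file is tab-delimited like a SAM header and that the intervals are already in `chr:start-end` form.

```python
def contigs_from_dict(dict_text):
    """Collect the SN: values from the @SQ lines of a sequence dictionary."""
    names = []
    for line in dict_text.splitlines():
        fields = line.strip().split("\t")
        if fields[0] != "@SQ":
            continue
        for field in fields[1:]:
            if field.startswith("SN:"):
                names.append(field[3:])
    return names


def unknown_contigs(intervals, known):
    """Return contigs used in the intervals that are not in the dictionary."""
    known = set(known)
    missing = []
    for interval in intervals:
        contig = interval.rsplit(":", 1)[0]
        if contig not in known and contig not in missing:
            missing.append(contig)
    return missing


if __name__ == "__main__":
    dict_text = "@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:249250621\n"
    names = contigs_from_dict(dict_text)
    print(unknown_contigs(["chr1:1000-1200", "1:2004-2507"], names))
```

Any name this reports (for example `1` when the reference uses `chr1`) is one GATK would not be able to find in the reference.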
              Last edited by bruce01; 06-26-2014, 08:17 AM. Reason: Clarity


              • #8
                1 564490 564532 + target_1
                1 564533 564534 + target_2
                1 564672 564718 + target_3
                1 564720 564721 + target_4
                1 564722 564724 + target_5
                1 564739 564741 + target_6
                1 564771 564774 + target_7
                1 564776 564807 + target_8
                1 564898 564956 + target_9
                1 564965 564966 + target_10

                I tried it like this too, but it did not work.


                • #9
                  @HD VN:1.0 SO:unsorted
                  @SQ SN:chr1 LN:249250621 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1b22b98cdeb4a9304cb5d48026a85128
                  @SQ SN:chr2 LN:243199373 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a0d9851da00400dec1098a9255ac712e
                  @SQ SN:chr3 LN:198022430 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:641e4338fa8d52a5b781bd2a2c08d3c3
                  @SQ SN:chr4 LN:191154276 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:23dccd106897542ad87d2765d28a19a1
                  @SQ SN:chr5 LN:180915260 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0740173db9ffd264d728f32784845cd7
                  @SQ SN:chr6 LN:171115067 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1d3a93a248d92a729ee764823acbbc6b
                  @SQ SN:chr7 LN:159138663 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:618366e953d6aaad97dbe4777c29375e
                  @SQ SN:chr8 LN:146364022 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:96f514a9929e410c6651697bded59aec
                  @SQ SN:chr9 LN:141213431 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:3e273117f15e0a400f01055d9f393768
                  @SQ SN:chr10 LN:135534747 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:988c28e000e84c26d552359af1ea2e1d
                  @SQ SN:chr11 LN:135006516 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98c59049a2df285c76ffb1c6db8f8b96
                  @SQ SN:chr12 LN:133851895 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:51851ac0e1a115847ad36449b0015864
                  @SQ SN:chr13 LN:115169878 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:283f8d7892baa81b510a015719ca7b0b
                  @SQ SN:chr14 LN:107349540 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:98f3cae32b2a2e9524bc19813927542e
                  @SQ SN:chr15 LN:102531392 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:e5645a794a8238215b2cd77acb95a078
                  @SQ SN:chr16 LN:90354753 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:fc9b1a7b42b97a864f56b348b06095e6
                  @SQ SN:chr17 LN:81195210 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:351f64d4f4f9ddd45b35336ad97aa6de
                  @SQ SN:chr18 LN:78077248 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:b15d4b2d29dde9d3e4f93d1d0f2cbc9c
                  @SQ SN:chr19 LN:59128983 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1aacd71f30db8e561810913e0b72636d
                  @SQ SN:chr20 LN:63025520 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0dec9660ec1efaaf33281c0d5ea2560f
                  @SQ SN:chr21 LN:48129895 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:2979a6085bfe28e3ad6f552f361ed74d
                  @SQ SN:chr22 LN:51304566 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a718acaa6135fdca8357d5bfe94211dd
                  @SQ SN:chrX LN:155270560 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:7e0e2e580297b7764e31dbc80c2540dd
                  @SQ SN:chrY LN:59373566 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1e86411d73e6f00a10590f976be01623
                  1 564490 564532 + target_1
                  1 564533 564534 + target_2
                  1 564672 564718 + target_3

                  It gives me something like this:

                  WARNING 2014-06-26 09:32:07 IntervalList Ignoring interval for unknown reference: X:153990999-153991009
                  WARNING 2014-06-26 09:32:07 IntervalList Ignoring interval for unknown reference: X:153991010-153991011
                  WARNING 2014-06-26 09:32:07 IntervalList Ignoring interval for unknown reference: X:153991012-153991016
                  WARNING 2014-06-26 09:32:07 IntervalList Ignoring interval for unknown reference: Y:8979526-8979550
                  WARN 09:32:07,516 IntervalUtils - The interval file /mnt/oncogxA/anusha/DGFDATA_Tissue_DATA/intersectalltissues/test.interval_list contains no intervals that could be parsed.
                  INFO 09:32:07,518 IntervalUtils - Processing 0 bp from intervals
                  WARN 09:32:07,520 GenomeAnalysisEngine - The given combination of -L and -XL options results in an empty set. No intervals to process.
                  INFO 09:32:07,611 GenomeAnalysisEngine - Preparing for traversal over 1 BAM files
                  INFO 09:32:07,614 GenomeAnalysisEngine - Done preparing for traversal


                  • #10
                    Your reference chromosome names are: "chr1 LN:249250621" etc.

                    So make your interval file with those:

                    Code:
                    chr1	LN:249250621	564490	564532
                    chr1	LN:249250621	564533	564534
                    chr1	LN:249250621	564672	564718
                    chr1	LN:249250621	564720	564721
                    Unless you do this, GATK cannot parse the interval file.


                    • #11
                      achimmiri@idash-cloud-707:/mnt/oncogxA/anusha/DGFDATA_Tissue_DATA/intersectalltissues$ head test1.subtract.interval_list
                      chr1 LN:249250621 564490 564532
                      chr1 LN:249250621 564533 564534
                      chr1 LN:249250621 564672 564718
                      chr1 LN:249250621 564720 564721
                      chr1 LN:249250621 564722 564724
                      I used this with and without the header:
                      @HD VN:1.0 SO:unsorted
                      @SQ SN:chr1 LN:249250621 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:1b22b98cdeb4a9304cb5d48026a85128
                      @SQ SN:chr2 LN:243199373 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:a0d9851da00400dec1098a9255ac712e
                      @SQ SN:chr3 LN:198022430 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:641e4338fa8d52a5b781bd2a2c08d3c3
                      @SQ SN:chr4 LN:191154276 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:23dccd106897542ad87d2765d28a19a1
                      @SQ SN:chr5 LN:180915260 UR:file:/raid/references-and-indexes/hg19/hg19_lite/hg19_lite.fa M5:0740173db9ffd264d728f32784845cd7

                      But it is still not working in any way.


                      • #12
                        I am still not able to figure out the exact format for the interval list. I have tried everything so far and none of it worked. I checked my .dict file, and my header is exactly the same as the .dict file,
                        but I am not sure what the body should look like.


                        • #14
                          I have run into the same exact problems!
                          I have been trying many formats with different file suffixes (.bed, .list, .intervals, .interval_list), but only .bed and .interval_list have worked for me so far.

                          .bed format:
                          <chr> <start> <end>

                          .interval_list format:
                          <chr>:<start>-<end>

                          However, since .bed is 0-based and .interval_list wouldn't allow any annotation after the interval on each line, I really want to get the .list format working.
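To make the coordinate difference concrete: BED uses a 0-based start and an exclusive end, while `chr:start-end` intervals are 1-based and inclusive, so only the start shifts by one. A small sketch with hypothetical function names:

```python
def gatk_to_bed(interval):
    """'chr1:69089-70010' (1-based, inclusive) -> 'chr1<TAB>69088<TAB>70010'."""
    contig, span = interval.rsplit(":", 1)
    start, end = (int(x) for x in span.split("-"))
    return f"{contig}\t{start - 1}\t{end}"


def bed_to_gatk(bed_line):
    """First three BED columns (0-based, half-open) -> 'chr:start-end'."""
    contig, start, end = bed_line.split("\t")[:3]
    return f"{contig}:{int(start) + 1}-{int(end)}"


if __name__ == "__main__":
    print(gatk_to_bed("chr1:69089-70010"))
    print(bed_to_gatk("chr1\t69088\t70010\tA"))
```

Because BED lines may carry extra columns after the first three, this is one way to keep an annotation next to each interval, at the cost of the shifted start.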

                          I tried the GATK recommendation for the .list format and even tried copying it, but neither worked for me.

                          I would like to know how AnushaC ended up solving this problem.


                          • #15
                            So I decided to test which formats can be used and got the following results:

                            command:
                            java -jar ../GenomeAnalysisTK-3.3-0/GenomeAnalysisTK.jar -T DepthOfCoverage -I <in.bam> -R <hg19.fa> -L <test.list> --interval_merging OVERLAPPING_ONLY -o <out.file>
                            ---------------------------------------------------------------------------------------
                            test1.list

                            chr1 69089 70010
                            chr1 367657 368599
                            chr1 621094 622036
                            chr1 861320 861395
                            chr1 865533 865718

                            test1 result:
                            ##### ERROR MESSAGE: Badly formed genome loc: Contig 'chr1 69089 70010' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?
                            ---------------------------------------------------------------------------------------
                            test2.list

                            chr1:69089-70010
                            chr1:367657-368599
                            chr1:621094-622036
                            chr1:861320-861395
                            chr1:865533-865718

                            test2 result:
                            no error message and all results have been correctly calculated.
                            ---------------------------------------------------------------------------------------
                            test3.list (this format is recommended on the gatk website)

                            @HD VN:1.0 SO:unsorted
                            @SQ SN:chr1 LN:249250621 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:1b22b98cdeb4a9304cb5d48026a85128
                            @SQ SN:chr2 LN:243199373 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:a0d9851da00400dec1098a9255ac712e
                            ...
                            ...
                            @SQ SN:chrUn_gl000248 LN:39786 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:5a8e43bec9be36c7b49c84d585107776
                            @SQ SN:chrUn_gl000249 LN:38502 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:1d78abec37c15fe29a275eb08d5af236
                            chr1 69089 70010
                            chr1 367657 368599
                            chr1 621094 622036
                            chr1 861320 861395
                            chr1 865533 865718

                            test3 result:
                            ##### ERROR
                            ##### ERROR MESSAGE: File associated with name test3.list is malformed: Interval file could not be parsed in any supported format. caused by Failed to parse Genome Location string: @HD VN:1.0 SO:unsorted
                            ---------------------------------------------------------------------------------------
                            test4.list

                            @HD VN:1.0 SO:unsorted
                            @SQ SN:chr1 LN:249250621 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:1b22b98cdeb4a9304cb5d48026a85128
                            @SQ SN:chr2 LN:243199373 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:a0d9851da00400dec1098a9255ac712e
                            ...
                            ...
                            @SQ SN:chrUn_gl000248 LN:39786 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:5a8e43bec9be36c7b49c84d585107776
                            @SQ SN:chrUn_gl000249 LN:38502 UR:file:/humgen/gsa-hpprojects/GATK/bundle/ucsc.hg19/ucsc.hg19.fasta M5:1d78abec37c15fe29a275eb08d5af236
                            chr1:69089-70010
                            chr1:367657-368599
                            chr1:621094-622036
                            chr1:861320-861395
                            chr1:865533-865718

                            test4 result:
                            ##### ERROR MESSAGE: File associated with name test4.list is malformed: Interval file could not be parsed in any supported format. caused by Failed to parse Genome Location string: @HD VN:1.0 SO:unsorted
                            ---------------------------------------------------------------------------------------
                            test5.list

                            chr1 69089 70010 + A
                            chr1 367657 368599 + B
                            chr1 621094 622036 + C
                            chr1 861320 861395 + D
                            chr1 865533 865718 + E

                            test5 result:
                            ##### ERROR MESSAGE: Badly formed genome loc: Contig 'chr1 69089 70010 + A' does not match any contig in the GATK sequence dictionary derived from the reference; are you sure you are using the correct reference fasta file?
                            ---------------------------------------------------------------------------------------
                            test6.list

                            chr1:69089-70010 + target1
                            chr1:367657-368599 + target2
                            chr1:621094-622036 + target3
                            chr1:861320-861395 + target4
                            chr1:865533-865718 + target5

                            test6 result:
                            ##### ERROR MESSAGE: File associated with name test6.list is malformed: Interval file could not be parsed in any supported format. caused by Failed to parse Genome Location string: chr1:69089-70010 + target1
                            ---------------------------------------------------------------------------------------
                            In test2, the format <chr>:<start>-<end> worked well, as rbagnall mentioned:
                            Originally posted by rbagnall
                            GATK -L interval files do not need a header. They should be this format:

                            chr1:1000-1200
                            chr1:2004-2507
                            chr2:457290-457400

                            etc...
                            However, I would prefer to use the format <chr> <start> <end> if possible, for several reasons.
                            Overall, I am not sure why test1 or test3 would not work.
                            Sorry for the long post.
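A possible explanation for test3, which I have not verified against GATK 3.3: as far as I understand the Picard-style interval list, the body after the SAM-style header is tab-delimited with five columns (contig, start, end, strand, name), and the header fields are tab-delimited too. A file pasted with spaces or with only three columns would then not parse as that format. The sketch below builds such rows. The function names and the default `target_N` names are hypothetical, and `header_text` is assumed to be copied unchanged from the reference's `.dict` file.

```python
def to_picard_row(line, index):
    """Build a tab-delimited 'contig start end strand name' row.

    Missing strand defaults to '+', missing name to 'target_<index>'.
    """
    fields = line.split()  # accepts spaces or tabs in the input
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 columns, got: {line!r}")
    contig, start, end = fields[:3]
    strand = fields[3] if len(fields) > 3 else "+"
    name = fields[4] if len(fields) > 4 else f"target_{index}"
    return "\t".join([contig, start, end, strand, name])


def build_interval_list(header_text, lines):
    """Join an existing tab-delimited header with the converted rows."""
    rows = [to_picard_row(line, i) for i, line in enumerate(lines, start=1)]
    return header_text.rstrip("\n") + "\n" + "\n".join(rows) + "\n"


if __name__ == "__main__":
    header = "@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:chr1\tLN:249250621\n"
    body = ["chr1 69089 70010", "chr1 367657 368599 + B"]
    print(build_interval_list(header, body), end="")
```

If the result is saved with a `.interval_list` suffix and still fails, check the file for stray spaces with something like `cat -A` before trying other changes.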
                            Last edited by QazSeDc; 10-07-2016, 12:17 AM.
