
  • GATK complains that my bam file isn't indexed

    I am running GATK's RealignerTargetCreator with this command:

    java -Xmx36g -jar GenomeAnalysisTK.jar -S LENIENT -T RealignerTargetCreator -R human_g1k_v37.fasta -o SRR098359.interval_list -I SRR098359.bam -B:snps,VCF 00-All.vcf

    The process quits with an error that includes this:

    Cannot process the provided BAM file(s) because they were not indexed.

    However, the bam file WAS indexed. I see the .bai file there. I recreated the index with the following command (in case something had gone wrong creating it):

    samtools index SRR098359_sorted.bam

    It created an identical .bai file and I ran RealignerTargetCreator again, and the same thing happened. Does anyone know what I'm doing wrong?

    Thank you.

    Eric

  • #2
    From a quick glance at what GATK requires, it seems your BAM file needs to be coordinate-sorted before the .bai index is created. Have a look at Picard tools.



    • #3
      Originally posted by cedance:
      "From a quick glance at what GATK requires, it seems your BAM file needs to be coordinate-sorted before the .bai index is created. Have a look at Picard tools."

      Hi cedance,

      Thanks for the suggestion, but I don't think this is my problem. My previous command coordinate-sorted them:

      java -jar /home/efoss/sequencing/picard-tools-1.52/SortSam.jar VALIDATION_STRINGENCY=LENIENT INPUT=SRR098359.bam OUTPUT=SRR098359_sorted.bam SORT_ORDER=coordinate

      Eric



      • #4
        One last thing I can think of: the documentation says the input is one or more aligned BAM files. After you mapped the reads to your reference with the software of your choice, did you keep only the aligned reads? Maybe try Picard's "ViewSam" with ALIGNMENT_STATUS=aligned to extract the aligned reads from the BAM file, then sort and index the result. I would use Picard tools for every operation instead of samtools. Sorry I couldn't be of more help, but I guess this is worth a try.



        • #5
          Maybe a typo, but why are you not using the SRR098359_sorted.bam file when you call GATK? Your command says you are using the unsorted BAM file.
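
For anyone following along, the fix amounts to passing the sorted, indexed file to -I. A minimal sketch, assuming every other flag from the first post stays exactly as it was; `realigner_cmd` is a hypothetical helper that only prints the command so it can be checked before it is run:

```shell
# realigner_cmd is a hypothetical helper (not part of GATK).
# It prints the RealignerTargetCreator call from the first post,
# with the BAM given as $1 substituted after -I. Nothing is executed.
realigner_cmd() {
    printf '%s\n' "java -Xmx36g -jar GenomeAnalysisTK.jar -S LENIENT -T RealignerTargetCreator -R human_g1k_v37.fasta -o SRR098359.interval_list -I $1 -B:snps,VCF 00-All.vcf"
}

# The file that was actually sorted and indexed in the posts above.
realigner_cmd SRR098359_sorted.bam
```

The only change from the original command is the argument to -I, which now names the same file that `samtools index` was run on.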



          • #6
            Hi maubp,

            THANK YOU, THANK YOU, THANK YOU!!!!!!!!! I stared at that so long without seeing my mistake. I feel very stupid, but also very grateful that you caught it.

            Best wishes,

            Eric



            • #7


              Happy to help.



              • #8
                I get two different error messages when I run GATK.

                If I use the output of Picard MarkDuplicates, I get an error about an unindexed BAM file, even though the BAM file is already indexed: it was generated by Picard SortSam before I invoked MarkDuplicates, and the .bai file exists too.

                If I use the output of Picard SortSam directly, I get:
                ERROR MESSAGE: Bad input: We encountered a non-standard non-IUPAC base in the provided reference: '10'

                What would you advise?

                Thanks,

                Carol
                -----------------------------------
                java -jar SortSam.jar SO=coordinate INPUT=~/NGS/data/SRR062641.filt.sam OUTPUT=~/NGS/data/SRR062641.filt.bam VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true

                - no error is generated

                ~/NGS/pgm/GenomeAnalysisTK-2.4-9-g532efad$ java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R /home/carolw/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -o ~/NGS/data/SRR062641.filt.bam.list -I ~/NGS/data/SRR062641.filt.bam

                ERROR MESSAGE: Bad input: We encountered a non-standard non-IUPAC base in the provided reference: '10'

                -----------------------------------------------------------
                java -jar MarkDuplicates.jar INPUT=~/NGS/data/SRR062641.filt.bam OUTPUT=~/NGS/data/SRR062641.filt.marked.bam METRICS_FILE=metrics VALIDATION_STRINGENCY=LENIENT CREATE_INDEX=true

                - no error is generated

                java -jar GenomeAnalysisTK.jar -T RealignerTargetCreator -R /home/carolw/NGS/hg19/Homo_sapiens/UCSC/hg19/Sequence/WholeGenomeFasta/genome.fa -o ~/NGS/data/SRR062641.filt.bam.list -I ~/NGS/data/SRR062641.filt.marked.bam

                ERROR MESSAGE: Invalid command line: Cannot process the provided BAM file(s) because they were not indexed. The GATK does offer limited processing of unindexed BAMs in --unsafe mode, but this GATK feature is currently unsupported.



                • #9
                  Hi CarolW,

                  Sorry - I don't know what to suggest other than to look very carefully at the name of the index file compared to the name of the bam file.
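
A rough way to do that comparison from the shell is sketched below. It rests on two assumptions: that `samtools index` writes NAME.bam.bai, and that Picard's CREATE_INDEX=true writes NAME.bai. `find_bam_index` is a hypothetical helper, not part of samtools, Picard, or GATK, and it only checks that an index file exists beside the BAM, not that the index is up to date.

```shell
# find_bam_index is a hypothetical helper, not part of any of the tools above.
# It prints the first index file found beside the given BAM, or returns 1
# with a message on stderr if neither expected name exists.
find_bam_index() {
    bam=$1
    # Assumed naming: NAME.bam.bai (samtools index) or NAME.bai (Picard CREATE_INDEX=true).
    for candidate in "$bam.bai" "${bam%.bam}.bai"; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    echo "no index found for $bam" >&2
    return 1
}

# Check the exact file name that will be passed to -I.
find_bam_index SRR062641.filt.marked.bam || echo "index this file before running GATK"
```

Running the check on the exact name passed to -I would also have caught the mix-up earlier in this thread, where the index belonged to SRR098359_sorted.bam but the command named SRR098359.bam.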

                  Good luck.

                  Eric



                  • #10
                    I ran "samtools index bamfile", which created a .bam.bai file, then ran GATK again and it succeeded. Thanks a lot.
