  • mate alignment error when running ValidateSamFile in Picard

    Dear All
    I am trying to identify SNPs using GATK. The manual suggests validating BAM files before they enter the analysis pipeline.

    My samples are 100-nucleotide paired reads (first 3 sites trimmed).

    I generated BAM files for each sample using TopHat 1.33 and sorted each BAM file (one file per sample) using Picard ReorderSam.jar.

    After that, I added read group information using Picard AddOrReplaceReadGroups.jar.

    Then I ran ValidateSamFile.jar to check whether these BAM files are suitable for the GATK pipeline. However, I got a lot of errors like this:

    ERROR: Record 21368, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate alignment does not match alignment start of mate
    ERROR: Record 21368, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate negative strand flag does not match read negative strand flag of mate
    ERROR: Record 21377, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate alignment does not match alignment start of mate
    ERROR: Record 21377, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate negative strand flag does not match read negative strand flag of mate
    ERROR: Record 21477, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate alignment does not match alignment start of mate
    ERROR: Record 21477, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate negative strand flag does not match read negative strand flag of mate
    ERROR: Record 21481, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate alignment does not match alignment start of mate
    ERROR: Record 21481, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate reference index (MRNM) does not match reference index of mate
    ERROR: Record 21490, Read name HWI-ST978:1370AHMACXX:5:2108:3190:30134, Mate alignment does not match alignment start of mate
    ERROR: Record 21490, Read name HWI-ST978:1370AHMACXX:5:2108:3190:30134, Mate negative strand flag does not match read negative strand flag of mate
    Can anyone tell me where the problem is? Why are so many unmatched pairs identified at this step? Should I delete them before the analysis?

    Thanks a lot.
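
With this many messages, it may help to summarize the ValidateSamFile output before deciding what to do with the affected reads. The following is a minimal Python sketch, assuming every per-record message has the same `ERROR: Record N, Read name X, message` layout as the lines quoted above. The function name `tally_validation_errors` and the variable names are made up for illustration and are not part of Picard.

```python
import re
from collections import Counter

# Layout assumed from the ERROR lines quoted in the post:
# "ERROR: Record <number>, Read name <name>, <message>"
LINE_RE = re.compile(r"^ERROR: Record (\d+), Read name (\S+), (.+)$")


def tally_validation_errors(lines):
    """Count messages by type and collect the record numbers seen per read name."""
    by_message = Counter()
    records_by_read = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip anything that is not a per-record ERROR line
        record, read_name, message = match.groups()
        by_message[message] += 1
        records_by_read.setdefault(read_name, set()).add(int(record))
    return by_message, records_by_read


# A few of the lines from the post, used as sample input.
sample = [
    "ERROR: Record 21368, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate alignment does not match alignment start of mate",
    "ERROR: Record 21368, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate negative strand flag does not match read negative strand flag of mate",
    "ERROR: Record 21377, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate alignment does not match alignment start of mate",
    "ERROR: Record 21377, Read name HWI-ST978:1370AHMACXX:5:1308:13890:200017, Mate negative strand flag does not match read negative strand flag of mate",
    "ERROR: Record 21481, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate alignment does not match alignment start of mate",
    "ERROR: Record 21481, Read name HWI-ST978:1370AHMACXX:5:1306:2629:199972, Mate reference index (MRNM) does not match reference index of mate",
]

by_message, records_by_read = tally_validation_errors(sample)

for message, count in by_message.most_common():
    print(count, message)

# Read names that were reported at more than one record number.
for read_name, records in sorted(records_by_read.items()):
    if len(records) > 1:
        print(read_name, sorted(records))
```

In the excerpt above, the same read name is reported at two different record numbers (for example 21368 and 21377), so listing record numbers per read name shows how often that happens across the whole file.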

  • #2
    I am seeing the same errors. Does anyone have any insight into this?

    • #3
      I also encountered this problem, and I am now using picard/FixMateInformation.jar to fix it. Maybe you can try that.
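
For readers unsure what the three messages compare, here is a simplified Python sketch of the mate fields involved, based on the SAM columns RNAME, POS, FLAG, RNEXT, and PNEXT. It is an illustration only, not Picard's implementation. The `Alignment` type, the `mate_mismatches` function, and all field values in the example are hypothetical.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """The subset of SAM columns that carry mate information."""
    rname: str   # reference name of this alignment
    pos: int     # 1-based alignment start
    flag: int    # bitwise FLAG
    rnext: str   # reference name of the mate ("=" means same as rname)
    pnext: int   # alignment start of the mate


REVERSE = 0x10       # this read aligns to the reverse strand
MATE_REVERSE = 0x20  # the mate aligns to the reverse strand


def mate_mismatches(read, mate):
    """List the ways the mate fields stored on `read` disagree with `mate`."""
    problems = []
    mate_ref = read.rname if read.rnext == "=" else read.rnext
    if mate_ref != mate.rname:
        problems.append("mate reference")
    if read.pnext != mate.pos:
        problems.append("mate alignment start")
    if bool(read.flag & MATE_REVERSE) != bool(mate.flag & REVERSE):
        problems.append("mate negative strand flag")
    return problems


# Hypothetical pair whose mate fields agree with each other.
first = Alignment("chr1", 100, 99, "=", 250)
second = Alignment("chr1", 250, 147, "=", 100)
consistent = mate_mismatches(first, second)

# Hypothetical second alignment of the mate, on another reference and strand.
second_alt = Alignment("chr2", 900, 129, "chr1", 100)
inconsistent = mate_mismatches(first, second_alt)

print(consistent)
print(inconsistent)
```

The second comparison shows how one record can trigger all three messages at once when the mate fields it stores describe a different alignment of its mate than the one it is checked against.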

      • #4
        I am also seeing these errors. I tried using Picard's FixMateInformation, but it did not resolve the errors. They seem to be appearing after I use Picard's MarkDuplicates. Is anyone else having this problem/know how to fix it?

        • #5
          Originally posted by tatumdmortimer
          I am also seeing these errors. I tried using Picard's FixMateInformation, but it did not resolve the errors. They seem to be appearing after I use Picard's MarkDuplicates. Is anyone else having this problem/know how to fix it?
          I ran ValidateSamFile before MarkDuplicates, and these errors still occur.

          Has anyone solved this problem?

          • #6
            Same

            I have the exact same problem, which I can't solve...
            It may be related, but I also get an exception when running the CollectGcBiasMetrics command:

            Code:
            Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 142549452
            	at net.sf.picard.analysis.CollectGcBiasMetrics.doWork(CollectGcBiasMetrics.java:152)
            	at net.sf.picard.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:179)
            	at net.sf.picard.analysis.CollectGcBiasMetrics.main(CollectGcBiasMetrics.java:95)
