  • Merged bam and unified genotyper

    Dear all,
    I have merged my six BAM files (one sample run in six lanes) using Picard and used the GATK pipeline for tuning the BAM file and base recalibration. After these preliminary steps I called variants with UnifiedGenotyper. I have not yet run the variant quality score recalibration pipeline.
    I compared the result with the output from running the same steps on a single BAM file.
    I am surprised to see that the merged BAM gives a much larger number of variants than the single BAM file. Am I doing this right, or is this because of an error?
    E.g. in the single file chr1 has 2472 variants, while in the merged file the number is 21853.
    Just to mention: all six BAM files are from a single sample run in six different lanes.
    Any help will be highly appreciated.

  • #2
    Did you check whether GATK is treating the BAM files as coming from a single sample, or as 6 different samples?
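    One way to check, as a minimal sketch: dump the header (for example with samtools view -H on the merged BAM) and look at the SM tag on every @RG line. GATK takes its sample names from SM, so more than one distinct value means more than one sample column in the VCF. The Python below assumes the header text is tab-separated, as the SAM format specifies; the function name read_group_samples is made up for illustration, and the example header is a shortened version of the read groups in this thread.

```python
def read_group_samples(header_text):
    """Map each @RG ID to its SM (sample) value."""
    samples = {}
    for line in header_text.splitlines():
        if not line.startswith("@RG"):
            continue
        # SAM header fields are tab-separated TAG:VALUE pairs.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        samples[fields.get("ID")] = fields.get("SM")
    return samples


# Shortened example header (illustrative, not a complete BAM header).
header = (
    "@HD\tVN:1.4\tSO:coordinate\n"
    "@RG\tID:XP3_NoUVA\tPU:1_1\tSM:Free_Exomes_april2013_lane1\n"
    "@RG\tID:XP3_NoUVA.1\tPU:1_2\tSM:Free_Exomes_april2013_lane2\n"
)

by_group = read_group_samples(header)
distinct = sorted(set(by_group.values()))
print(len(distinct), "distinct sample name(s):", distinct)
```

    If this reports more than one sample name for what should be a single sample, the read groups need fixing before variant calling.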


    • #3
      How can I check that?
      From what I can understand from the VCF output, it is taking each lane as a separate sample.

      Please suggest where I am going wrong.
      Last edited by huma Asif; 12-22-2013, 08:36 AM.


      • #4
        Originally posted by huma Asif
        please suggest where i am wrong
        When you created your individual read groups you defined each one as coming from a different sample.
        @RG ID:XP3_NoUVA PL:SOLID PU:1_1 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane1
        @RG ID:XP3_NoUVA.1 PL:SOLID PU:1_2 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane2
        @RG ID:XP3_NoUVA.2 PL:SOLID PU:1_3 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane3
        @RG ID:XP3_NoUVA.3 PL:SOLID PU:1_4 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane4
        @RG ID:XP3_NoUVA.4 PL:SOLID PU:1_5 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane5
        @RG ID:XP3_NoUVA.5 PL:SOLID PU:1_6 LB:XP3_NoUVA PI:0 DS:75x35RR DT:2013-04-12T08:00:07-0300 SM:Free_Exomes_april2013_lane6
        If it is indeed the same sample run across several lanes then the sample definition for all read groups should be identical, e.g. SM:Free_Exomes_april2013. The lane identification is already included in the Platform Unit [PU:] tag for the read group.
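        As a rough sketch of that change, assuming the header has been dumped to text and is tab-separated as in the SAM format, the SM tag of every @RG line can be rewritten to one shared sample name while ID, PU and the other tags stay untouched. The function name unify_sample and the two-line example header are illustrative only.

```python
def unify_sample(header_text, sample):
    """Return header_text with the SM tag of every @RG line set to `sample`."""
    out = []
    for line in header_text.splitlines():
        if line.startswith("@RG"):
            fields = line.split("\t")
            # Replace only the SM field; ID, PU, LB etc. are left as they are.
            fields = [f"SM:{sample}" if f.startswith("SM:") else f for f in fields]
            line = "\t".join(fields)
        out.append(line)
    return "\n".join(out) + "\n"


# Two of the read groups from this thread, shortened for the example.
old_header = (
    "@RG\tID:XP3_NoUVA\tPL:SOLID\tPU:1_1\tSM:Free_Exomes_april2013_lane1\n"
    "@RG\tID:XP3_NoUVA.1\tPL:SOLID\tPU:1_2\tSM:Free_Exomes_april2013_lane2\n"
)
new_header = unify_sample(old_header, "Free_Exomes_april2013")
print(new_header, end="")
```

        The read group IDs remain distinct, so per-lane information is kept while all reads are assigned to a single sample.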


        • #5
          If redoing the merge step is too computationally expensive, you could instead edit the BAM header and use samtools reheader to attach the new header to the BAM before variant calling.
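          A hedged sketch of that route in Python, assuming samtools is installed and on the PATH. It relies on two samtools commands: "samtools view -H in.bam" prints the header, and "samtools reheader header.sam in.bam" writes the reheadered BAM to standard output. The helper names and any file names passed to them (merged.bam, new_header.sam and so on) are hypothetical.

```python
import subprocess


def dump_header_argv(in_bam):
    """Command that prints the header of in_bam as text."""
    return ["samtools", "view", "-H", in_bam]


def reheader_argv(header_sam, in_bam):
    """Command that writes in_bam with the header from header_sam to stdout."""
    return ["samtools", "reheader", header_sam, in_bam]


def dump_header(in_bam, header_sam):
    """Save the current header to header_sam so it can be edited."""
    with open(header_sam, "w") as fh:
        subprocess.run(dump_header_argv(in_bam), stdout=fh, check=True)


def apply_header(header_sam, in_bam, out_bam):
    """Write a copy of in_bam that carries the edited header."""
    with open(out_bam, "wb") as fh:
        subprocess.run(reheader_argv(header_sam, in_bam), stdout=fh, check=True)


# Intended use (not executed here, since it needs samtools and real files):
#   dump_header("merged.bam", "old_header.sam")
#   ... edit the SM tags and save the result as new_header.sam ...
#   apply_header("new_header.sam", "merged.bam", "merged.fixed.bam")
```

          Only the header is replaced, so the alignments themselves do not have to be merged or processed again.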


          • #6
            Thank you guys for your help.
            It fixed the problem.
