  • Multi sample SNP calling

    Hi all,

    I have two BAM files, bam1 and bam2, with two different read groups. I would like GATK to treat them as a single sample for SNP calling. This can be done by giving both BAM files the same read group name and calling SNPs on the data pooled into a single BAM file.

    But here is what I am interested in: let's say a SNP is covered by 30 reads. I would like to find out how many of those reads came from bam1 and how many from bam2.

    How can we distinguish between the reads from the two BAM files after merging them into a single BAM file with the same read group name?

    Is there a way or a tool to achieve this? Any suggestions are welcome!

  • #2
    Once you know the SNP location, you could copy it to a separate file and use bedtools' intersect command between that location and each of your BAM files. If the SNP output is in VCF format, you just need to make a new file containing only the row for your SNP. The bedtools command might go like this:

    Code:
    bedtools intersect -wa -bed -abam bam1.bam -b snp.vcf > bam1hits.bed
    If you leave out the -bed option it will produce a bam file in case you'd rather keep that format for any downstream analysis.
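To turn the per-file hit lists into the counts asked about, the lines in each BED output can simply be counted. The following is a minimal sketch: `count_hits` is a hypothetical helper name, and `bam2hits.bed` is assumed to come from running the same bedtools command on the second BAM file.

```shell
# Hypothetical helper: print each BED file name and its number of records.
# Each record in the bedtools output is one alignment overlapping the SNP.
count_hits() {
  for f in "$@"; do
    printf '%s\t%s\n' "$f" "$(awk 'END { print NR }' "$f")"
  done
}

# Assumed usage, after running the intersect command on both BAM files:
#   count_hits bam1hits.bed bam2hits.bed
```

Note that this counts every alignment overlapping the position, whether or not it supports the variant allele.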
    /* Shawn Driscoll, Gene Expression Laboratory, Pfaff
    Salk Institute for Biological Studies, La Jolla, CA, USA */


    • #3
      Each @RG line in the BAM header has separate tags for the read group ID (ID) and the sample name (SM). You could extract the header from each BAM file using samtools view -H, write a one-liner that sets the same sample name in both while keeping the read group IDs different, and use samtools reheader to put the modified header back into the corresponding BAM file.
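The header edit described above could be sketched as follows. This is only an illustration: `set_sample_name` is a hypothetical helper, `pooled` and the file names are placeholders, and the header is assumed to be tab-delimited SAM text with an `SM:` tag already present on each `@RG` line.

```shell
# Hypothetical helper: read a SAM header on stdin and set the SM tag of every
# @RG line to the given sample name. Read group IDs (ID:) are left untouched.
set_sample_name() {
  awk -v sm="$1" 'BEGIN { FS = OFS = "\t" }
    /^@RG/ { for (i = 2; i <= NF; i++) if ($i ~ /^SM:/) $i = "SM:" sm }
    { print }'
}

# Assumed usage (requires samtools; file names are placeholders):
#   samtools view -H bam1.bam | set_sample_name pooled > bam1.header.sam
#   samtools reheader bam1.header.sam bam1.bam > bam1.pooled.bam
```

Because only the SM tag changes, every read keeps its original RG tag, so reads from the two files remain distinguishable after merging.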


      • #4
        Thank you both for the suggestions. I have figured out a way using GATK multi-sample SNP calling, which is very straightforward. Variant calling is performed with GATK on both BAM files in a single step, which produces a single VCF output file. The output file lists all detected variants with the total depth at each variant, and alongside that there are separate fields for each BAM file giving the number of reads that came from that file.
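As a rough illustration of reading those per-file fields, the sketch below pulls the per-sample depth out of a multi-sample VCF. It assumes each BAM file carries its own sample name (so each gets its own column) and that the FORMAT column includes a `DP` key. `per_sample_depth` is a hypothetical helper, not part of GATK.

```shell
# Hypothetical helper: read a multi-sample VCF on stdin and print
# CHROM, POS, sample name and that sample's DP value for every variant.
per_sample_depth() {
  awk 'BEGIN { FS = OFS = "\t" }
    /^##/ { next }                                  # skip meta-information lines
    /^#CHROM/ { for (i = 10; i <= NF; i++) name[i] = $i; next }
    {
      n = split($9, key, ":"); dp = 0               # locate DP in the FORMAT column
      for (k = 1; k <= n; k++) if (key[k] == "DP") dp = k
      if (!dp) next                                 # no per-sample depth recorded
      for (i = 10; i <= NF; i++) {
        split($i, val, ":")
        print $1, $2, name[i], val[dp]
      }
    }'
}

# Assumed usage (file name is a placeholder):
#   per_sample_depth < output.vcf
```

For a site where the two sample columns report depths of 18 and 12, this prints one line per sample, and the two values together account for the 30 reads covering the SNP.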
