  • How to get deletion count from bam file in IGV?

    I am trying to get a count of all bases, insertions and deletions at each individual nucleotide position from a BAM file viewed in the IGV genome viewer. I can get the counts for the bases as a wig file (example shown below):

    track type=wiggle_0
    #Columns: Pos, Combined Strands A, Combined Strands C, Combined Strands G, Combined Strands T, Combined Strands N
    variableStep chrom=E17F span=1
    1 0 241 0 0 0
    2 337 0 0 0 0
    3 414 0 1 1 0
    4 6 412 8 5 0
    5 4 411 30 5 0
    6 475 6 2 0 0
    7 9 1 417 69 0

    However, I also really want to get the count of deletions for each position. It is possible to get this count if you go through each position individually and write down the numbers (as shown in the image). But I was thinking surely there must be a way to do this in IGV? I have tried igvtools but it still doesn't give me a deletion count. Any guidance or other suggestions will be greatly appreciated!

    Thanks
    -T

  • #2
    I'd be surprised if IGV included a way to do that, since it would almost never be used. You could either parse the output of "samtools mpileup" or just directly use a variant caller, as applicable.
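As a rough illustration of what parsing the mpileup output could look like, here is a minimal sketch in Python. It assumes the standard six-column text pileup (chromosome, position, reference base, depth, read bases, base qualities) produced when a reference is supplied with -f. The function names `count_pileup_bases` and `parse_pileup_line` are made up for this example and are not part of samtools. In the read-bases column, `.` and `,` mean a match to the reference, `*` (or `#` in newer versions) marks a base deleted in that read, and `+<n><seq>` marks an insertion following the position.

```python
from collections import Counter


def count_pileup_bases(ref_base, bases):
    """Count A/C/G/T/N, insertions (INS) and deletions (DEL) in one
    read-bases string from a samtools mpileup line."""
    counts = Counter()
    i, n = 0, len(bases)
    while i < n:
        ch = bases[i]
        if ch == "^":
            # Read start marker, followed by one mapping-quality character.
            i += 2
            continue
        if ch == "$":
            # Read end marker.
            i += 1
            continue
        if ch in "+-":
            # Indel announcement: sign, length digits, then the sequence.
            j = i + 1
            while j < n and bases[j].isdigit():
                j += 1
            length = int(bases[i + 1:j])
            if ch == "+":
                counts["INS"] += 1
            # "-" announces a deletion starting at the NEXT position; the
            # deleted positions themselves show up as "*" and are counted there.
            i = j + length
            continue
        if ch in "*#":
            counts["DEL"] += 1
        elif ch in ".,":
            counts[ref_base.upper()] += 1
        elif ch.upper() in "ACGTN":
            counts[ch.upper()] += 1
        # Anything else (for example ">" or "<" reference skips) is ignored.
        i += 1
    return counts


def parse_pileup_line(line):
    """Return (chrom, pos, counts) for one tab-separated mpileup line."""
    fields = line.rstrip("\n").split("\t")
    chrom, pos, ref, depth = fields[0], int(fields[1]), fields[2], int(fields[3])
    # Zero-depth lines carry a "*" placeholder, which is not a deletion.
    counts = count_pileup_bases(ref, fields[4]) if depth > 0 else Counter()
    return chrom, pos, counts
```

Looping `parse_pileup_line` over the lines of the mpileup output file would give one row of counts per position, comparable to the wig table above but with INS and DEL columns added. Note that mpileup applies its own base-quality and read filters, so the numbers may not match IGV's exactly.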


    • #3
      Originally posted by dpryan
      I'd be surprised if IGV included a way to do that, since it would almost never be used. You could either parse the output of "samtools mpileup" or just directly use a variant caller, as applicable.
      Thanks for the suggestions! I'm not very experienced in bioinformatics. Would you be able to clarify what you mean by these? I'm not sure how I would parse the output of mpileup. This is what I am trying, but I am unsure which options to use to get the output I want:

      $ samtools mpileup -f reference.fasta -Q 13 input.sorted.bam -o output.txt

      Also, does directly using a variant caller mean calling variants from my BAM file? What sort of output can I get from that?

      Sorry for all the questions and thank you for your suggestions, I really appreciate it!!


      • #4
        Maybe we should go about this in a different way. What is your biological goal? That is, what is the biological question that you're trying to answer by doing this?


        • #5
          I have a mutant sample with a deletion, and I want to see whether the deletion count in this region is significantly higher than in the wild type. I'm using a new genome sequencer, so I want to see whether it can detect deletions. I want the counts (A, C, G, T, insertion, deletion) for every position so I can tell whether the detected deletion is significant or whether the sequencing device just has a high error rate. Hope that makes more sense!


          • #6
            Right, so you want to call variants between the mutant and wild-type samples, paying attention to only a small region. Just use a variant caller and ensure that it finds the deletion in the mutant but not in the wild-type sample. I'd recommend GATK's HaplotypeCaller, since it enables joint variant calling. If you really wanted, you could compare the genotype likelihoods between the samples (they are among the per-sample fields in the VCF file produced). This assumes that the deletion isn't very large; if it is, other tools targeted specifically toward finding large deletions might give better results.
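To make the "find the deletion in the mutant but not the wild type" check concrete, here is a minimal sketch of scanning a jointly called, uncompressed VCF for deletion records in a region. It relies only on the standard VCF column layout (CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT, then one column per sample). The function name `deletions_in_region` is hypothetical, and which per-sample keys appear (GT, GQ, PL and so on) depends on the caller, so treat those as assumptions to verify in your own file.

```python
def deletions_in_region(vcf_lines, chrom, start, end):
    """Return (pos, ref, alt, per_sample) for records in chrom:start-end
    where at least one ALT allele is shorter than REF (a deletion).
    per_sample maps each sample name to a dict of its FORMAT fields."""
    samples = []
    hits = []
    for line in vcf_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information headers
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        rec_chrom, pos, ref = fields[0], int(fields[1]), fields[3]
        if rec_chrom != chrom or not (start <= pos <= end):
            continue
        alts = fields[4].split(",")
        # Skip missing (.), spanning (*) and symbolic (<...>) alleles.
        is_deletion = any(
            len(alt) < len(ref) and alt not in (".", "*") and not alt.startswith("<")
            for alt in alts
        )
        if not is_deletion:
            continue
        keys = fields[8].split(":")
        per_sample = {
            name: dict(zip(keys, value.split(":")))
            for name, value in zip(samples, fields[9:])
        }
        hits.append((pos, ref, fields[4], per_sample))
    return hits
```

Passing an open file handle as `vcf_lines` and then printing each sample's GT (and GQ or PL, if present) for every hit should show at a glance whether the deletion is called in the mutant only.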


            • #7
              Thank you for your help, I will try this!
