  • Joker!sAce
    Member
    • Feb 2011
    • 21

    Understanding BAM format.

    Hi,

    I have this output in BAM format.

    NA06984-SRR006041.1145152 1040 1 113040605 57 325M * 0 0 TTGATCACTTCACACACATCTTCATCGATGAGGCTGGCCA
    CTGCATGGAGCCTGAGAGTCTGGTAGCTATAGCAGGTGAGGGACTCAGGTGGGGCTGCAGGTATACACCCTGTGTGGGTCAGAGAGGTTGCACCACTTACCTTTCTTCCCACACCTCTTCTGCTTCCCAGGGCTGATGGAAGTA
    AAGGAAACAGGTGATCCAGGAGGGCAGCTGGTGCTGGCAGGAGACCCTCGGCAGCTGGGGCCTGTGCTGCGTTCCCCACTGACCCAGAAGCATGGACTGGGATACTCACTGCTGGAGCGGCTGCTCACCTACAACTCCCTG 7
    99::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::88:::::::::::;;;;;;;;;;;;;::888:;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;888;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;:9::;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;::::;;;;;;;;;;;;;;;;;;;;;;;;;;;;
    ;;;;;;;;;;;::::::::::::::::::::::::: RG:Z:SRR006041 NM:i:0
    (This is data from the 1000 Genomes Project.)

    I'm constructing a pipeline to study variations: I get FASTQ sequence, index it, align it to the hg18 reference sequence, do a couple of format conversions to get BAM, call indels and SNPs, add them to a database, call larger variations, check whether they've been reported before, produce graphs and charts, display the alignment, and submit a report.

    I'm learning about the BWA aligner and the BAM format right now. I'm using pilot data on unaligned sequences from the 1000 Genomes Project (because I will have similar BAM outputs).

    I have to study and make sense of this BAM format. I've read this tutorial on understanding the SAM/BAM format, but it was of little help. Could someone give me further pointers?

    Thanks a lot!
    Joker!sAce
    Last edited by Joker!sAce; 02-28-2011, 07:15 AM.
  • nilshomer
    Nils Homer
    • Nov 2008
    • 1283

    #2
    What specific questions about the format do you have?

    • Joker!sAce
      Member
      • Feb 2011
      • 21

      #3
      I understand that there are a lot of columns in this record.

      NA06984-SRR006041.1145152
      1040
      1
      113040605
      57
      325M
      *
      0
      0
      TTGATCACTTCACACACATCTTCATCGATGAGGCTGGCCACTGCATGGAGCCTGAGAGTCTGGTAGCTATAGCAGGTGAGGGACTCAGGTGGGGCTGCAGGTATACACCCTGTGTGGGTCAGAGAGGTTGCACCACTTACCTTTCTTCCCACACCTCTTCTGCTTCCCAGGGCTGATGGAAGTAAAGGAAACAGGTGATCCAGGAGGGCAGCTGGTGCTGGCAGGAGACCCTCGGCAGCTGGGGCCTGTGCTGCGTTCCCCACTGACCCAGAAGCATGGACTGGGATACTCACTGCTGGAGCGGCTGCTCACCTACAACTCCCTG
      799::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::88:::::::::::;;;;;;;;;;;;;::888:;;;;;;;;;;;;;;;;;;;;;;;;;;;;888;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;:9::;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;::::;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;:::::::::::::::::::::::::
      RG:Z:SRR006041
      NM:i:0

      I'd like to know what they mean. I have a faint idea, but I'd like to know for sure.
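For reference, the SAM specification defines 11 mandatory columns (QNAME, FLAG, RNAME, POS, MAPQ, CIGAR, RNEXT, PNEXT, TLEN, SEQ, QUAL) followed by optional TAG:TYPE:VALUE fields such as RG and NM. Read that way, the record above is a read named NA06984-SRR006041.1145152 with flag 1040, aligned to reference "1" at 1-based position 113040605 with mapping quality 57 and CIGAR 325M, and no mate information (`*`, 0, 0). A minimal Python sketch of that decoding follows. The function names are made up for illustration, the example line is deliberately shortened to 10 bases (so its CIGAR is 10M rather than 325M), and it assumes tab-separated input and Phred+33 qualities as the specification describes.

```python
# Names of the 11 mandatory SAM columns, in order (per the SAM specification).
SAM_FIELDS = ["QNAME", "FLAG", "RNAME", "POS", "MAPQ", "CIGAR",
              "RNEXT", "PNEXT", "TLEN", "SEQ", "QUAL"]

# Meaning of each bit in the FLAG column (per the SAM specification).
FLAG_BITS = {
    0x1: "read paired",
    0x2: "read mapped in proper pair",
    0x4: "read unmapped",
    0x8: "mate unmapped",
    0x10: "read on reverse strand",
    0x20: "mate on reverse strand",
    0x40: "first in pair",
    0x80: "second in pair",
    0x100: "secondary alignment",
    0x200: "fails quality checks",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def parse_sam_line(line):
    """Split one tab-separated SAM alignment line into named fields."""
    cols = line.rstrip("\n").split("\t")
    record = dict(zip(SAM_FIELDS, cols[:11]))
    # These mandatory columns are integers.
    for name in ("FLAG", "POS", "MAPQ", "PNEXT", "TLEN"):
        record[name] = int(record[name])
    # Optional fields look like TAG:TYPE:VALUE, e.g. NM:i:0.
    tags = {}
    for field in cols[11:]:
        tag, typ, value = field.split(":", 2)
        tags[tag] = int(value) if typ == "i" else value
    record["TAGS"] = tags
    return record


def decode_flag(flag):
    """List the descriptions of every bit set in a SAM FLAG value."""
    return [desc for bit, desc in FLAG_BITS.items() if flag & bit]


def phred_scores(qual, offset=33):
    """Convert a QUAL string to integer Phred scores (assumes Phred+33)."""
    return [ord(c) - offset for c in qual]


# Shortened version of the record in this thread (first 10 bases only).
line = "\t".join([
    "NA06984-SRR006041.1145152", "1040", "1", "113040605", "57", "10M",
    "*", "0", "0", "TTGATCACTT", "799:::::::", "RG:Z:SRR006041", "NM:i:0",
])
rec = parse_sam_line(line)
print(decode_flag(rec["FLAG"]))  # ['read on reverse strand', 'PCR or optical duplicate']
print(phred_scores(rec["QUAL"])[:3])  # [22, 24, 24]
print(rec["TAGS"])  # {'RG': 'SRR006041', 'NM': 0}
```

So flag 1040 (1024 + 16) marks a reverse-strand read flagged as a duplicate, and NM:i:0 means the read matches the reference with an edit distance of zero.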

      • krobison
        Senior Member
        • Nov 2007
        • 734

        #4
        You'll get much better answers if you post specific questions whose answers can't easily be found in the SAM format documentation.

        • Joker!sAce
          Member
          • Feb 2011
          • 21

          #5
          My work involves a divergence study of the p53 gene on the short arm of chromosome 17. I need to extract this part of the sequence.

          I understand that I can do this in two ways:
          1. Get raw fasta reads.
          2. Extract from the aligned (to hg18) data (in BAM format).

          How do I do the second one?

          • krobison
            Senior Member
            • Nov 2007
            • 734

            #6
            If you know the chromosomal coordinates for your gene (which you can find in the UCSC files or via the browser), then SAMtools can extract this region efficiently.

            • Joker!sAce
              Member
              • Feb 2011
              • 21

              #7
              This sequence has been aligned to hg18. I know the chromosomal coordinates for hg18 (chr17:7,520,037-7,531,588; that's the TP53 tumor suppressor gene).

              How do I proceed from here?

              • krobison
                Senior Member
                • Nov 2007
                • 734

                #8
                samtools view aligned.bam chr17:7520037-7531588 > tp53.sam
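Two practical notes on the command above. A region query with `samtools view` needs a coordinate-sorted BAM with an index next to it (created with `samtools index aligned.bam`). The chromosome name in the region also has to match the names in the BAM header: the record posted earlier uses `1` rather than `chr1`, so for that file the region would probably need to be written as `17:7520037-7531588`.

To show what "overlaps the region" means, here is a rough pure-Python sketch of the same selection applied to SAM text lines. It is only an illustration of the overlap test, not how samtools implements it, and the helper names and the short example record are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_region(region):
    """Turn 'chr17:7,520,037-7,531,588' into ('chr17', 7520037, 7531588)."""
    chrom, span = region.replace(",", "").split(":")
    start, end = (int(x) for x in span.split("-"))
    return chrom, start, end


def reference_end(pos, cigar):
    """Last 1-based reference position covered by an alignment.

    Only M, D, N, = and X consume reference bases.
    """
    ref_len = sum(int(n) for n, op in CIGAR_OP.findall(cigar) if op in "MDN=X")
    return pos + ref_len - 1


def overlaps(sam_line, chrom, start, end):
    """True if the alignment on this SAM line overlaps chrom:start-end."""
    cols = sam_line.rstrip("\n").split("\t")
    rname, pos, cigar = cols[2], int(cols[3]), cols[5]
    if rname != chrom or cigar == "*":
        return False  # different reference, or no alignment to test
    return pos <= end and reference_end(pos, cigar) >= start


chrom, start, end = parse_region("chr17:7,520,037-7,531,588")

# Hypothetical 10-base read starting 5 bases before the region.
read = "\t".join(["read1", "0", "chr17", "7520032", "60", "10M",
                  "*", "0", "0", "ACGTACGTAC", "IIIIIIIIII"])
print(overlaps(read, chrom, start, end))  # True
```

Scanning every line this way is slow on a whole-genome file, which is why the indexed `samtools view` query is the usual route.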
