Unconfigured Ad

Collapse
X
 
  • Filter
  • Time
  • Show
Clear All
new posts
  • nilmot13
    Member
    • Jan 2011
    • 19

    Pileup from BAM or SAM

    Hi,

    Currently I'm trying to solve a simple problem for bioinformatician, but being a experimental biologist it's proving to be very tricky for me.

    At the moment I've obtained mapped reads in SAM format. I want to determine the sequencing coverage in term of number of times sequenced per base. I know I can use pileup function in samtool however it does not give strand specific information.

    Are there any way of counting the bases per strand?

    Or perhaps handling the pileup dataset in Galaxy to obtain strand specific count?

    Many help would be great thank you.
  • kmcarr
    Senior Member
    • May 2008
    • 1181

    #2
    Originally posted by nilmot13 View Post
    Hi,

    Currently I'm trying to solve a simple problem for bioinformatician, but being a experimental biologist it's proving to be very tricky for me.

    At the moment I've obtained mapped reads in SAM format. I want to determine the sequencing coverage in term of number of times sequenced per base. I know I can use pileup function in samtool however it does not give strand specific information.

    Are there any way of counting the bases per strand?

    Or perhaps handling the pileup dataset in Galaxy to obtain strand specific count?

    Many help would be great thank you.
    The better tool for this job is the BEDTools Suite, a collection of programs for manipulating genomic range/alignment data. For your particular problem the tool to use is genomeCoverageBed. You provide as input your sorted BAM file and a genome file. The genome file is simply a text file listing the name of each chromosome and its size in bp (see the documentation). The program has options to calculate either global coverage parameters or per base depth. It also has an option to calculate strand specific coverage.

    Comment

    • nilmot13
      Member
      • Jan 2011
      • 19

      #3
      Dear kmcarr,

      That's sound exactly like what I need. Thank you every so much, I'll try it out!!

      Comment

      • nilmot13
        Member
        • Jan 2011
        • 19

        #4
        Originally posted by kmcarr View Post
        The better tool for this job is the BEDTools Suite, a collection of programs for manipulating genomic range/alignment data. For your particular problem the tool to use is genomeCoverageBed. You provide as input your sorted BAM file and a genome file. The genome file is simply a text file listing the name of each chromosome and its size in bp (see the documentation). The program has options to calculate either global coverage parameters or per base depth. It also has an option to calculate strand specific coverage.
        Hi kmcarr,

        I've managed to get it to work.

        If you have the time do you mind helping me with a couple of more question with regards to BEDTools?

        I've setup test file to run the genomeCoverageBed script. Is it possible to output the report as a file? At the moment I'm using this command.

        $ genomeCoverageBed -ibam <test file.bam> -bga -strand -g <genome file.genome>

        Many thanks!

        Comment

        • nilmot13
          Member
          • Jan 2011
          • 19

          #5
          Other confusions

          I think i've worked it out:

          by doing "command as before" > filename.file

          However, the coverage information is slighly confusing?
          I was expecting count per base with strand specificity, but I think it's giving me coverage percentage with respect to entire genome?

          Comment

          • kmcarr
            Senior Member
            • May 2008
            • 1181

            #6
            When using the -strand option you must follow that with either a '+' or '-' to indicate which strand to report. The output data will be limited to coverage on the strand you indicated so you will need to run the command twice, once for each strand.

            Using the -bga option would not report fractions. It outputs a standard bedGraph format file, reporting intervals of coverage at a particular depth with a new interval started when the depth changes. For example:

            Code:
            The columns are:
            Chrom  startPos endPos depth
            
            Chr1	3658	3684	1
            Chr1	3684	3693	2
            Chr1	3693	3733	3
            Chr1	3733	3759	2
            Chr1	3759	3763	1
            Chr1	3763	3768	3
            Chr1	3768	3773	2
            Chr1	3773	3838	3
            Chr1	3838	3848	1
            Chr1	4008	4055	1
            Chr1	4055	4077	2
            Chr1	4077	4083	3
            Chr1	4083	4119	2
            Chr1	4119	4125	3
            Chr1	4125	4130	4
            Chr1	4130	4139	3
            Chr1	4139	4144	4
            Chr1	4144	4152	5
            Chr1	4152	4154	4
            If you want to report the depth at each and every position of the genome you should use the -d option instead of -bg(a).

            Comment

            • nilmot13
              Member
              • Jan 2011
              • 19

              #7
              Thank, I've just setup a test file. It works fine now with the +/- sign after the -strand command.

              Thanks every so much.

              Comment

              • nilmot13
                Member
                • Jan 2011
                • 19

                #8
                Hi,

                A very quick question I don't know if you have encountered this before

                during the run of the genomeCoverageBed scipt the output resulted in this error message

                BGZF ERROR: read block failed - could not read data from block

                I have sorted my bam file beforehand and have check the total chromosome length is correct.
                Can anyone suggest potential cause of error?

                Thank you very much

                Comment

                Latest Articles

                Collapse

                ad_right_rmr

                Collapse

                News

                Collapse

                Topics Statistics Last Post
                Started by SEQadmin2, Today, 10:09 AM
                0 responses
                9 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, Yesterday, 08:59 AM
                0 responses
                16 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-02-2026, 12:03 PM
                0 responses
                24 views
                0 reactions
                Last Post SEQadmin2  
                Started by SEQadmin2, 06-02-2026, 11:40 AM
                0 responses
                21 views
                0 reactions
                Last Post SEQadmin2  
                Working...