  • Downsampling a BAM file

    Hi ChIP-seq experts

    I'm a newbie in the field of ChIP-seq data mining and need some help! I've sequenced several samples and mapped them using bowtie, and everything looks fine so far.

    However, the samples differ somewhat in sequencing depth, which makes them difficult to compare. Until now I've used coverageBed (bedtools) to find the read coverage around TSSs and in my peak regions, and then normalized the read counts in these regions to sequencing depth.

    But for some of my future analyses it would be really nice if the BAM file itself was normalized to sequencing depth. Simply put, I want to remove some random reads from one sample so that it has the same number of reads as my second sample. I've found that Picard "DownsampleSam" should be able to do this, but I cannot get the programme to work on my (Mac) computer.

    I hope someone can help!!

    BR, Kathrine
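The depth normalization described in the post above can be sketched as follows. This is a minimal sketch, assuming reads-per-million scaling (the post does not say which scaling was used); `normalize_count` is a hypothetical helper name and is not part of bedtools.

```shell
#!/bin/sh
# Scale a raw region read count to reads per million mapped reads.
# Assumption: "per million" scaling; the post only says the counts were
# normalized to sequencing depth.
# $1 = reads counted in the region, $2 = total mapped reads in the sample
normalize_count() {
    awk -v n="$1" -v total="$2" 'BEGIN {
        if (total <= 0) { exit 1 }   # refuse to divide by zero
        printf "%.3f\n", n / total * 1000000
    }'
}

# Usage: a region with 250 reads in a sample of 5,000,000 mapped reads
# normalize_count 250 5000000
```

With this scaling, a region holding 250 reads out of 5 million and one holding 500 reads out of 10 million get the same normalized value, which is what makes samples of different depth comparable.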

  • #2


    If you have no headers, or you convert to BED, the attached Perl script should work as well. Warning: I'm not really sure what the script does, but it seems to work.
    Ethan



    • #3
      You could use bamtools random for this as well.
      What errors is Picard giving you?



      • #4
        Thank you so much for your input!!

        I've tried the code, but it doesn't seem to work. I'm not that keen on converting to a BED file, as I need the BAM format later on.

        I've tried bamtools random, but keep getting the same error message:

        bamtools random ERROR: could not load index data for all input BAM file(s)... Aborting.

        My code line is as follows:

        bamtools random -in Input_file.bam -out output_reduced.bam -n 1000000

        The input bam file originates from the SAM file produced when mapping with bowtie - it is converted to BAM with "samtools view", and sorted with "samtools sort" - then I extract all mapped reads with "samtools view -b -F 4"

        As you can imagine, I'm a newbie in this field and all help is very much appreciated!



        • #5
          Regarding the Picard errors - I think it relates to the (mac) version of my java (which otherwise is up to date):

          Exception in thread "main" java.lang.NoClassDefFoundError: jvm-args
          Caused by: java.lang.ClassNotFoundException: jvm-args
          at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
          at java.security.AccessController.doPrivileged(Native Method)
          at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
          at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
          at java.lang.ClassLoader.loadClass(ClassLoader.java:247)



          • #6
            You need to index the bam file first. You can do this with bamtools:

            bamtools index -in Input_file.bam
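Putting the indexing step together with the `bamtools random` call from post #4 might look like the sketch below. It assumes `bamtools` is on the PATH and that the BAM file is coordinate-sorted (as described in post #4); `downsample_with_bamtools` is a hypothetical wrapper name, and the file names and read count are the ones used in the thread.

```shell
#!/bin/sh
# Index a BAM file, then draw a random subset of reads from it.
# $1 = input BAM, $2 = output BAM, $3 = number of reads to keep
# downsample_with_bamtools is a hypothetical helper, not a bamtools command.
downsample_with_bamtools() {
    in_bam=$1
    out_bam=$2
    n_reads=$3
    # bamtools random aborts unless every input BAM has an index
    bamtools index -in "$in_bam" || return 1
    bamtools random -in "$in_bam" -out "$out_bam" -n "$n_reads"
}

# Usage (file names and read count taken from the thread):
# downsample_with_bamtools Input_file.bam output_reduced.bam 1000000
```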



            • #7
              Originally posted by KathrineBL
              Regarding the Picard errors - I think it relates to the (mac) version of my java (which otherwise is up to date):

              Exception in thread "main" java.lang.NoClassDefFoundError: jvm-args
              [...]
              You're using this as a template, I'm guessing:
              java jvm-args -jar PicardCommand.jar OPTION1=value1 OPTION2=value2...

              Like the command and the options, jvm-args needs to be replaced with actual Java arguments. A typical example is -Xmx2g (which sets the maximum Java heap size to 2 GB for the run).
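As a rough illustration of the substitution, the sketch below builds a DownsampleSam command with `-Xmx2g` in place of `jvm-args`. Several things here are assumptions rather than details from the thread: the per-tool jar name `DownsampleSam.jar` (following the `PicardCommand.jar` template), its location in the current directory, and the use of the `INPUT`, `OUTPUT` and `PROBABILITY` options. DownsampleSam keeps each read with a given probability rather than taking a target count, so the hypothetical helper `keep_probability` converts a desired read count into that probability. The example read counts are made up.

```shell
#!/bin/sh
# Probability of keeping each read so that roughly `target` of `total` remain.
# $1 = reads in the deeper sample, $2 = desired number of reads
# keep_probability is a hypothetical helper, not part of Picard.
keep_probability() {
    awk -v total="$1" -v target="$2" 'BEGIN {
        # Keep everything if the sample is already at or below the target
        p = (total > 0 && target < total) ? target / total : 1
        printf "%.4f\n", p
    }'
}

# Print (do not run) a Picard command with a real JVM argument.
# Assumptions: jar name DownsampleSam.jar, 2 GB maximum heap.
# $1 = input BAM, $2 = output BAM, $3 = keep probability
picard_downsample_cmd() {
    printf 'java -Xmx2g -jar DownsampleSam.jar INPUT=%s OUTPUT=%s PROBABILITY=%s\n' \
        "$1" "$2" "$3"
}

# Usage (read counts are illustrative; they could come from `samtools view -c`):
# p=$(keep_probability 4000000 1000000)
# picard_downsample_cmd Input_file.bam output_reduced.bam "$p"
```

Because the sampling is probabilistic, the output will contain approximately, not exactly, the requested number of reads.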
