  • How to get a report from a BAM file showing what percentage of the exons are covered

    Dear Colleagues,

    Let's say I have a MiSeq run and the .bam file from the sequencer, and I would like to know what percentage of the exons (of specific genes) are covered in this .bam file.

    Is it possible?

    Thanks in advance

  • #2
    Try CollectHsMetrics - part of Picard Tools
    For details see:
    https://broadinstitute.github.io/pic...-overview.html


    • #3
      Originally posted by Gopo
      Try CollectHsMetrics - part of Picard Tools
      For details see:
      https://broadinstitute.github.io/pic...-overview.html
      So how could I get a BED file, or prepare one myself? Could you give me a clue or point me to a tutorial? I am having trouble with this. I know it is hard to explain here, but is there any file or manual for this?



      • #4
        So, you will have to generate the BED file yourself based on the coordinates of the genes or exons that you are interested in.
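If you have a GTF annotation for your genome, the exon coordinates can be pulled out of it directly. Below is a minimal sketch, not a definitive recipe: the function name `gtf_exons_to_bed` and the file name `annotation.gtf` are made up for illustration, and it assumes a standard 9-column GTF with a `gene_id "..."` attribute. GTF coordinates are 1-based and inclusive while BED is 0-based and half-open, so the start is shifted by one.

```shell
# Hypothetical helper: print the exon records of a GTF file as 6-column BED.
# Assumes a tab-separated, 9-column GTF whose attributes contain gene_id "...".
gtf_exons_to_bed() {
    awk -F'\t' 'BEGIN { OFS = "\t" }
        /^#/ { next }                       # skip header/comment lines
        $3 == "exon" {
            name = "."
            # pull the gene_id value out of the attribute column
            if (match($9, /gene_id "[^"]+"/))
                name = substr($9, RSTART + 9, RLENGTH - 10)
            # GTF start is 1-based, BED start is 0-based
            print $1, $4 - 1, $5, name, 0, $7
        }' "$1" | sort -k1,1 -k2,2n
}

# Example call (annotation.gtf is a placeholder for your own file):
#   gtf_exons_to_bed annotation.gtf > exons.bed
```

The resulting file can then go through BedToIntervalList like the BED file in the commands that follow.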

        Code:
        # generate fasta index for genome (genome is AmexG_v3.0.0.fa)
        samtools faidx AmexG_v3.0.0.fa
        
        # create sequence dictionary
        java -Xmx64g -jar ~/bin/picard-2.18.10.jar CreateSequenceDictionary \
        R=AmexG_v3.0.0.fa \
        O=AmexG_v3.0.0.dict
        
        # convert Bed to interval list
        java -jar ~/bin/picard-2.18.10.jar BedToIntervalList \
        I=rfs.immunome.bed \
        O=rfs.immunome.interval.list \
        SD=AmexG_v3.0.0.dict
        
        # run CollectHsMetrics
        java -Xmx64g -jar ~/bin/picard-2.18.10.jar CollectHsMetrics \
        BAIT_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
        BAIT_SET_NAME=Immunome \
        TARGET_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
        METRIC_ACCUMULATION_LEVEL=SAMPLE \
        R=/ssdwork/jelber2/rfs/AmexG_v3.0.0.fa \
        I=ALL-samples.bam \
        O=ALL-samples-coverage-metrics.txt
        # note that you might have to add readgroups to the file ALL-samples.bam (whatever your BAM file is named) with

        Code:
        java -Xmx64g -jar ~/bin/picard-2.18.10.jar AddOrReplaceReadGroups \
            I=ALL-samples-recal-no-read-groups.bam \
            O=ALL-samples-recal-no-read-groups-for-callableloci.bam \
            SORT_ORDER=coordinate \
            RGPL=illumina \
            RGPU=barcode \
            RGLB=Lib1 \
            RGID=all \
            RGSM=all \
            VALIDATION_STRINGENCY=LENIENT
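To find out whether the BAM already has read groups before running AddOrReplaceReadGroups, you can look for `@RG` lines in the header. A small sketch, assuming samtools is installed; the function name `has_read_groups` is just for illustration.

```shell
# Hypothetical helper: reads a SAM/BAM header on stdin and succeeds
# (exit status 0) if at least one @RG line is present.
has_read_groups() {
    grep -q '^@RG'
}

# Typical use (requires samtools; the BAM name is whatever your file is called):
#   samtools view -H ALL-samples.bam | has_read_groups \
#       && echo "read groups present" \
#       || echo "no read groups, run AddOrReplaceReadGroups first"
```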



        • #5
          Code:
          # install picard
          cd ~/bin/
          wget https://github.com/broadinstitute/picard/releases/download/2.18.10/picard.jar
          mv picard.jar picard-2.18.10.jar
          
          # reference genome is AmexG_v3.0.0.fa
          # generate fasta index
          samtools faidx AmexG_v3.0.0.fa
          
          # create sequence dictionary
          java -Xmx64g -jar ~/bin/picard-2.18.10.jar CreateSequenceDictionary \
          R=AmexG_v3.0.0.fa \
          O=AmexG_v3.0.0.dict
          
          # convert bed to interval list
          java -jar ~/bin/picard-2.18.10.jar BedToIntervalList \
          I=rfs.immunome.bed \
          O=rfs.immunome.interval.list \
          SD=AmexG_v3.0.0.dict
          
          # run CollectHsMetrics
          java -Xmx64g -jar ~/bin/picard-2.18.10.jar CollectHsMetrics \
          BAIT_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
          BAIT_SET_NAME=Immunome \
          TARGET_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
          METRIC_ACCUMULATION_LEVEL=SAMPLE \
          R=/ssdwork/jelber2/rfs/AmexG_v3.0.0.fa \
          I=ALL-samples.bam \
          O=ALL-samples-coverage-metrics.txt
          
          # if you need to add read groups
          java -Xmx64g -jar ~/bin/picard-2.18.10.jar AddOrReplaceReadGroups \
          I=ALL-samples.bam \
          O=ALL-samples.RG.bam \
          SORT_ORDER=coordinate \
          RGPL=illumina \
          RGPU=barcode \
          RGLB=Lib1 \
          RGID=all \
          RGSM=all \
          VALIDATION_STRINGENCY=LENIENT
          
          java -Xmx64g -jar ~/bin/picard-2.18.10.jar CollectHsMetrics \
          BAIT_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
          BAIT_SET_NAME=Immunome \
          TARGET_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
          METRIC_ACCUMULATION_LEVEL=SAMPLE \
          R=/ssdwork/jelber2/rfs/AmexG_v3.0.0.fa \
          I=ALL-samples.RG.bam \
          O=ALL-samples-coverage-metrics.txt



            • #7
              You will have to make the BED file yourself. Here is a guide:

              Code:
              # install picard
              cd ~/bin/
              wget https://github.com/broadinstitute/picard/releases/download/2.18.10/picard.jar
              mv picard.jar picard-2.18.10.jar
              
              # index reference (Reference is AmexG_v3.0.0.fa)
              samtools faidx AmexG_v3.0.0.fa
              
              # create sequence dictionary
              java -Xmx64g -jar ~/bin/picard-2.18.10.jar CreateSequenceDictionary \
              R=AmexG_v3.0.0.fa \
              O=AmexG_v3.0.0.dict
              
              # Convert BED to interval list
              java -jar ~/bin/picard-2.18.10.jar BedToIntervalList \
              I=rfs.immunome.bed \
              O=rfs.immunome.interval.list \
              SD=AmexG_v3.0.0.dict
              
              # run CollectHsMetrics
              java -Xmx64g -jar ~/bin/picard-2.18.10.jar CollectHsMetrics \
              BAIT_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
              BAIT_SET_NAME=Immunome \
              TARGET_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
              METRIC_ACCUMULATION_LEVEL=SAMPLE \
              R=/ssdwork/jelber2/rfs/AmexG_v3.0.0.fa \
              I=ALL-samples.bam \
              O=ALL-samples-coverage-metrics.txt
              
              # if needed, add readgroups
              java -Xmx64g -jar ~/bin/picard-2.18.10.jar AddOrReplaceReadGroups \
              I=ALL-samples.bam \
              O=ALL-samples-RG.bam \
              SORT_ORDER=coordinate \
              RGPL=illumina \
              RGPU=barcode \
              RGLB=Lib1 \
              RGID=all \
              RGSM=all \
              VALIDATION_STRINGENCY=LENIENT
              
              # run CollectHsMetrics with ReadGroups added to BAM
              java -Xmx64g -jar ~/bin/picard-2.18.10.jar CollectHsMetrics \
              BAIT_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
              BAIT_SET_NAME=Immunome \
              TARGET_INTERVALS=/ssdwork/jelber2/rfs/rfs.immunome.interval.list \
              METRIC_ACCUMULATION_LEVEL=SAMPLE \
              R=/ssdwork/jelber2/rfs/AmexG_v3.0.0.fa \
              I=ALL-samples-RG.bam \
              O=ALL-samples-coverage-metrics.txt
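To get the actual percentages out of the metrics file, you can pick columns from the table that follows the `## METRICS CLASS` line. The sketch below assumes the usual Picard metrics layout (header row right after `## METRICS CLASS`, data rows until the next blank line) and that your file has columns such as `PCT_TARGET_BASES_1X` and `PCT_TARGET_BASES_10X`; check the header row of your own output, since the helper name `hs_metric` is made up and the column set can vary between Picard versions.

```shell
# Hypothetical helper: print one column of the HsMetrics table, one value per row.
#   $1 = metrics file written by CollectHsMetrics
#   $2 = column name, e.g. PCT_TARGET_BASES_10X
hs_metric() {
    awk -F'\t' -v col="$2" '
        /^## METRICS CLASS/ {
            getline                          # next line is the header row
            for (i = 1; i <= NF; i++) if ($i == col) c = i
            in_table = 1
            next
        }
        in_table && NF == 0 { exit }         # blank line ends the table
        in_table && c { print $c }
    ' "$1"
}

# Example call:
#   hs_metric ALL-samples-coverage-metrics.txt PCT_TARGET_BASES_10X
```

As far as I know the PCT_ values are reported as fractions between 0 and 1, so multiply by 100 for a percentage. Also note they describe the fraction of target bases covered at that depth, summed over all intervals in the list, not a per-exon figure.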
              Best,
              Gopo
              Last edited by Gopo; 08-08-2018, 10:12 PM. Reason: wrong syntax for displaying code



              • #8
                This tool helps me create BED files from a gene list:


                Choose bed format while downloading.
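If you already have a BED file of all exons with the gene name in column 4, the same thing can be done locally by filtering on a gene list. This is only a sketch under that assumption; `filter_bed_by_genes`, `genes.txt` and `exons.bed` are illustrative names.

```shell
# Hypothetical helper: keep only BED records whose 4th column (gene name)
# appears in a gene list with one name per line.
#   $1 = gene list, $2 = tab-separated BED with gene name in column 4
# Note: the gene list must not be empty, otherwise nothing is filtered correctly.
filter_bed_by_genes() {
    awk -F'\t' '
        NR == FNR { keep[$1]; next }         # first file: remember gene names
        $4 in keep                           # second file: print matching records
    ' "$1" "$2"
}

# Example call:
#   filter_bed_by_genes genes.txt exons.bed > my_genes.bed
```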
