  • Depth of Coverage

    I am using the following command to run GATK DepthOfCoverage:

    java -jar /Software/GenomeAnalysisTK.jar -I test_R1R2aln.sam.readgrporg_ordered.bam --outputFormat csv -o test_Coverage_summary -T DepthOfCoverage -R /Run1/H.Sapiens/ucsc.hg19.fasta -L testexonbed.list

    where -I test_R1R2aln.sam.readgrporg_ordered.bam is the BAM of aligned R1 and R2 reads for all samples, and -L testexonbed.list is the list of intervals.

    My output in Coverage_summary looks like this:

    Locus Total_Depth Average_Depth_sample Depth_for_sample1
    chrx:00004900 34 34.00 34
    chrx:00004909 34 34.00 34
    chrx:00004910 34 34.00 34
    chrx:00004911 34 34.00 34
    chrx:00004912 34 34.00 34
    chrx:00004913 34 34.00 34
    chrx:00004914 34 34.00 34

    1- What does this output mean?
    2- How do I know which sample this data refers to?
    3- How do I tell my forward and reverse reads apart?

    Another output file, sample_interval_summary, looks like this:

    Target total_coverage average_coverage sample1_total_cvg sample1_mean_cvg sample1_granular_Q1 sample1_granular_median sample1_granular_Q3 sample1_%_above_15
    chrx:00004900-00005100 9690 46.81 9690 46.81 51 51 51 100.0
    chrx:00006100-00006200 7420 140.00 7420 140.00 141 141 141 100.0
    chrx:00006800-00007000 10660 65.00 10660 65.00 5 87 87 75.0
    chrx:00007000-00007200 23606 159.50 23606 159.50 149 153 153 100.0

  • #2
    Another question, related to my previous ones: if I want to calculate depth of coverage for multiple samples that are contained in one BAM file, can I give GATK that one file and have it report depth of coverage per sample? I feel like I need to give it a sample information file, but I can't find an option for it.



    • #3
      Hi Viberance,

      It's a little hard to tell from your output, I agree, especially because there's only one sample, so all the numbers are the same.

      Here's one of mine:

      Locus Total_Depth Average_Depth_sample Depth_for_Sample1
      1:10385451 2751 250.09 144

      1- What does it mean:
      Locus = the genomic coordinate
      Total_Depth = Cumulative depth across all samples (I had 12 samples)
      Average_Depth_sample = Average depth at this position from all samples
      Depth_for_sample1 = The depth for that sample
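The relationship between these columns can be checked with a small awk sketch. It assumes whitespace-separated columns as pasted in this thread (a file written with --outputFormat csv would likely need -F','), and the function name check_depth_rows is made up for illustration. For each locus it sums the per-sample depth columns and divides by the number of samples, which on this reading of the columns should match Total_Depth and Average_Depth_sample.

```shell
# check_depth_rows (hypothetical helper): reads per-locus rows from stdin in the form
#   Locus Total_Depth Average_Depth_sample Depth_for_<sample>...
# and prints the sample count, the summed depth, and the mean depth per locus.
check_depth_rows() {
  awk 'NR > 1 && NF > 3 {
    n = NF - 3                      # number of per-sample columns
    sum = 0
    for (i = 4; i <= NF; i++) sum += $i
    printf "%s samples=%d sum=%d mean=%.2f\n", $1, n, sum, sum / n
  }'
}

# Example with the first row from the original post:
printf 'Locus Total_Depth Average_Depth_sample Depth_for_sample1\nchrx:00004900 34 34.00 34\n' | check_depth_rows
# prints: chrx:00004900 samples=1 sum=34 mean=34.00
```

With a single sample the sum, the mean, and the per-sample depth are identical, which is why every number in the original output repeats.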

      2- How do I know what sample this data is referring to:
      It will name them based on the names of the BAMs. If you expected more than one, you probably need to work on the format of your list of input BAMs.

      3- How do I know my forward and reverse reads:
      You need a different tool - correct me if I'm wrong

      4- Multiple samples contained in one BAM file: can I give GATK one file and have it do depth of coverage by sample?
      I've never used multiple samples per BAM, but from what I understand, if the BAM is formatted right then it shouldn't be a problem. If you can't get it to work, split your BAM into multiple BAMs, one per sample.

      Hope that helps
      LM
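One way to check whether a BAM is "formatted right" for per-sample output is to look at its read groups. As far as I know, GATK groups reads by the SM field of the @RG header lines, so a BAM with only one SM value would be reported as a single sample. The sketch below lists the distinct SM values in a SAM header read from stdin. The function name list_samples is made up for illustration, and the usage line assumes samtools is installed.

```shell
# list_samples (hypothetical helper): print each distinct SM (sample) value
# found in the @RG lines of a SAM header supplied on stdin.
list_samples() {
  awk -F'\t' '/^@RG/ {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SM:/) print substr($i, 4)   # drop the "SM:" prefix
  }' | sort -u
}

# Usage, assuming samtools is available and using the BAM name from the first post:
# samtools view -H test_R1R2aln.sam.readgrporg_ordered.bam | list_samples
```

If this prints a single name for a BAM that is supposed to hold many samples, the sample information was probably lost before or during alignment.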



      • #4
        Hi Shimbalama

        I think the problem is in my BAM file and the way I created it. I have about 80 samples in R1.fastq.gz and R2.fastq.gz format. I ran cat on these fastq.gz files to make one all80R1.fastq.gz and one all80R2.fastq.gz.

        I used BWA mem to create one BAM file from these paired-end files (all80R1.fastq.gz and all80R2.fastq.gz), but further down in my analysis the individual samples do not show up. Everything appears as in the Coverage_summary output in my previous post.

        Thanks for the help



        • #5
          Thinking the problem is in my BAM file, I want to run BWA mem on the individual samples. I have R1 and R2 reads in fastq.gz format, and I want to run BWA mem in paired-end mode, in parallel, on all the files, so that each complementary R1/R2 pair produces one SAM file. Right now I am making two SAM files from the two reads.

          This is what I have come up with, but it's not doing what I need it to do:

          for i in $(find -maxdepth 2 -iname *fastq.gz -type f); do echo "bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ${i}_R1_001.fastq.gz ${i}_R2_001.fastq.gz > ${i}_R1_R2.sam"; done

          When it runs, the output looks like this:

          bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_001.fastq.gz ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R2_001.fastq.gz > ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_R2.sam

          bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_001.fastq.gz ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R2_001.fastq.gz > ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_R2.sam
          -bash-4.1$
          I understand the problem is in -iname, but how do I fix it?
          Thank you so much
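The loop above iterates over every fastq.gz file, so ${i} already holds a full file name and the R1/R2 suffixes get appended a second time. A minimal sketch of one possible fix is to search only for the R1 files, strip the suffix to get the shared prefix, and derive the R2 name from it. The function name make_bwa_cmds is made up for illustration. The sketch also assumes the sample ID is the part of the file name before the first underscore (for example 0747), and the ID and SM values it passes to bwa mem -R are illustrative only. Like the original loop, it only prints the commands.

```shell
# make_bwa_cmds (hypothetical helper): print one bwa mem command per R1/R2 pair
# found up to two directory levels below the current directory.
# $1 is the reference FASTA path.
make_bwa_cmds() {
  ref=$1
  find . -maxdepth 2 -type f -name '*_R1_001.fastq.gz' | sort |
  while IFS= read -r r1; do
    prefix=${r1%_R1_001.fastq.gz}          # shared prefix of the pair
    r2=${prefix}_R2_001.fastq.gz
    if [ ! -f "$r2" ]; then
      echo "missing mate for $r1" >&2
      continue
    fi
    sample=$(basename "$prefix")
    sample=${sample%%_*}                    # assumed naming: 0747_CGG_L001 -> 0747
    rg="@RG\\tID:${sample}\\tSM:${sample}"  # literal \t, which bwa turns into tabs
    printf '%s\n' "bwa mem -t 12 -R '$rg' $ref $r1 $r2 > ${prefix}_R1_R2.sam"
  done
}

# Usage: review the printed commands before running them.
# make_bwa_cmds /H.Sapiens/ucsc.hg19.fasta
```

Tagging each alignment with its own SM value is what should let downstream tools tell the samples apart once the per-sample files are merged, which concatenating all 80 FASTQ files before alignment cannot do.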
