SEQanswers

Old 03-12-2015, 05:16 AM   #1
Viberance
Member
 
Location: Alabama

Join Date: Feb 2015
Posts: 13
Depth of Coverage

I am using the following command to run GATK DepthOfCoverage:

java -jar /Software/GenomeAnalysisTK.jar -I test_R1R2aln.sam.readgrporg_ordered.bam --outputFormat csv -o test_Coverage_summary -T DepthOfCoverage -R /Run1/H.Sapiens/ucsc.hg19.fasta -L testexonbed.list

where -I test_R1R2aln.sam.readgrporg_ordered.bam is the aligned R1 and R2 reads for all samples, and

-L testexonbed.list is the list of intervals.

My output in Coverage_summary looks like this:

Locus Total_Depth Average_Depth_sample Depth_for_sample1
chrx:00004900 34 34.00 34
chrx:00004909 34 34.00 34
chrx:00004910 34 34.00 34
chrx:00004911 34 34.00 34
chrx:00004912 34 34.00 34
chrx:00004913 34 34.00 34
chrx:00004914 34 34.00 34

1- What does it mean?
2- How do I know which sample this data refers to?
3- How do I know my forward and reverse reads?

Another output file, sample_interval_summary, looks like this:

Target total_coverage average_coverage sample1_total_cvg sample1_mean_cvg sample1_granular_Q1 sample1_granular_median sample1_granular_Q3 sample1_%_above_15
chrx:00004900-00005100 9690 46.81 9690 46.81 51 51 51 100.0
chrx:00006100-00006200 7420 140.00 7420 140.00 141 141 141 100.0
chrx:00006800-00007000 10660 65.00 10660 65.00 5 87 87 75.0
chrx:00007000-00007200 23606 159.50 23606 159.50 149 153 153 100.0
Old 03-12-2015, 07:11 AM   #2
Viberance
Member
 
Location: Alabama

Join Date: Feb 2015
Posts: 13

Another question, related to my previous ones: if I want to calculate depth of coverage for multiple samples that are contained in one BAM file, can I give GATK that one file and have it report depth of coverage by sample? I feel like I need to give it a sample information file, but I can't find the option for it.
Old 03-12-2015, 04:14 PM   #3
shimbalama
bioinformatics-help.com
 
Location: Adelaide

Join Date: Jul 2014
Posts: 9

Hi Viberance,

I agree it's a little hard to tell from your output, especially because there's only one sample, so all the numbers are the same.

Here's one of mine:

Locus Total_Depth Average_Depth_sample Depth_for_Sample1
1:10385451 2751 250.09 144

1- What does it mean:
Locus = the genomic coordinate
Total_Depth = Cumulative depth across all samples (I had 12)
Average_Depth_sample = Average depth at this position from all samples
Depth_for_sample1 = The depth for that sample
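
As a small illustration of how the three columns relate, here is a sketch that assumes Average_Depth_sample is simply Total_Depth divided by the number of samples. That formula is an assumption, not something confirmed in this thread, and summarize_depths is a made-up helper name. It reproduces the single-sample row from the first post (34, 34.00, 34). Under the same assumption, the 2751 and 250.09 in the row above would correspond to 11 contributing samples, so treat it as a guess.

```shell
#!/usr/bin/env bash
# Hypothetical helper: given one depth per sample at a single locus,
# print the total depth and the average depth per sample.
# Assumes average = total / number of samples (not confirmed in the thread).
summarize_depths() {
    printf '%s\n' "$@" |
        awk '{ total += $1; n++ } END { printf "%d %.2f\n", total, total / n }'
}

summarize_depths 34          # prints: 34 34.00
summarize_depths 144 100 56  # prints: 300 100.00
```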

2- How do I know what sample this data is referring to:
It will name them based on the names of the BAMs. If you expected more than one, you probably need to work on the format of your list of input BAMs.

3- How do I know my forward and reverse reads:
You need a different tool - correct me if I'm wrong
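
One option, offered as a sketch rather than a definitive answer: strand is recorded in the SAM FLAG field (bit 0x10 set means the read aligned to the reverse strand), so it can be counted straight from the alignment records. The helper name count_strands is made up, and the input is assumed to be SAM text on stdin, for example from samtools view. If the question is really about R1 versus R2, the relevant FLAG bits are 0x40 (first in pair) and 0x80 (second in pair) instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count forward- and reverse-strand alignments
# from SAM text on stdin, using bit 0x10 (decimal 16) of the FLAG column.
count_strands() {
    awk -F'\t' '
        /^@/ { next }   # skip SAM header lines
        {
            # int(flag / 16) % 2 is 1 when bit 0x10 is set (reverse strand)
            if (int($2 / 16) % 2) rev++; else fwd++
        }
        END { printf "forward=%d reverse=%d\n", fwd, rev }
    '
}

# Demo with two minimal records: FLAG 99 is forward, FLAG 147 is reverse.
printf 'a\t99\nb\t147\n' | count_strands   # prints: forward=1 reverse=1
```

On a real file this could be fed with `samtools view test_R1R2aln.sam.readgrporg_ordered.bam | count_strands`.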

4- multiple samples which are contained in one bam file can I give GATK one file and it will do Depth of coverage by sample:
I've never used multiple samples per BAM, but from what I understand, if the BAM is formatted right then it shouldn't be a problem. If you can't get it to work, split your BAM into multiple BAMs, one per sample.
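
"Formatted right" most likely comes down to read groups: GATK takes sample names from the SM field of the @RG lines in the BAM header, so a file whose reads all share one read group will look like a single sample. A quick way to check is to list the SM values in the header. The sketch below uses a made-up helper name, list_samples, and a hand-written header with invented sample names for the demo.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the distinct sample names (SM tags)
# found in the @RG lines of a SAM header read from stdin.
list_samples() {
    awk -F'\t' '
        $1 == "@RG" {
            for (i = 2; i <= NF; i++)
                if ($i ~ /^SM:/) print substr($i, 4)
        }
    ' | sort -u
}

# Demo header: two read groups belonging to two invented samples.
printf '@HD\tVN:1.4\n@RG\tID:rg1\tSM:sampleA\n@RG\tID:rg2\tSM:sampleB\n' | list_samples
```

On a real file the header could come from `samtools view -H test_R1R2aln.sam.readgrporg_ordered.bam | list_samples`. If only one name comes back, the per-sample information was never written; bwa mem's -R option (for example -R '@RG\tID:foo\tSM:bar') can attach a read group at alignment time.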

Hope that helps
Old 03-13-2015, 05:17 AM   #4
Viberance
Member
 
Location: Alabama

Join Date: Feb 2015
Posts: 13

Hi Shimbalama,

I think the problem is in my BAM file and the way I created it. I have about 80 samples in R1.fastq.gz and R2.fastq.gz format. I ran cat on these fastq.gz files to make one all80R1.fastq.gz and one all80R2.fastq.gz.

I used BWA mem to create one BAM file from these paired-end files (all80R1.fastq.gz and all80R2.fastq.gz), but further down in my analysis the samples don't show up. Everything appears as in my previous post for the output in Coverage_summary.

Thanks for the help
Old 03-13-2015, 12:36 PM   #5
Viberance
Member
 
Location: Alabama

Join Date: Feb 2015
Posts: 13

Thinking the problem is in my BAM file, I want to run BWA mem on individual samples. I have R1 and R2 reads in fastq.gz format, and I want to run BWA mem paired-end in parallel on all the files; once finished, each complementary R1 and R2 pair should produce one SAM file. Right now I am making two SAM files, one from each read file.

This is what I have come up with, but it's not doing what I need it to do:

for i in $(find -maxdepth 2 -iname *fastq.gz -type f); do echo "bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ${i}_R1_001.fastq.gz ${i}_R2_001.fastq.gz > ${i}_R1_R2.sam"; done

When it runs, the output looks like this:

bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_001.fastq.gz ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R2_001.fastq.gz > ./Sample_0747/0747_CGG_L001_R2_001.fastq.gz_R1_R2.sam

bwa mem -t 12 /H.Sapiens/ucsc.hg19.fasta ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_001.fastq.gz ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R2_001.fastq.gz > ./Sample_0748/0748_CCA_L001_R1_001.fastq.gz_R1_R2.sam

I understand the problem is in -iname, but how do I fix it?
Thank you so much
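
The loop above appends the R1/R2 suffixes to every file that find returns, including names that already end in _R1_001.fastq.gz or _R2_001.fastq.gz, which is why each command shows a doubled suffix. A possible fix, sketched here, is to match only the R1 files and strip the suffix with bash parameter expansion to build both mate names. The function name print_bwa_cmds and the REF variable are made up for illustration; the reference path and the _R1_001/_R2_001 naming are taken from the post. Like the original, it only prints the commands (a dry run).

```shell
#!/usr/bin/env bash
# Dry run: print one bwa mem command per R1/R2 pair found under a directory.
# REF and the file-name suffixes follow the post; adjust them to your layout.
REF=/H.Sapiens/ucsc.hg19.fasta

print_bwa_cmds() {
    root=${1:-.}
    # Match only the R1 files so that each pair is visited exactly once.
    find "$root" -maxdepth 2 -type f -name '*_R1_001.fastq.gz' | sort |
    while IFS= read -r r1; do
        # Strip the R1 suffix to get the prefix shared by both mates.
        prefix=${r1%_R1_001.fastq.gz}
        r2=${prefix}_R2_001.fastq.gz
        # Skip samples whose R2 file is missing instead of printing a bad command.
        if [ ! -f "$r2" ]; then
            echo "missing mate for $r1" >&2
            continue
        fi
        echo "bwa mem -t 12 $REF $r1 $r2 > ${prefix}_R1_R2.sam"
    done
}

print_bwa_cmds .
```

Once the printed lines look right, they could be executed with `print_bwa_cmds . | bash`. This assumes file paths without spaces.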