Hello,
I'm trying to analyze my first result of exome sequencing and am having some problems.
I ran one cloud analysis (almost the default pipeline that I was reading about here: FastQC, NGSQCToolkit, bwa, samtools, Picard, GATK, NGSrich, ANNOVAR and Wesparser) via WEP's site. Although the software informed me that the result was 200x coverage (I believe it only divided the number of nucleotides sequenced by the size of the exome, ~6,000,000), some statistics did not report the same thing.
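To make the arithmetic concrete, here is a minimal sketch of how I think the 200x figure was obtained, next to a rough "effective" coverage estimate. The total number of bases, the duplicate fraction and the on-target fraction are hypothetical placeholders (not values from my reports), and multiplying the two fractions is a simplifying assumption that treats duplicates as evenly spread over on-target and off-target reads.

```python
def nominal_coverage(total_bases: int, target_size: int) -> float:
    """Mean coverage if every sequenced base landed uniquely on the target."""
    if target_size <= 0:
        raise ValueError("target_size must be positive")
    return total_bases / target_size


def effective_coverage(total_bases: int, target_size: int,
                       duplicate_fraction: float,
                       on_target_fraction: float) -> float:
    """Rough mean coverage after discarding duplicates and off-target bases.

    Assumption: duplicates are spread evenly across on- and off-target reads,
    so the two fractions can simply be multiplied.
    """
    usable_bases = total_bases * (1.0 - duplicate_fraction) * on_target_fraction
    return nominal_coverage(usable_bases, target_size)


# Target size as quoted in the post; the other numbers are hypothetical.
TARGET_SIZE = 6_000_000
TOTAL_BASES = 1_200_000_000  # chosen only so the nominal figure comes out at 200x

print(nominal_coverage(TOTAL_BASES, TARGET_SIZE))
print(effective_coverage(TOTAL_BASES, TARGET_SIZE,
                         duplicate_fraction=0.5, on_target_fraction=0.5))
```

If the duplication level and the off-target rate are both high, the usable coverage can be far below the nominal 200x, which might explain why the other statistics disagree.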
My first impression of the FastQC file (file1.pdf attached) was good: high Phred scores, many reads, etc. However, it raised two flags: GC content and sequence duplication levels. What is the real impact of the second statistic?
The Performance of Sample Enrichment file, file3 (file2.pdf attached), told me that just 36.75% of the exons had coverage above 30x. The same file contains a table of several genes that were not covered. I would like some help: if these results are correct, the analysis may have been done wrong. In summary, what can I do? By my calculations, ~8% of all genes were not covered. With all this, I'm concerned about the confidence of my results.
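For reference, this is roughly how I understand the "exons above 30x" and "not covered" percentages to be computed from a list of per-exon mean depths. It is only a sketch: the function names and the example depths are hypothetical, and the actual enrichment report may define these statistics differently (for example per base rather than per exon).

```python
from typing import Sequence


def fraction_at_or_above(depths: Sequence[float], threshold: float = 30.0) -> float:
    """Fraction of regions whose mean depth is at least `threshold`."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= threshold) / len(depths)


def fraction_uncovered(depths: Sequence[float]) -> float:
    """Fraction of regions with a mean depth of zero."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d == 0) / len(depths)


# Hypothetical per-exon mean depths, for illustration only.
example_depths = [0, 10, 30, 45]
print(fraction_at_or_above(example_depths, 30))
print(fraction_uncovered(example_depths))
```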
I appreciate your attention.