SEQanswers


raman91 01-02-2018 11:44 PM

No coverage or very low coverage in the Complete Genomics data
 
Hi,

Has anyone worked with Complete Genomics data? Most of the exons in the data have no coverage or very low coverage (only 5-10 reads), even though the data were sequenced at 40X coverage. Can anyone explain why this is so?

Thanks

Regards

Gorgon_VZ 01-03-2018 03:56 AM

Hi! Is it really only the exons that show reduced coverage, while the rest of the genome shows 40x? Have you checked the mean over all exons, or only one particular gene? What kind of genome and data are you working with? In general I would say it's fairly normal to see differences in coverage across the genome, but I am not aware of an exon-specific bias. However, if you look at a particular gene or even a gene family, the coverage might be reduced because of homologous regions: multimapped reads with low mapping quality are filtered out. It may be worth looking at the percentage of reads you are able to map against your reference. A high number of discarded sequences could be a hint of such an effect.
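
A minimal sketch of the check suggested above, assuming the alignments are available as SAM text (for example piped from `samtools view -h`). It counts how many primary records are mapped and how many of those fall below a mapping-quality cutoff. The function name `mapping_summary` and the cutoff of 20 are illustrative assumptions, not something taken from the thread.

```python
def mapping_summary(sam_lines, min_mapq=20):
    """Summarise mapped and low-MAPQ primary records from SAM text lines.

    min_mapq is an assumed cutoff; use whatever your pipeline filters on.
    """
    total = mapped = low_mapq = 0
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and header lines
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])   # column 2: bitwise FLAG
        mapq = int(fields[4])   # column 5: mapping quality
        if flag & 0x900:
            continue  # ignore secondary (0x100) and supplementary (0x800) records
        total += 1
        if flag & 0x4:
            continue  # 0x4 marks an unmapped read
        mapped += 1
        if mapq < min_mapq:
            low_mapq += 1
    return {"total": total, "mapped": mapped, "low_mapq": low_mapq}


# Tiny made-up example: two mapped reads (one with MAPQ 0), one unmapped,
# and one secondary alignment that is ignored.
demo = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t35M\t*\t0\t0\t*\t*",
    "r2\t0\tchr1\t200\t0\t35M\t*\t0\t0\t*\t*",
    "r3\t4\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "r4\t256\tchr1\t300\t0\t35M\t*\t0\t0\t*\t*",
]
summary = mapping_summary(demo)
print(summary)
```

A large `low_mapq` share relative to `mapped` would point toward the multimapping explanation; a large gap between `total` and `mapped` would point toward discarded reads.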

raman91 01-09-2018 01:20 AM

1 Attachment(s)
I'm sorry, I didn't explain the issue properly. I'm working on CG WGS data with an average read depth of 40x. The QC metrics look good for all parameters and the alignment rate is 97.43%. However, I used cgaTools to convert the TSV files provided by CG to BAM. When I visualize these BAM files in IGV, I see minimal coverage at all exons and across major parts of the introns for all genes (see the attached example image). The typical pattern is small hills of scant coverage at intron-exon junctions. I believe something went wrong in my conversion step. Can anyone please suggest a solution?
Or is there another visualization tool specific to CG data that I should be using?

The IGV snapshot is attached.

Thanks in advance
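
One way to check whether the low exon coverage is in the BAM itself rather than in how the viewer displays it is to compute mean depth per exon directly. The sketch below assumes per-base depth in the three-column format written by `samtools depth -a` (chromosome, 1-based position, depth) and exons in BED format (0-based start, half-open end). The function name and the toy numbers are illustrative only, and holding depth in a dictionary is only sensible for a small region, not a whole genome.

```python
from collections import defaultdict


def mean_depth_per_interval(depth_lines, bed_lines):
    """Return (name, mean depth) for each BED interval.

    depth_lines: 'chrom<TAB>pos<TAB>depth' with 1-based positions.
    bed_lines:   'chrom<TAB>start<TAB>end[<TAB>name]' with 0-based, half-open coordinates.
    """
    depth = defaultdict(dict)
    for line in depth_lines:
        if not line.strip():
            continue
        chrom, pos, d = line.rstrip("\n").split("\t")[:3]
        depth[chrom][int(pos)] = int(d)

    results = []
    for line in bed_lines:
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        f = line.rstrip("\n").split("\t")
        chrom, start, end = f[0], int(f[1]), int(f[2])
        name = f[3] if len(f) > 3 else f"{chrom}:{start}-{end}"
        length = end - start
        if length <= 0:
            continue
        # BED [start, end) corresponds to 1-based positions start+1 .. end.
        # Positions absent from the depth input are counted as zero depth.
        total = sum(depth[chrom].get(p, 0) for p in range(start + 1, end + 1))
        results.append((name, total / length))
    return results


# Toy input: one exon partly covered, one exon with no depth records at all.
depth_demo = [
    "chr1\t101\t40",
    "chr1\t102\t40",
    "chr1\t103\t40",
    "chr1\t104\t0",
    "chr1\t105\t0",
]
bed_demo = [
    "chr1\t100\t105\texonA",
    "chr1\t200\t205\texonB",
]
per_exon = mean_depth_per_interval(depth_demo, bed_demo)
print(per_exon)
```

If the per-exon means from the converted BAM are close to 40x, the problem is more likely in the visualization; if they are as low as the IGV view suggests, the conversion step is the more likely suspect.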

Gorgon_VZ 01-09-2018 10:16 AM

Hi! It could be an issue with differing genome builds. I guess IGV is using hg19 and the reads are mapped to hg38.

Gorgon_VZ 01-09-2018 10:27 AM

You should be able to check this by zooming in to the nucleotide level in IGV. If the reads do not match the reference, this would be a hint. I am not 100% sure, but I believe IGV only uses the coordinates and CIGAR of the BAM and does not care about matching nucleotides. So maybe it is just a slippage of coordinates between hg19 and hg38.
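
Besides zooming in, the build can also be cross-checked from the BAM header, since the `@SQ` lines record each reference sequence's length. The sketch below assumes the header text has been obtained with `samtools view -H` and compares the chr1 length against the published chr1 lengths for hg19/GRCh37 (249,250,621) and hg38/GRCh38 (248,956,422). The function name `guess_build` is made up for illustration, and any other length is simply reported as unknown.

```python
# chr1 lengths for the two builds discussed in this thread.
KNOWN_CHR1_LENGTHS = {
    249250621: "hg19/GRCh37",
    248956422: "hg38/GRCh38",
}


def guess_build(header_lines):
    """Guess the genome build from the chr1 @SQ line of a SAM/BAM header."""
    for line in header_lines:
        if not line.startswith("@SQ"):
            continue
        # Each @SQ field after the first looks like TAG:VALUE, e.g. SN:chr1 or LN:249250621.
        tags = dict(
            field.split(":", 1)
            for field in line.rstrip("\n").split("\t")[1:]
            if ":" in field
        )
        if tags.get("SN") in ("chr1", "1") and tags.get("LN", "").isdigit():
            return KNOWN_CHR1_LENGTHS.get(int(tags["LN"]), "unknown")
    return "unknown"


header_demo = [
    "@HD\tVN:1.6\tSO:coordinate",
    "@SQ\tSN:chr1\tLN:249250621",
    "@SQ\tSN:chr2\tLN:243199373",
]
build = guess_build(header_demo)
print(build)
```

This only tells you which reference the BAM header claims; it does not prove the reads themselves were mapped to that build, so the nucleotide-level check in IGV is still worthwhile.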

raman91 01-10-2018 11:42 PM

Thanks for your reply. I am pretty sure that the reads are mapped to the hg19 build only. I also checked in the IGV browser, and the read sequences match the reference sequence.

