SEQanswers

05-26-2014, 12:58 PM   #1
bongbimit
Member
Location: Ha Noi
Join Date: Dec 2013
Posts: 29

snpeff error?

Dear all, I am using SnpEff to annotate my VCF file, but I see some errors.

I use the hg19 database provided with SnpEff (I didn't build it myself) to annotate my VCF file.

Here is the output:

java -Xmx10g -jar snpEff.jar eff -c snpEff.config -v hg19 accepted_hits-snp-dbsnp.vcf > accepted_hits-snp-dbsnp-eff.vcf
00:00:00.000 Reading configuration file '../../../../../tools/snpeff_snpsift/snpEff/snpEff.config'. Genome: 'hg19'
00:00:00.302 done
00:00:00.302 Reading database for genome version 'hg19' from file '/home/huypham/tools/snpeff_snpsift/snpEff/./data/hg19/snpEffectPredictor.bin' (this might take a while)
00:00:07.050 done
00:00:07.081 Building interval forest
00:00:15.896 done.
00:00:15.896 Genome stats :
# Genome name : 'Homo_sapiens (USCS)'
# Genome version : 'hg19'
# Has protein coding info : true
# Genes : 26346
# Protein coding genes : 20775
# Transcripts : 47313
# Avg. transcripts per gene : 1.80
# Protein coding transcripts : 38612
# Length errors : 210 ( 0.54% )
# STOP codons in CDS errors : 193 ( 0.50% )
# START codon errors : 14 ( 0.04% )
# STOP codon warnings : 0 ( 0.00% )
# Total Errors : 239 ( 0.62% )

# Cds : 388149
# Exons : 459403
# Exons with sequence : 459403
# Exons without sequence : 0
# Avg. exons per transcript : 9.71
# Number of chromosomes : 93
# Chromosomes names [sizes] :
...

NEW VERSION!
There is a new SnpEff version available:
Version : 3.6
Release date : 2014-04-21
Download URL : http://sourceforge.net/projects/snpe...atest_core.zip


How should I interpret the errors reported in the genome stats (the lines I put in bold)?
Thank you.
08-06-2014, 03:10 AM   #2
ebioman
Member
Location: Switzerland
Join Date: Aug 2013
Posts: 41

Hello,
These errors come from the database itself, which (I guess) in your case is the one provided by SnpEff.
If I recall correctly, SnpEff updates its databases from time to time based on ENSEMBL, so it all depends on how well the annotation there is curated.
These databases will never be flawless, since many predicted protein-coding genes still derive from automated annotation tools such as Augustus or SNAP.
Since SnpEff also stores the protein length information, it can detect differences between the length predicted from the CDS and the final protein.
So you have a few gene models in the database that, for example, have a stop codon within the CDS, or transcripts that are incomplete (START codon errors and maybe also Length errors). Augustus, for instance, can also predict incomplete proteins (for example close to gaps or at the start/end of scaffolds) and may generate a protein without a start codon.
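
To make the three kinds of problems concrete, here is a minimal Python sketch of this sort of sanity check on a coding sequence. It is only an illustration, not SnpEff's actual code: the function name `check_cds`, the labels it returns, and the assumption that every CDS starts with ATG and uses the standard stop codons are all mine.

```python
# Illustrative only: a simplified version of the checks described above.
# Assumes the standard genetic code, ATG as the only start codon, and a
# CDS given 5'->3' on the coding strand (all assumptions, not SnpEff's rules).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds):
    """Return a list of problem labels for one coding sequence."""
    cds = cds.upper()
    problems = []

    # Length check: a complete CDS should be a whole number of codons.
    if len(cds) % 3 != 0:
        problems.append("length")

    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]

    # Start check: the first codon should be a start codon.
    if not codons or codons[0] != "ATG":
        problems.append("start")

    # Internal stop check: no stop codon before the last codon.
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("stop_in_cds")

    return problems


if __name__ == "__main__":
    print(check_cds("ATGAAATAA"))     # clean CDS
    print(check_cds("ATGTAAAAATAA"))  # premature stop codon
    print(check_cds("CTGAAATA"))      # incomplete: no ATG, length not a multiple of 3
```

Note that one transcript can fail several checks at once, which would fit the numbers in the output above: 210 + 193 + 14 is more than the 239 total, and the percentages match the counts divided by the 38612 protein coding transcripts (for example 210 / 38612 = 0.54%).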

Cheers
All times are GMT -8.