**Wallysb01** · 07-25-2012, 03:46 PM

I'm wondering if this is the product of alternative splicing in your annotation/sequencing set. For example, maybe an alternative 3' terminal exon could lead to something being called as upstream? Or a skipped exon could lead to an intronic call?

I'd overlay your SNP calls with the annotation in something like IGV and see if you can visualize what might be the reason.

**shyam_la** · 07-25-2012, 03:53 PM

Thanks for responding.
That doesn't make sense to me really..
Everything is already aligned to the reference genome. Base 12345678 is going to be intronic or exonic, and hence mutation at base 12345678 is going to be either intronic or exonic respectively, irrespective of how different isoforms are spliced. Isn't it??
Alternative splicing will affect which exons are there in the protein, but can't affect where exactly a particular aligned base position falls in the genome structure, right?
Could I be doing something wrong post-mutation calling that is leading to this effect?

**Wallysb01** · 07-25-2012, 04:33 PM

You're right that whether a mutation at a specific base is called exonic or intronic should not depend on the isoform, assuming everything is working as you think it is and being treated consistently. However, what I'm wondering is whether the annotation files and the sequencing all had the same isoforms annotated, or even whether the programs are handling these annotation files equivalently. That's why I'd say just go look at it in IGV. If you can visually see nothing but exon/splicing SNPs, you'll know it's a problem with how these programs are calling SNPs relative to the annotation files. If instead you see SNPs outside the exon/splicing regions, then you know something is wrong with your initial screen.

**shyam_la** · 07-25-2012, 04:41 PM

Sounds like an idea! Will update asap..
Thanks.
One question: Can IGV visualise just a list of chr and base positions? I have one column with chromosomes and one column with base positions on the chromosome (I have 2 more columns with reference allele and observed allele, but those are irrelevant for the purposes of our discussion)..

**Wallysb01** · 07-25-2012, 04:49 PM

I'm not sure if IGV could load what you want. Here's a list of the supported file formats: http://www.broadinstitute.org/software/igv/FileFormats

If you can convert your data to VCF, that would work? You might have to do some file format manipulation to get it working. Or maybe find a more flexible viewer.
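For what it's worth, here is a minimal sketch of that conversion in Python. It assumes a whitespace-separated table with columns chromosome, 1-based position, reference allele, observed allele, and that every row is a single-base substitution (indels written with "-" would need extra handling to be valid VCF). The function names and the example rows are made up for illustration.

```python
import io


def read_table(lines):
    """Yield (chrom, pos, ref, alt) from whitespace-separated text lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        chrom, pos, ref, alt = line.split()[:4]
        yield chrom, int(pos), ref, alt


def table_to_vcf(rows, out):
    """Write rows as a bare-bones VCF (no genotype columns) to a file-like object."""
    out.write("##fileformat=VCFv4.1\n")
    out.write("#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n")
    # Sort by chromosome name, then position, so viewers can index the file.
    for chrom, pos, ref, alt in sorted(rows, key=lambda r: (r[0], r[1])):
        out.write(f"{chrom}\t{pos}\t.\t{ref}\t{alt}\t.\t.\t.\n")


if __name__ == "__main__":
    example = ["chr1 200 A G", "chr1 100 C T"]  # hypothetical rows
    buf = io.StringIO()
    table_to_vcf(read_table(example), buf)
    print(buf.getvalue(), end="")
```

The chromosome names have to match the genome loaded in the viewer, otherwise nothing will show up.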

**shyam_la** · 07-25-2012, 04:59 PM

Well, I just picked 10 random spots from the list and tested them individually.. Half the time the annotation is correct (comparing to IGV), half the time it's not..

I can't draw any conclusions yet..

What I am thinking is the UCSC refgene bed file, the refgene set of annovar and the refgene set used by IGV are all different. Is that possible?

**Wallysb01** · 07-25-2012, 05:03 PM

It is certainly possible, especially when it comes to chromosome naming schemes. You should try to standardize on one set. Which can sound easy, but often isn't.
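As a small illustration of the naming part only, here is a sketch that maps bare names like "1" or "MT" onto UCSC-style "chr1" / "chrM". The function name is hypothetical, and it assumes the only differences are the "chr" prefix and the mitochondrial label; unplaced contigs and other builds would need their own mapping.

```python
def to_ucsc_name(chrom):
    """Return a UCSC-style chromosome name ('1' -> 'chr1', 'MT' -> 'chrM')."""
    chrom = chrom.strip()
    if chrom.startswith("chr"):
        return chrom  # already in UCSC style
    if chrom in ("MT", "M"):
        return "chrM"  # mitochondrial genome is 'chrM' at UCSC
    return "chr" + chrom


if __name__ == "__main__":
    for name in ["1", "X", "MT", "chr7"]:
        print(name, "->", to_ucsc_name(name))
```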

**shyam_la** · 07-25-2012, 05:17 PM

I tested a few more loci. Annovar and IGV compare well on intron vs exon, but not so well when it's UTR5/UTR3 vs exon.. Some UTR annotated sites fall within IGV exons..

I don't think standardization of the kind you are talking about is even possible. Only way to do it is if I somehow get a bed file that is exactly the same as the annovar annotation set or conversely get annovar to somehow make use of the bed file for its annotation set.. Any experience doing that?

**Wallysb01** · 07-25-2012, 05:24 PM

Hmm, sounds like it's just the additional layer of information that is causing the confusion (exons can be coding or UTR, but IGV stops at the exon level). Without knowing more about the file formats you're using it's hard for me to say what is best. Are the annotation files just gtf/gff3s that need converting to bed? If so, that's pretty straightforward using a number of tools you could google (genome annotation programs often have these converters as part of their source code).

If you can give me more information and maybe the first 10 lines of the files you're using I could try to make some sense of it.
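To make the exon vs UTR point concrete, here is a rough sketch of how one position could be classified against one transcript. It assumes UCSC-style coordinates (0-based, half-open intervals) for the transcript and a 1-based SNP position; the dictionary layout and the example transcript are invented for illustration and are not how annovar or IGV do it internally.

```python
def classify(pos, tx):
    """Classify a 1-based position against one transcript.

    tx holds txStart, txEnd, cdsStart, cdsEnd and a list of (start, end)
    exons, all 0-based half-open as in UCSC tables.
    """
    p = pos - 1  # convert the 1-based position to 0-based
    if not (tx["txStart"] <= p < tx["txEnd"]):
        return "outside"
    if not any(start <= p < end for start, end in tx["exons"]):
        return "intronic"
    if tx["cdsStart"] == tx["cdsEnd"]:
        return "noncoding_exon"  # transcript has no coding region
    if tx["cdsStart"] <= p < tx["cdsEnd"]:
        return "coding"
    return "UTR"  # inside an exon but outside the coding region


if __name__ == "__main__":
    # Hypothetical transcript: three exons, coding region 150..900
    tx = {
        "txStart": 100, "txEnd": 1000,
        "cdsStart": 150, "cdsEnd": 900,
        "exons": [(100, 200), (400, 500), (800, 1000)],
    }
    for pos in (120, 160, 300, 950):
        print(pos, classify(pos, tx))
```

Positions 120 and 950 both sit inside exons in this example, so a viewer that only draws exon boxes would show them as "exon" even though they come out as UTR here.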

**shyam_la** · 07-25-2012, 05:37 PM

They are actually in standard UCSC refgene txt format. If you go to the table browser and export all fields in the selected table as plain text, that's the kind of file. But the file that annovar uses is different (that's the working explanation now) from the file on the UCSC browser currently..
I think I have figured a way out already. Involves getting the file from the browser right now, replacing the original annovar file and using "retrieve_seq_from_fasta.pl" that comes with annovar. Will update if that solves issue..
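In case it helps with the comparison, here is a sketch of turning refGene rows into per-exon BED lines that could be loaded next to the SNPs. It assumes the usual tab-separated refGene column order (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...) with the leading bin column present; the function name and the example row are made up. Running it on both the annovar copy and the fresh UCSC download and diffing the output would show whether the two sets really differ.

```python
def refgene_row_to_bed(line):
    """Convert one refGene row into a list of BED6 lines, one per exon."""
    f = line.rstrip("\n").split("\t")
    name, chrom, strand = f[1], f[2], f[3]
    # exonStarts / exonEnds are comma-separated with a trailing comma
    starts = [int(x) for x in f[9].split(",") if x]
    ends = [int(x) for x in f[10].split(",") if x]
    # refGene and BED are both 0-based half-open, so no coordinate shift
    return [f"{chrom}\t{s}\t{e}\t{name}\t0\t{strand}" for s, e in zip(starts, ends)]


def refgene_to_bed(lines):
    """Yield BED lines for every non-header row."""
    for line in lines:
        if line.strip() and not line.startswith("#"):
            yield from refgene_row_to_bed(line)


if __name__ == "__main__":
    # Hypothetical row, not a real transcript
    row = ("0\tNM_TEST\tchr1\t+\t100\t1000\t150\t900\t3\t"
           "100,400,800,\t200,500,1000,\t0\tGENE\tcmpl\tcmpl\t0,1,2,")
    for bed_line in refgene_to_bed(["#bin\tname\tchrom", row]):
        print(bed_line)
```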

**shyam_la** · 07-25-2012, 06:37 PM

Script doesn't work really. Still stuck with no solution in sight..

**thedavid** · 07-26-2012, 01:12 PM

Is it a strand issue? I've made that mistake before......

**shyam_la** · 07-26-2012, 01:16 PM

Can you please explain?

**thedavid** · 07-26-2012, 01:25 PM

Sorry, first let me start off by saying it's possible I've totally mis-understood your issue.
Second, if you fail to deal with the strand of your features (i.e. is the gene on the positive or negative strand; or, in other words, Watson or Crick) you can screw up how you map the coordinates back to your data.
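A tiny sketch of what I mean, for the UTR5/UTR3 case: the same genomic position flips between 5' and 3' UTR depending on the strand of the gene. This assumes the position is already known to lie in an exon of the transcript, that the coding region is given as a 0-based half-open interval, and that the SNP position is 1-based; the function name is hypothetical.

```python
def utr_label(pos, cds_start, cds_end, strand):
    """Return 'UTR5', 'UTR3', or None if the position is inside the CDS span."""
    p = pos - 1  # 1-based position to 0-based
    if cds_start <= p < cds_end:
        return None  # within the coding span, not UTR
    left_of_cds = p < cds_start  # lower genomic coordinate than the CDS
    if strand == "+":
        return "UTR5" if left_of_cds else "UTR3"
    if strand == "-":
        # On the minus strand transcription runs right to left,
        # so the lower-coordinate side is the 3' end.
        return "UTR3" if left_of_cds else "UTR5"
    raise ValueError(f"unexpected strand: {strand!r}")


if __name__ == "__main__":
    for strand in "+-":
        print(strand, utr_label(120, 150, 900, strand), utr_label(950, 150, 900, strand))
```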

Question about exon limits and annotation..
