
**ulz_peter** · 06-16-2011, 03:49 AM

Quite a lot of questions at once. If you have a Linux computer within reach, try to pick up some basic command-line knowledge (Google is your friend).

If you've got your SNPs in any one of these formats (MAQ, GFF, VCF or CASAVA), you could use the SeattleSeq SNP annotation tool (http://gvs.gs.washington.edu/SeattleSeqAnnotation/). It gives you annotations like dbSNP, conservation, HapMap frequencies, genes and a lot more. The result is stored as a CSV file (as far as I remember), and you can download it and filter the SNPs using Excel.

Since the gene names are stored as well, you could search the literature for muscle-associated genes and screen the result for those genes...

I think that's the easiest way of doing it without looking too deeply into Linux (which I would actually recommend to anyone dealing with NGS data, as long as they have the time for it).
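For anyone who would rather script the gene screening than do it by hand in Excel, here is a minimal Python sketch. It assumes the annotated CSV has one column holding the gene name(s) for each variant; the column names `position`, `gene` and `function`, and the gene names `GENE_A`, `GENE_B`, `GENE_C`, are made up for illustration and are not the real SeattleSeq headers, so check your own file first.

```python
import csv
import io


def filter_by_genes(csv_text, genes_of_interest, gene_column="gene"):
    """Return the rows whose gene column names any gene of interest.

    gene_column is a hypothetical header name; replace it with the
    header actually used in your annotation file.
    """
    wanted = {g.upper() for g in genes_of_interest}
    kept = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # A variant may overlap several genes; assume they are comma-separated.
        names = {n.strip().upper() for n in row[gene_column].split(",")}
        if names & wanted:
            kept.append(row)
    return kept


# Tiny made-up example; real data would be read from the downloaded file.
example_csv = """position,gene,function
100,GENE_A,missense
200,GENE_B,intron
300,"GENE_C,GENE_B",synonymous
"""

hits = filter_by_genes(example_csv, ["GENE_B"])
hit_positions = [row["position"] for row in hits]
print(hit_positions)
```

The list of genes of interest would come from your own literature search; the sketch only does the screening step.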

**Maone** · 06-16-2011, 02:26 PM

Thank you very much for your advice.

Actually, I already have a CSV file exported from DNAnexus.com using their nucleotide-level variation, with the settings Genome: hg18 and Gene annotations: RefSeq Genes.

Following your advice, I opened the CSV file in Excel and went through the data. Now I have some new questions about interpreting it:
1. In the "where_in_transcript" column, I have CDS, non-coding exon, introns, upstream and downstream, and UTRs. If I am only looking for exonic mutations, should I look solely at CDS?
2. For some variants, I get duplicate rows with the same Var_index that differ only in "transcript_name".
e.g. NM_002026, NM_054034, NM_212474, NM_212475, NM_212476 and NM_212478 are all FN1 transcript variants.
Is it standard practice to count them as one variant in the gene?
3. Do the columns "var_seq1" and "var_seq2" indicate homozygous or heterozygous variants? I found that if they are the same, the zygosity of the variant is homozygous; otherwise it is heterozygous.
Please bear with my basic questions; I am only just starting to learn.

Thanks again
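Questions 2 and 3 can be sketched in a few lines of Python. This is only an illustration of the reading described above, not something confirmed by the DNAnexus documentation: it treats rows sharing a Var_index as one variant, and calls a variant homozygous when var_seq1 equals var_seq2. The `gene` column, the `GENE_X` name and the `NM_example` accession are made up for the example.

```python
import csv
import io


def collapse_variants(csv_text):
    """Group rows by Var_index so each variant is counted once.

    Assumes the columns Var_index, transcript_name, var_seq1 and var_seq2
    exist as named in the post; "gene" is a hypothetical column name.
    """
    variants = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = row["Var_index"]
        if key not in variants:
            same_alleles = row["var_seq1"] == row["var_seq2"]
            variants[key] = {
                "gene": row["gene"],
                "transcripts": [],
                # Assumed interpretation: identical alleles mean homozygous.
                "zygosity": "homozygous" if same_alleles else "heterozygous",
            }
        variants[key]["transcripts"].append(row["transcript_name"])
    return variants


# Made-up rows: variant 1 is reported once per FN1 transcript.
example_csv = """Var_index,gene,transcript_name,where_in_transcript,var_seq1,var_seq2
1,FN1,NM_002026,CDS,A,A
1,FN1,NM_054034,CDS,A,A
2,GENE_X,NM_example,CDS,C,T
"""

collapsed = collapse_variants(example_csv)
for var_index, info in collapsed.items():
    print(var_index, info["gene"], info["zygosity"], len(info["transcripts"]))
```

Keeping the transcript list rather than dropping it means the per-isoform information is still there if the variant falls in CDS for one transcript and in a UTR for another.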

**ulz_peter** · 06-17-2011, 02:52 AM

I have actually never worked with data from DNAnexus, so I can't really help you with that. Didn't it come with a manual? That should explain everything.

Be sure not to discard intronic SNPs too quickly; they could contain a splice-site mutation.
I guess the duplicates in the file are just the same SNP, at the same genomic location, reported once for each isoform of the same gene.
No idea about the var_seq1 and var_seq2 columns...
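The point about intronic SNPs can be sketched as a filter that keeps intronic rows alongside CDS rows instead of CDS only. This is a rough sketch: the exact labels used in the where_in_transcript column are an assumption here (the code matches on the prefixes "cds" and "intron", ignoring case), and it does not check how close an intronic SNP is to an exon boundary, which is what would actually matter for splicing.

```python
def keep_cds_and_introns(rows, column="where_in_transcript",
                         keep=("cds", "intron")):
    """Keep rows whose region label starts with one of the `keep` prefixes.

    The prefixes are assumptions about how regions are labelled;
    adjust them to the labels found in your own file.
    """
    return [row for row in rows
            if row[column].strip().lower().startswith(keep)]


# Made-up rows standing in for lines of the exported CSV.
example_rows = [
    {"Var_index": "1", "where_in_transcript": "CDS"},
    {"Var_index": "2", "where_in_transcript": "introns"},
    {"Var_index": "3", "where_in_transcript": "upstream"},
    {"Var_index": "4", "where_in_transcript": "UTR"},
]

kept_rows = keep_cds_and_introns(example_rows)
kept_indices = [row["Var_index"] for row in kept_rows]
print(kept_indices)
```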

**Maone** · 06-17-2011, 07:32 AM

Thanks, ulz_peter. I did read their manual and found no clue on this. I will be more careful with intronic SNPs.


Qs in exome sequencing data analysis
