thickrick99 12-18-2015 03:26 PM

Is SAMtools the right software for this project?
Hi Everyone,

Here is a brief summary of what I am trying to do with my project: essentially, I want to find how a mutation in the mRNA sequence affects the amino acid sequence of a protein. I have whole exome sequencing data as a .sam file, and I am interested in finding the flanking sequence X nucleotides upstream and X nucleotides downstream of a specific mutation site. From there, I want to determine the amino acid sequence of that flanking sequence, but it has to be in the correct reading frame relative to the original sequence.

Here are a few questions that I had in terms of using SAMtools and accomplishing these tasks:

1) I assume I need to find the consensus sequence for the reads in my whole exome sequencing data. How would I do this with SAMtools? I found the mpileup command, but what would the reference FASTA file be in my case? Is finding the consensus even needed?

2) My main issue is going from the .sam file reads to being able to pinpoint the location of interest and get the flanking sequence. What do I need to do to process the .sam exome sequencing file to be able to determine the flanking sequence?

3) Once I find the flanking sequence, how do I figure out the amino acid sequence and adjust it to make sure it is in frame?

4) How do I account for the multiple transcripts that may exist for a particular gene because of alternative splicing?

Sorry for all the questions, it is my first time working in this area. I appreciate any help! Thanks in advance!
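
The flanking and reading-frame steps in questions 2 and 3 can be sketched in plain Python. This is only an illustration under stated assumptions, not what SAMtools does: it assumes you already have the coding sequence (CDS) of one transcript as a string and the 0-based position of the mutation within that CDS. The names `translate` and `in_frame_flank` and the example sequence are made up for this sketch.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate a nucleotide string whose first base is codon position 1.

    Trailing bases that do not fill a codon are ignored; codons with
    unexpected characters (e.g. N) become 'X'.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def in_frame_flank(cds, pos, x):
    """Return (start, end) of a window around 0-based CDS position `pos`.

    The window covers x nucleotides on each side where available and is
    then widened outward to codon boundaries so it stays in frame.
    """
    start = max(0, pos - x)
    start -= start % 3                 # move left to the start of a codon
    end = min(len(cds), pos + x + 1)
    end += (-end) % 3                  # move right to the end of a codon
    end = min(end, len(cds))
    return start, end


# Hypothetical CDS for illustration only: M A E T W *
cds = "ATGGCCGAAACCTGGTAA"
start, end = in_frame_flank(cds, pos=7, x=2)
flank = cds[start:end]
print(start, end, flank, translate(flank))
```

Because the window is snapped to multiples of three relative to the CDS start, the translated flank lines up with the original protein. For a gene with several transcripts (question 4), the same sketch would have to be run once per transcript CDS, since the position within the CDS differs between isoforms.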

colindaven 12-28-2015 12:03 AM

There are tools for this - snpEff and Annovar are popular.

snpEff is quite simple too. Your SNP will need to be in an _annotated_ mRNA sequence, of course.

Input is VCF format.

thickrick99 12-28-2015 11:18 AM

Thanks for the response! Both of these tools seem helpful; however, do they output the sequence of the flanking region/altered amino acid sequence as well? I quickly looked through snpEff and Annovar, and it seems the tools only tell you what the impact of a SNP is or what the amino acid change is. I'm interested not only in determining what the amino acid change is, but also in using the mutated amino acid sequence after the SNP for further analysis.

Do you know if these tools/other tools are able to accomplish this?
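
To make concrete what "the mutated amino acid sequence after the SNP" would look like, here is a minimal Python sketch. It is not the output or API of snpEff or Annovar; it assumes you already have the transcript CDS as a string, plus the 0-based CDS position, reference base, and alternate base of the SNP (for example, worked out from an annotated VCF record). The function names and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(seq):
    """Translate an in-frame nucleotide string; incomplete codons are dropped."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def apply_snp(cds, pos, ref, alt):
    """Return the CDS with the base at 0-based `pos` changed from ref to alt."""
    if cds[pos].upper() != ref.upper():
        raise ValueError(f"expected {ref} at CDS position {pos}, found {cds[pos]}")
    return cds[:pos] + alt + cds[pos + 1:]


def mutated_protein(cds, pos, ref, alt):
    """Return (reference protein, mutant protein, 0-based index of the affected residue)."""
    mutant_cds = apply_snp(cds, pos, ref, alt)
    return translate(cds), translate(mutant_cds), pos // 3


# Hypothetical CDS for illustration only: M A E T W *
cds = "ATGGCCGAAACCTGGTAA"
ref_prot, mut_prot, residue = mutated_protein(cds, pos=7, ref="A", alt="T")
print(ref_prot, mut_prot, residue)
print(mut_prot[residue:])  # mutant residues from the changed position onward
```

Checking the reference base before substituting guards against an off-by-one error or a strand mix-up, which is an easy mistake when converting genomic coordinates to CDS coordinates. The sketch handles single-base substitutions only; insertions and deletions shift the frame and would need separate handling.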
