SEQanswers


thickrick99 01-09-2016 08:02 AM

Amino Acid Sequence from Exome Data?
 
Hi Everyone,

I am working on a project that requires the amino acid sequence of a gene from exome sequencing data. My first approach was to use the samtools mpileup command to get a consensus sequence from the exome sequencing data (BAM file), followed by bcftools. Here are the commands that I used:

Code:

samtools mpileup -g -f [reference.fa] -r 11:5225466-5227071 [sorted .bam file] > [intermediate.bcf]

bcftools view [intermediate.bcf] > output.txt

However, I checked the sequence that I got from this consensus and it doesn't match any of the sequence from the input region that I used in mpileup. Moreover, I found that the sequence has an immediate stop codon after four amino acids, which is not correct. This is the HBB gene if that helps.

Also, I used the HG00096.mapped.illumina.mosaik.GBR.exome.20110411.bam for my exome sequence and the 1000 genomes project reference file for the fasta reference input.

Any suggestions on how I can extract the amino acid sequence of a gene from the exome sequence data?

Thanks in advance!

GenoMax 01-09-2016 03:08 PM

The part where you say "it doesn't match any of the sequence from the input region that I used in mpileup" is worrisome. Are you sure about that?

If this was stranded data, you may want to check all three forward frames (or all six if it was not stranded).
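
A minimal sketch of that frame check, in plain Python with no external libraries. It assumes you already have the consensus for the region as a plain nucleotide string; the function names (reverse_complement, translate, frame_translations) and the toy sequence are made up for illustration and are not part of samtools or bcftools. It uses the standard genetic code and writes "X" for any codon containing an ambiguous base.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")  # "X" for codons with N or other symbols
        for i in range(0, len(seq) - 2, 3)
    )


def frame_translations(seq, stranded=False):
    """Translate the three forward frames, plus the three reverse frames
    when the strand is unknown (stranded=False)."""
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
    if not stranded:
        rc = reverse_complement(seq)
        for offset in range(3):
            frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # Toy input (hypothetical): a short coding sequence given on the opposite strand.
    toy = reverse_complement("ATGGTGCATCTGACTCCTGAGGAGAAG")
    for name, protein in frame_translations(toy).items():
        print(name, protein)
```

A frame that reads through without early stop codons is the likely candidate. Keep in mind that a genomic interval that includes introns will not translate cleanly in any frame, so the exons may need to be joined before translating.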

