SEQanswers

07-16-2014, 12:59 PM   #1
ronton
Member
 
Location: US

Join Date: Jun 2014
Posts: 34
How to check for mutations common across all samples?

I have .bam, .vcf, and ANNOVAR .csv files for several samples.

Is there a straightforward way to sort or view mutations that are the most common or present in several samples?

I was thinking of trying to sort the variant location columns in ascending order and line them up for all the samples, but then how would I create a list of the most common variants?

Any ideas are appreciated, thank you.

07-16-2014, 01:10 PM   #2
blakeoft
Member
 
Location: Connecticut

Join Date: Oct 2013
Posts: 79

I had to do this once. It's not a perfect solution, but check out this thread: Get common lines from multiple files. Specifically, look at what the user Radoulov posts. Unfortunately, the solution is just provided and not explained so it might be hard to tweak it.

You'll probably have to pull just a few columns out of your VCF, though. If you just want the chrom, pos and alt (skipping the header lines that start with #), you could do:
Code:
awk '!/^#/ {print $1,$2,$5}' file.vcf
I'm sure there are some other things that you would have to do with the output if it's not exactly what you want, but I think it can be done with a few unix commands.
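
To turn that into a list of the most common variants, one rough sketch is to reduce each VCF to a variant key, keep each key once per sample, and then count how often each key shows up across all the files. This assumes one uncompressed, single-sample VCF per sample. It keys on chrom, pos, ref and alt (ref is my addition, so that indels with different reference alleles at the same position stay distinct), and count_shared_variants is just a made-up helper name.

```shell
# Sketch only: count how many samples carry each variant.
# Assumes one uncompressed, single-sample VCF per sample.
# count_shared_variants is a hypothetical helper name.
count_shared_variants() {
    for vcf in "$@"; do
        # Skip header lines and key on CHROM, POS, REF, ALT.
        # sort -u makes each sample contribute a given variant at most once.
        awk -F'\t' '!/^#/ {print $1 "\t" $2 "\t" $4 "\t" $5}' "$vcf" | sort -u
    done |
        # Identical keys from different samples become adjacent, get counted,
        # and are listed with the most widely shared variants first.
        sort | uniq -c | sort -k1,1nr |
        # Tidy the uniq -c output into tab-separated columns.
        awk '{print $1 "\t" $2 "\t" $3 "\t" $4 "\t" $5}'
}
```

You would call it as count_shared_variants sample1.vcf sample2.vcf sample3.vcf (placeholder file names). The first output column is the number of samples that have the variant. Multi-allelic sites are compared as one whole ALT string, so they would need to be split first if that matters to you.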

Edit: One more thing. If you wanted to do several pairwise comparisons (which you probably don't), look into using unix's comm.
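
For a single pair, a minimal sketch with comm could look like the following. comm only works on sorted input, so both key lists are sorted the same way first. variants_in_both is a hypothetical helper name, and the keys are the same chrom, pos and alt columns as in the awk line above.

```shell
# Sketch only: print the CHROM POS ALT keys present in both VCFs.
# variants_in_both is a hypothetical helper name.
variants_in_both() {
    a=$(mktemp) && b=$(mktemp) || return 1
    # comm requires both inputs to be sorted with the same collation.
    awk '!/^#/ {print $1, $2, $5}' "$1" | LC_ALL=C sort -u > "$a"
    awk '!/^#/ {print $1, $2, $5}' "$2" | LC_ALL=C sort -u > "$b"
    # -12 hides lines unique to either file, leaving only the shared ones.
    LC_ALL=C comm -12 "$a" "$b"
    rm -f "$a" "$b"
}
```

Swapping -12 for -23 or -13 would instead list the keys found only in the first or only in the second file.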

Last edited by blakeoft; 07-16-2014 at 01:14 PM.

07-16-2014, 04:23 PM   #3
ronton
Member
 
Location: US

Join Date: Jun 2014
Posts: 34

Thank you for the help, I was able to do something with what you suggested.

It looks like many of the samples have tons of the same mutations so I'm not sure how fruitful this little test was. Perhaps I can refine it.

07-17-2014, 04:49 AM   #4
blakeoft
Member
 
Location: Connecticut

Join Date: Oct 2013
Posts: 79

You might also filter out the variants that are already in dbSNP, or at least analyze them separately from the novel ones.
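
If the ID column of your VCF is already annotated with rs numbers (an assumption, since that depends on how the variants were called or annotated), a minimal sketch is to split each file on that column. split_by_dbsnp and the output file names are hypothetical.

```shell
# Sketch only: split a VCF into variants with and without an rs ID.
# Assumes the ID column (column 3) already carries dbSNP rs numbers.
# split_by_dbsnp is a hypothetical helper; it writes <prefix>.known.vcf
# and <prefix>.novel.vcf, copying the header lines into both.
split_by_dbsnp() {
    awk -F'\t' -v known="$2.known.vcf" -v novel="$2.novel.vcf" '
        /^#/                 { print > known; print > novel; next }
        $3 ~ /(^|;)rs[0-9]+/ { print > known; next }
                             { print > novel }
    ' "$1"
}
```

For example, split_by_dbsnp sample1.vcf sample1 would write sample1.known.vcf and sample1.novel.vcf, and the novel files could then go through the same cross-sample comparison.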

07-17-2014, 10:55 AM   #5
swbarnes2
Senior Member
 
Location: San Diego

Join Date: May 2008
Posts: 912

Programs like BEDTools can give you the intersection of multiple .vcf files.

All times are GMT -8.