08-17-2014, 08:14 PM   #1
Junior Member
Location: Chicago

Join Date: Aug 2014
Posts: 1
How to align reads to other reads (not to reference genome)


I have 5 yeast genomes (Illumina MiSeq); I will call them A, B, C, D, and E.

I grew a colony of yeast over a period of a couple days under four different conditions in order to see how each particular condition would affect the genome of the yeast. These conditions should induce mutations in the DNA of the yeast.

Thus, A is the genome of the yeast I started with, and B-E (one for each condition) are the "altered" genomes of the yeast that I ended up with at the end of the experiment.

I have aligned A-E to the S288C S. cerevisiae reference genome using BWA and called SNPs through two methods (with the mpileup function in SAMtools; also, via GATK and VCFtools), but these methods haven't quite given me the results I wanted.

To be brief, when mapped to the reference genome, genomes B-E show fewer mutations than genome A does. I would like to align B-E to A to call SNPs/INDELs. I hope that this way I can get a better fit of B-E onto A, since B-E are more closely related to A than they are to the reference genome.

How do I go about mapping B-E onto A? Do I need to process A to serve as a reference genome, and, if so, how would I do that? I have A-E as .fastq files, as well as all the .SAM and .BAM files after aligning to the reference genome.

I will readily admit I am not particularly good at working with computers, but if you request any other information, please let me know.
Posted by username111
08-17-2014, 09:21 PM   #2
Brian Bushnell
Super Moderator
Location: Walnut Creek, CA

Join Date: Jan 2014
Posts: 2,707

That's kind of difficult. You could assemble dataset A into contigs, map the other datasets to it, and call variations, which is easy - but the coordinates of the variations would differ from your original reference so it might not be very informative. You'd have to perform some additional steps to determine which mutation goes with which gene, though it is doable.
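As a rough sketch of the assemble-then-map route described above: the Python below only builds the shell commands (it does not run them), so the order of steps is visible. It assumes paired-end FASTQ files, SPAdes as the assembler, and BWA, SAMtools and BCFtools for mapping and calling. The file names and the `plan_commands` helper are made up for illustration, and apart from BWA and SAMtools none of these tool choices come from the thread.

```python
import shlex


def plan_commands(parent_r1, parent_r2, samples, workdir="work"):
    """Return shell commands (as strings) to assemble A and map each sample to it.

    samples maps a sample name to its (R1, R2) FASTQ paths.
    Nothing is executed here; the commands are only assembled as text.
    """
    q = shlex.quote
    asm_dir = f"{workdir}/assembly_A"
    contigs = f"{asm_dir}/contigs.fasta"  # SPAdes writes its contigs to this file
    commands = [
        # 1. Assemble the starting strain (A) into contigs.
        f"spades.py -1 {q(parent_r1)} -2 {q(parent_r2)} -o {q(asm_dir)}",
        # 2. Index the contigs so they can be used as a mapping reference.
        f"bwa index {q(contigs)}",
    ]
    for name, (r1, r2) in samples.items():
        bam = f"{workdir}/{name}.vs_A.bam"
        vcf = f"{workdir}/{name}.vs_A.vcf"
        commands += [
            # 3. Map the derived sample to the contigs of A and sort the output.
            f"bwa mem {q(contigs)} {q(r1)} {q(r2)} | samtools sort -o {q(bam)} -",
            f"samtools index {q(bam)}",
            # 4. Call variants relative to A's contigs.
            f"bcftools mpileup -f {q(contigs)} {q(bam)} | bcftools call -mv -o {q(vcf)}",
        ]
    return commands


if __name__ == "__main__":
    example = {"B": ("B_R1.fastq", "B_R2.fastq")}
    for cmd in plan_commands("A_R1.fastq", "A_R2.fastq", example):
        print(cmd)
```

Variants called this way are reported in contig coordinates, not S288C coordinates, which is the drawback mentioned above.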

It depends on your goal, of course (could you clarify it?) but the best solution may be to modify the original genome by applying the called variations from A to it, then mapping everything else to the modified genome. Due to indels, the coordinates would still change (though perhaps only slightly in this case) so it would be more difficult to analyze, but that's probably the best way to determine the difference between A and the other samples while retaining a structure similar to the reference.
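To make the "modified genome" idea concrete, here is a minimal pure-Python sketch that applies a list of called variants to one reference sequence and records how much the coordinates shift after each indel. The `apply_variants` function and the toy sequence are hypothetical. It assumes one sequence at a time, 1-based VCF-style positions, and non-overlapping variants.

```python
def apply_variants(reference, variants):
    """Apply non-overlapping variants to a reference sequence.

    variants: iterable of (pos, ref, alt) with 1-based positions, as in VCF.
    Returns (new_sequence, shifts), where shifts lists (pos, cumulative_offset)
    after every variant that changes the sequence length.
    """
    pieces = []
    shifts = []
    cursor = 0  # 0-based index of the next reference base not yet copied
    offset = 0  # running difference between new and old coordinates
    for pos, ref, alt in sorted(variants):
        start = pos - 1
        if start < cursor:
            raise ValueError(f"overlapping variant at position {pos}")
        if reference[start:start + len(ref)] != ref:
            raise ValueError(f"REF allele mismatch at position {pos}")
        pieces.append(reference[cursor:start])  # unchanged stretch before the variant
        pieces.append(alt)                      # substituted allele
        cursor = start + len(ref)
        if len(alt) != len(ref):
            offset += len(alt) - len(ref)
            shifts.append((pos, offset))
    pieces.append(reference[cursor:])  # unchanged tail
    return "".join(pieces), shifts


if __name__ == "__main__":
    toy_reference = "ACGTACGTAC"
    toy_variants = [(2, "C", "T"), (4, "TA", "T"), (8, "T", "TGG")]
    new_seq, shifts = apply_variants(toy_reference, toy_variants)
    print(new_seq)
    print(shifts)
```

The `shifts` list is what you would need to translate positions on the modified genome back to reference coordinates. For real data, existing tools such as `bcftools consensus` do this job on whole FASTA/VCF files; the sketch is only meant to show why indels move the coordinates.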

If you want to do an analysis with respect to the reference coordinates (which is the most straightforward method), it's best to simply map everything to the reference and compare the variations, as you are already doing. How exactly is it not giving the results you expect?
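For the map-everything-to-the-reference approach, comparing the variations comes down to set operations on the calls. The sketch below is a hypothetical helper, not part of any of the tools mentioned; it assumes plain-text, tab-separated VCF content and treats a variant as the combination of chromosome, position, REF and ALT.

```python
def variant_keys(vcf_text):
    """Collect (chrom, pos, ref, alt) keys from uncompressed VCF text."""
    keys = set()
    for line in vcf_text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and header lines
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        for allele in alt.split(","):  # multi-allelic sites list ALTs comma-separated
            keys.add((chrom, int(pos), ref, allele))
    return keys


def compare_to_parent(parent_vcf, derived_vcf):
    """Split calls into those gained, lost, or shared relative to the parent sample."""
    parent = variant_keys(parent_vcf)
    derived = variant_keys(derived_vcf)
    return {
        "gained": sorted(derived - parent),  # candidate new mutations
        "lost": sorted(parent - derived),    # called in the parent only
        "shared": sorted(parent & derived),  # inherited differences from the reference
    }


if __name__ == "__main__":
    header = "#CHROM\tPOS\tID\tREF\tALT\n"
    vcf_a = header + "chrI\t100\t.\tA\tG\nchrI\t250\t.\tC\tT\n"
    vcf_b = header + "chrI\t100\t.\tA\tG\nchrII\t42\t.\tG\tGA\n"
    for label, calls in compare_to_parent(vcf_a, vcf_b).items():
        print(label, calls)
```

Variants in the "lost" group are not necessarily reversions. They can also be sites where the derived sample simply had too little coverage for a call, which is one possible reason for B-E showing fewer variants than A.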
08-18-2014, 07:49 AM   #3
Just a member
Location: Southern EU

Join Date: Nov 2012
Posts: 103

If you are looking for reference-free SNP calling -forgive me if I've read too quickly- you might try KisSnp and/or take a look at this review.

Last edited by syfo; 08-18-2014 at 07:50 AM. Reason: link fix