SEQanswers

04-28-2011, 04:04 AM   #1
oliver_ngs
Junior Member
 
Location: Cambridge, UK

Join Date: Apr 2011
Posts: 4
Aligning to specific chromosomal regions using BWA.

Hi,

I have done some Illumina next generation sequencing after SureSelect target enrichment, and I want to align my reads to a specific region of the genome using BWA.

My fasta file consists of the specific 3 Mb targeted region. If the name in the fasta file is, for example, >chr7:15000000-18000000, BWA doesn't seem to recognize this as the target interval and assumes the first base is at position one. The alignments are then in the wrong position when I try to import them into IGV, and the gene annotations are all incorrect.

Is there a way of getting BWA to recognize the target interval, so that alignments are given the correct genomic position if I just use part of a chromosome as the reference sequence? Or do I need to align my data to an entire chromosome/genome so that the positions are correct?

Any suggestion would be very much appreciated!

Cheers,
Oliver
04-28-2011, 04:16 AM   #2
ttnguyen
Member
 
Location: Ireland

Join Date: Mar 2010
Posts: 41

I think a simple way is to align your data to the entire genome and then select only the alignments that fall within your specific regions.
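As a rough illustration of the "select only the alignments in your regions" step, here is a minimal Python sketch that works on plain-text SAM lines. The function names are hypothetical, and it is a simplification: it tests only the leftmost mapped position of each read, so a read that starts just before the region and extends into it is dropped.

```python
def starts_in_region(sam_line, chrom, start, end):
    """True if the alignment's leftmost position lies in chrom:start-end.

    Coordinates are 1-based and inclusive, as in SAM.
    """
    fields = sam_line.rstrip("\n").split("\t")
    rname, pos = fields[2], int(fields[3])
    return rname == chrom and start <= pos <= end


def filter_sam(lines, chrom, start, end):
    """Yield header lines unchanged, plus alignments starting in the region."""
    for line in lines:
        if line.startswith("@") or starts_in_region(line, chrom, start, end):
            yield line
```

For a sorted and indexed BAM file, passing a region to samtools view (for example `samtools view -b aln.sorted.bam chr7:15000000-18000000`, where the file name is a placeholder) should do the same job without any custom code.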

Another way may be to build a new index for your fasta file and map your reads to it; after the mapping step, you need to convert the 'local' coordinates to 'global' coordinates.
04-28-2011, 04:25 AM   #3
gaffa
Member
 
Location: Gothenburg/Uppsala, Sweden

Join Date: Oct 2010
Posts: 82

I doubt that you could make BWA assign coordinates differently. But you could possibly change the coordinates yourself in the resulting SAM file, i.e. add 15000000-1 to every coordinate or whatever.
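The coordinate shift described above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the function name and parameters are hypothetical, the first base of the region fasta is assumed to be genomic position `region_start` (so the offset is `region_start - 1`), and `chrom_length` (the full chromosome length for the @SQ header) has to be supplied by the user. Positions stored in optional tags such as XA are not adjusted.

```python
def shift_sam_line(line, local_name, chrom, region_start, chrom_length):
    """Convert one SAM line from region-local to chromosome coordinates."""
    offset = region_start - 1  # local position 1 maps to region_start
    fields = line.rstrip("\n").split("\t")

    if line.startswith("@"):
        # Rewrite the @SQ record of the sub-region reference; pass other
        # header lines through unchanged.
        if fields[0] == "@SQ" and ("SN:" + local_name) in fields:
            fields = [
                "SN:" + chrom if f.startswith("SN:") else
                "LN:" + str(chrom_length) if f.startswith("LN:") else f
                for f in fields
            ]
        return "\t".join(fields) + "\n"

    read_on_region = fields[2] == local_name
    mate_on_region = fields[6] == local_name or (
        fields[6] == "=" and read_on_region
    )
    if read_on_region:
        fields[2] = chrom
        fields[3] = str(int(fields[3]) + offset)   # POS
    if mate_on_region:
        if fields[6] != "=":
            fields[6] = chrom
        if int(fields[7]) > 0:
            fields[7] = str(int(fields[7]) + offset)  # PNEXT
    return "\t".join(fields) + "\n"
```

Unmapped reads (RNAME `*`, POS 0) pass through untouched. Whether the offset should be `region_start - 1` or `region_start` depends on how the region was extracted, so it is worth checking a few reads against the full reference before trusting the result.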

However, you might want to consider mapping to the whole genome instead, for conceptual reasons. Target enrichment is not 100% efficient, so you are at very high risk of mis-mapping reads if you use only a specific region as your reference.

There has been some discussion of this on Biostar:
http://biostar.stackexchange.com/que...xome/7617#7617
http://biostar.stackexchange.com/questions/4413
04-28-2011, 04:54 AM   #4
krobison
Senior Member
 
Location: Boston area

Join Date: Nov 2007
Posts: 747

Aligning to a subset of the genome is always a bad idea. Remember that the aligner tries to find the best match between your sequence and the given genome. With a subset, you run a risk of getting an incorrect alignment between your sequence and the restricted reference which would not occur if the aligner had the entire genome (and would therefore find the correct match). The speed improvement you will get will be modest & the downstream informatics more painful.

If you are really, really stubborn about this, then generate masked versions of your chromosomes where everything except your target regions (and a few hundred basepairs of neighboring sequence) is replaced by N. That way your coordinates are preserved.
04-28-2011, 05:20 AM   #5
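The masking idea can be sketched in a few lines of Python. The function name and the default flank of 300 bp are illustrative assumptions ("a few hundred basepairs" is all the post specifies), and the sketch operates on a single chromosome sequence already loaded as a string; reading and writing the fasta file is left out.

```python
def mask_outside_targets(seq, targets, flank=300):
    """Return seq with everything outside the target regions replaced by N.

    targets is a list of (start, end) pairs, 1-based and inclusive.
    Each target is extended by `flank` bases on both sides. The output has
    the same length as the input, so genomic coordinates are preserved.
    """
    masked = ["N"] * len(seq)
    for start, end in targets:
        lo = max(0, start - 1 - flank)   # convert to 0-based, clamp at 0
        hi = min(len(seq), end + flank)  # exclusive end, clamp at length
        masked[lo:hi] = seq[lo:hi]
    return "".join(masked)
```

For example, `mask_outside_targets("ACGTACGTAC", [(4, 6)], flank=1)` keeps positions 3 to 7 and masks the rest.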
oliver_ngs
Junior Member
 
Location: Cambridge, UK

Join Date: Apr 2011
Posts: 4

Thank you all very much for the advice. I'll try aligning to the entire genome. Cheers, Oliver.
All times are GMT -8.