SEQanswers

Old 01-30-2009, 10:17 AM   #1
inesdesantiago
Member
 
Location: LONDON, UNITED KINGDOM

Join Date: Jan 2009
Posts: 44
Reference genome for MAQ - split reference genome by chromosome or not?

Hello!
I am a beginner!
I am trying MAQ..
My question is about the reference genome! I downloaded the mm9 genome from UCSC, and it comes as separate chr*.fa files (one FASTA file per chromosome).

However, the MAQ command to do the alignments points to a single file as the reference genome:

maq match output.map genome.bfa myreads.bfq

I converted all my chr*.fa files to bfa files using maq "fasta2bfa".
Now, I don't know if I am supposed to run MAQ for each chromosome individually or if I should have the complete genome in one single bfa file.

Either option is a challenge.
If the alignment is to be run chromosome by chromosome, then there should be a way to merge the output files (I think...!?).
On the other hand, if the genome is supposed to be in just one file, how do I do that?

Any thoughts?
Thanks!!
ines
Old 01-30-2009, 12:39 PM   #2
bioinfosm
Senior Member
 
Location: USA

Join Date: Jan 2008
Posts: 482

Here is a similar discussion. The short answer: don't split by chromosome.
http://seqanswers.com/forums/showthr...=1020#post1020
Old 01-31-2009, 02:31 AM   #3
inesdesantiago
Member
 
Location: LONDON, UNITED KINGDOM

Join Date: Jan 2009
Posts: 44

Good to know!

I can't find the complete genome as a single file in the UCSC database.
Should I build it myself, i.e. merge my chr*.fa files into one big file that looks like this:

>chr1
AACTGTGCACTGTGACAC...
GTACGCACGTGCGTGCAC...
>chr2
ACATTGCCAACACTGTCA...
ACACGTGCGTGCACACGT...
>chr xyz

I don't know if this is the right format...
Old 01-31-2009, 10:14 AM   #4
inesdesantiago
Member
 
Location: LONDON, UNITED KINGDOM

Join Date: Jan 2009
Posts: 44

just a quick reply to myself:
Yes, that's the right format!
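
For anyone following along, the merge described above can be sketched in shell roughly as follows. The helper name merge_fasta and the file names mm9.fa and mm9.bfa are illustrative assumptions, not something from this thread or from MAQ itself. The MAQ steps are left as comments; they reuse the "fasta2bfa" and "match" commands quoted earlier and assume the usual input-then-output argument order for fasta2bfa.

```shell
# merge_fasta OUT IN1 [IN2 ...]
# Concatenate per-chromosome FASTA files into one multi-FASTA file.
# (hypothetical helper name; not part of MAQ)
merge_fasta() {
    out=$1
    shift
    : > "$out"                      # start from an empty output file
    for f in "$@"; do
        cat "$f" >> "$out"
        # If the input did not end with a newline, add one so that the
        # next ">chr" header always starts on its own line.
        if [ -n "$(tail -c 1 "$f")" ]; then
            printf '\n' >> "$out"
        fi
    done
}

# Example use (file names are assumptions):
#   merge_fasta mm9.fa chr*.fa
#   maq fasta2bfa mm9.fa mm9.bfa
#   maq match output.map mm9.bfa myreads.bfq
```

A plain "cat chr*.fa > mm9.fa" should do the same job when every input file already ends with a newline; the loop only guards against a header being glued onto the previous sequence line. Make sure the output name does not itself match the chr*.fa pattern.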
Old 02-18-2009, 08:44 AM   #5
bioinfosm
Senior Member
 
Location: USA

Join Date: Jan 2008
Posts: 482

Yes. Sometimes ELAND has issues with long FASTA headers: it creates extra columns in the export or eland_extended output. So I also prefer to keep the FASTA reference sequence headers short.
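
One way to keep headers short, sketched here under the assumption that "short" means keeping only the first whitespace-delimited word of each header line (for example ">chr1 some description" becomes ">chr1"). The helper name shorten_headers is hypothetical, and this has not been checked against ELAND's actual parsing rules.

```shell
# shorten_headers IN OUT
# Keep only the first whitespace-delimited word of each FASTA header line;
# sequence lines are copied through unchanged.
# (hypothetical helper name; the truncation rule is an assumption)
shorten_headers() {
    awk '/^>/ { print $1; next } { print }' "$1" > "$2"
}

# Example use (file names are assumptions):
#   shorten_headers mm9.fa mm9.short.fa
```

Note that a header written with a space right after the ">" (such as "> chr1") would be reduced to a bare ">", so it is worth checking the result with grep '^>' before aligning.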