SEQanswers

Old 05-27-2011, 12:07 PM   #1
mozart
Junior Member
 
Location: Houston

Join Date: Apr 2011
Posts: 4
Should I use the chrM, random, and haplotype (hap) FASTA files to build hg19?

Dear all,
I have downloaded hg19 from UCSC and want to combine all the chromosomes to generate a reference genome to be indexed with BWA. However, there are some files like chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa. Should I combine these files with the autosomes and the X and Y chromosomes, or discard them before building the reference genome?

Thanks a billion!

June
Old 05-27-2011, 12:26 PM   #2
Richard Finney
Senior Member
 
Location: bethesda

Join Date: Feb 2009
Posts: 700

Use chr1-22,X,Y,M
...or ...
Use them all except the hap ones!!!
Either way is fine.

Many people stick with 1-22,X,Y,M to keep things simple.


The "hap" ones are alternate assemblies for certain regions.
DO NOT USE THE *hap* files !!!!
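A minimal sketch of both options, in case it helps. It assumes one FASTA file per sequence, named as in the UCSC hg19 per-chromosome download (chr1.fa, chrM.fa, chr6_apd_hap1.fa, ...), and that each file ends with a newline. The function names `select_files` and `build_reference` are made up for this example; they are not part of BWA or any UCSC tool.

```python
import re
import shutil
from pathlib import Path

# Matches chr1..chr22, chrX, chrY, chrM and nothing else.
PRIMARY = re.compile(r"^chr([1-9]|1[0-9]|2[0-2]|X|Y|M)\.fa$")


def sort_key(name):
    """Order primary chromosomes 1-22, X, Y, M; other files follow by name."""
    m = PRIMARY.match(name)
    if m is None:
        return (1, 0, name)
    label = m.group(1)
    order = {"X": 23, "Y": 24, "M": 25}
    rank = order[label] if label in order else int(label)
    return (0, rank, name)


def select_files(names, primary_only=True):
    """Pick the FASTA files to keep. Files with _hap in the name are always dropped."""
    keep = []
    for name in names:
        if "_hap" in name:
            continue  # alternate assemblies
        if primary_only and not PRIMARY.match(name):
            continue  # chrUn_* and *_random are skipped in this mode
        keep.append(name)
    return sorted(keep, key=sort_key)


def build_reference(fasta_dir, out_path, primary_only=True):
    """Concatenate the selected files into one reference FASTA."""
    out_path = Path(out_path)
    names = [p.name for p in Path(fasta_dir).glob("*.fa")
             if p.resolve() != out_path.resolve()]
    chosen = select_files(names, primary_only)
    with open(out_path, "wb") as out:
        for name in chosen:
            with open(Path(fasta_dir) / name, "rb") as src:
                shutil.copyfileobj(src, out)
    return chosen
```

With `primary_only=True` you get chr1-22,X,Y,M; with `primary_only=False` you get everything except the hap files. The combined file can then be indexed as usual.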
Old 05-27-2011, 04:35 PM   #3
lh3
Senior Member
 
Location: Boston

Join Date: Feb 2008
Posts: 693

Use this:

ftp://ftp.ncbi.nih.gov/1000genomes/f...k_v37.fasta.gz

If you prefer to build your own, include the _random sequences.
Old 05-30-2011, 09:33 PM   #4
Kaonashi
Junior Member
 
Location: Tokyo

Join Date: May 2011
Posts: 1

Hi! I highly recommend including chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa, to reduce misalignment of reads that come from these scaffolds.
Old 05-31-2011, 06:27 AM   #5
mozart
Junior Member
 
Location: Houston

Join Date: Apr 2011
Posts: 4

Thanks a billion to all of you for your kindness and help. By the way, what exactly do the files chrUn_g*.fa and chr1_gl*_random.fa contain?
Old 05-31-2011, 09:08 PM   #6
ishmael
Member
 
Location: NY, US

Join Date: Jul 2008
Posts: 17

Quote:
Originally Posted by mozart View Post
Thanks a billion to all of you for your kindness and help. By the way, what exactly do the files chrUn_g*.fa and chr1_gl*_random.fa contain?

As far as I know, these are contigs whose exact position in the genome is uncertain. Many factors affect genome assembly, so the consortium did not integrate some contigs into the chromosome sequences and instead labeled them chrUn_* (chromosome of origin unknown) or chr1_*_random (known to come from chr1, position not determined). For hg19, patches are released when the consortium integrates a contig into the genome; within hg19 the existing coordinates are preserved, so the integration does not affect the coordinates of the sequence already placed.
For more information, see:
http://www.ncbi.nlm.nih.gov/projects...initions.shtml

From what I checked, some chrUn contigs also contain variants of rRNAs and similar sequences, but I don't think that is a serious problem.
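To make the naming concrete, here is a rough sketch that sorts hg19 sequence names into these groups by name alone. It assumes UCSC-style names (chrUn_gl000211, chr1_gl000191_random, chr6_apd_hap1); the function `classify` is made up for this example. The labels "unlocalized" and "unplaced" are the assembly terms for the _random and chrUn cases.

```python
def classify(seq_name):
    """Guess the kind of an hg19 sequence from its UCSC-style name."""
    if "_hap" in seq_name:
        return "alternate haplotype"
    if seq_name.endswith("_random"):
        return "unlocalized"  # chromosome known, position on it not known
    if seq_name.startswith("chrUn_"):
        return "unplaced"  # chromosome of origin not known
    return "chromosome"


def names_from_fasta(path):
    """Yield sequence names from the '>' header lines of a FASTA file."""
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                if fields:  # skip a bare '>' line with no name
                    yield fields[0]
```

Running `classify` over `names_from_fasta(...)` for a combined reference gives a quick check of what actually went into it.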

Last edited by ishmael; 05-31-2011 at 09:09 PM. Reason: typo revising
Old 06-02-2011, 04:47 PM   #7
mozart
Junior Member
 
Location: Houston

Join Date: Apr 2011
Posts: 4

I am so grateful for your reply. Because I only need count data and not variants, deleting those random contigs might be better, to avoid confusion.
Old 04-18-2013, 12:07 AM   #8
dejavu2010
Member
 
Location: usa

Join Date: Jan 2012
Posts: 21
more questions

What is a hap region? Thanks.

Tags
chrm, haploid, hg19, random