#1 |
Junior Member
Location: Houston Join Date: Apr 2011
Posts: 4
Dear all,
I have downloaded hg19 from UCSC and want to combine all the chromosomes to generate a reference genome to be indexed with BWA. However, there are some files like chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa. Should I combine these files with the autosomes and the X and Y chromosomes, or discard them before building the reference genome?
Thanks a billion!
June
#2 |
Senior Member
Location: bethesda Join Date: Feb 2009
Posts: 700
Use chr1-22,X,Y,M
... or ...
Use them all except the hap ones!
Either way is fine. Many people stick with 1-22,X,Y,M to keep things simple. The "hap" ones are alternate assemblies for certain regions. DO NOT USE THE *hap* files!
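The two options above can be sketched as a small file-name filter. This is only an illustration: it assumes the per-chromosome files follow the usual UCSC hg19 naming (chr1.fa, chrUn_gl000211.fa, chr1_gl000191_random.fa, chr6_apd_hap1.fa), and the function name `select_reference_files` and its `include_unplaced` switch are made up for this example.

```python
import re

# Assumption: UCSC-style hg19 file names, one sequence per file.
# chr1-22, X, Y, M
PRIMARY = re.compile(r"^chr(?:[1-9]|1[0-9]|2[0-2]|X|Y|M)\.fa$")
# chrUn_gl* and chrN_gl*_random
UNPLACED = re.compile(
    r"^chr(?:Un_gl\d+|(?:[1-9]|1[0-9]|2[0-2]|X|Y)_gl\d+_random)\.fa$"
)


def select_reference_files(names, include_unplaced=False):
    """Return the file names to keep, in input order.

    Files with "hap" in the name are always skipped. With
    include_unplaced=True the chrUn_gl* and *_random files are kept too.
    """
    kept = []
    for name in names:
        if "hap" in name:
            continue  # alternate assemblies: never part of the reference
        if PRIMARY.match(name):
            kept.append(name)
        elif include_unplaced and UNPLACED.match(name):
            kept.append(name)
    return kept
```

Calling it with the default arguments gives the simple 1-22,X,Y,M set; passing `include_unplaced=True` gives "them all except the hap ones".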
#3 |
Senior Member
Location: Boston Join Date: Feb 2008
Posts: 693
Use this:
ftp://ftp.ncbi.nih.gov/1000genomes/f...k_v37.fasta.gz
If you prefer to build your own, include the _random files.
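For anyone building their own, here is a rough sketch of the concatenation step. It assumes one uncompressed FASTA file per sequence with UCSC-style names; `chrom_sort_key` and `concatenate_fasta` are hypothetical helper names, and the ordering (1-22, X, Y, M, then chrUn) is just one reasonable convention, not a requirement of BWA.

```python
import os
import re

# Rank used for ordering: 1-22, then X, Y, M, then chrUn; anything else last.
ORDER = {str(i): i for i in range(1, 23)}
ORDER.update({"X": 23, "Y": 24, "M": 25, "Un": 26})


def chrom_sort_key(path):
    """Sort key that puts chr2 before chr10 and chr1.fa before chr1_*_random.fa."""
    name = os.path.basename(path)
    match = re.match(r"chr(\d+|X|Y|M|Un)(?=[._])", name)
    rank = ORDER.get(match.group(1), 99) if match else 99
    return (rank, name)


def concatenate_fasta(paths, out_path):
    """Write all input FASTA files into out_path in chromosome order."""
    with open(out_path, "w") as out:
        for path in sorted(paths, key=chrom_sort_key):
            with open(path) as src:
                for line in src:
                    # Guard against a file that lacks a final newline, which
                    # would glue the next header onto the last sequence line.
                    out.write(line if line.endswith("\n") else line + "\n")
```

The combined file is then what you would hand to `bwa index`.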
#4 |
Junior Member
Location: Tokyo Join Date: May 2011
Posts: 1
Hi! I strongly recommend including chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa, to reduce misalignment of reads that come from these scaffolds.
#5 |
Junior Member
Location: Houston Join Date: Apr 2011
Posts: 4
Thanks a billion to all of you for your kindness and help. By the way, what exactly do the files chrUn_g*.fa and chr1_gl*_random.fa contain?
#6 |
Member
Location: NY, US Join Date: Jul 2008
Posts: 17
For more information, you can check this: http://www.ncbi.nlm.nih.gov/projects...initions.shtml
From what I have checked, some chrUn contigs also contain variants of rRNAs and similar sequences, but I don't think that is a serious problem.
Last edited by ishmael; 05-31-2011 at 09:09 PM. Reason: typo revising
#7 |
Junior Member
Location: Houston Join Date: Apr 2011
Posts: 4
So grateful for your reply. Since I only need count data and not variants, removing those random contigs might be better to avoid confusion.
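If the reference is already one combined FASTA, dropping those contigs could look roughly like this. It is a sketch under assumptions: record names are taken to be the first word after ">", the default markers ("_random", "chrUn_") match UCSC hg19 naming, and `drop_unplaced_records` is a hypothetical name.

```python
def drop_unplaced_records(lines, markers=("_random", "chrUn_")):
    """Yield FASTA lines, skipping every record whose name contains a marker.

    `lines` is any iterable of text lines, for example an open file.
    """
    keep = True
    for line in lines:
        if line.startswith(">"):
            fields = line[1:].split()
            name = fields[0] if fields else ""
            # Decide once per header; the sequence lines that follow inherit it.
            keep = not any(marker in name for marker in markers)
        if keep:
            yield line
```

Adding "_hap" to `markers` would also remove the alternate assemblies from a combined file.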
#8 |
Member
Location: usa Join Date: Jan 2012
Posts: 21
What is a hap region? Thanks.
Tags: chrm, haploid, hg19, random