  • Should I use the chrM, random, and haplotype FASTA files to build hg19?

    Dear all,
    I have downloaded hg19 from UCSC and want to combine all the chromosomes into a single reference genome to be indexed with BWA. However, there are some files like chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa. Should I combine these files with the autosomes and the X and Y chromosomes, or discard them before building the reference genome?

    Thanks a billion!

    June

  • #2
    Use chr1-22,X,Y,M
    ...or ...
    Use them all except the hap ones!!!
    Either way is fine.

    Many people stick with 1-22,X,Y,M to keep things simple.


    The "hap" ones are alternate assemblies for certain regions.
    DO NOT USE THE *hap* files !!!!
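
The two options above can be sketched in Python. This is only an illustration: the helper names `select_fasta` and `concatenate` are hypothetical, and it assumes the per-chromosome files follow the UCSC naming seen in this thread (chr1.fa, chrUn_gl*.fa, chr1_gl*_random.fa, *hap*.fa).

```python
import shutil
from pathlib import Path

# chr1..chr22, then X, Y, M: the "keep it simple" set from the post above.
PRIMARY = ["chr" + str(n) for n in range(1, 23)] + ["chrX", "chrY", "chrM"]


def select_fasta(names, include_unplaced=False):
    """Pick the .fa files to concatenate; *hap* files are never selected.

    Assumption: each file is named after the single sequence it holds.
    """
    by_stem = {Path(n).stem: n for n in names if n.endswith(".fa")}
    # Primary chromosomes first, in a fixed order.
    chosen = [by_stem[c] for c in PRIMARY if c in by_stem]
    if include_unplaced:
        # Option 2: add chrUn_* and *_random, but still skip anything with "hap".
        extras = sorted(
            name
            for stem, name in by_stem.items()
            if stem not in PRIMARY and "hap" not in stem
        )
        chosen.extend(extras)
    return chosen


def concatenate(paths, out_path):
    """Write the selected FASTA files one after another into out_path."""
    with open(out_path, "wb") as out:
        for p in paths:
            with open(p, "rb") as src:
                shutil.copyfileobj(src, out)


files = [
    "chr2.fa", "chr1.fa", "chrM.fa",
    "chr6_cox_hap2.fa", "chrUn_gl000211.fa", "chr1_gl000191_random.fa",
]
print(select_fasta(files))
print(select_fasta(files, include_unplaced=True))
```

The combined file written by `concatenate` would then be the one to index with BWA.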


    • #3
      Use this:

      ftp://ftp.ncbi.nih.gov/1000genomes/f...k_v37.fasta.gz

      If you prefer to build your own, include the _random files.


      • #4
        Hi! I highly recommend using chrUn_g*.fa, chr1_gl*_random.fa, and chrM.fa, to reduce misalignment of reads that come from these scaffolds.


        • #5
          Thanks a billion for the kindness and help from all of you. By the way, what exactly do the files chrUn_g*.fa and chr1_gl*_random.fa mean?


          • #6
            Originally posted by mozart
            Thanks a billion for the kindness and help from all of you. By the way, what exactly do the files chrUn_g*.fa and chr1_gl*_random.fa mean?
            As far as I know, these are contigs whose exact position in the genome is uncertain. Many factors affect genome assembly, so the consortium did not integrate some contigs into the whole genome and instead labeled them chrUn_* (chromosome of origin unknown) or chr1_*_random (known to come from chr1). For hg19, patches are released when the consortium integrates a contig into the genome (in hg19 the coordinates are reserved for contigs, so within the hg19 version the integration doesn't affect existing sequence coordinates).
            For more information, see:
            http://www.ncbi.nlm.nih.gov/projects...initions.shtml

            From what I have checked, some chrUn contigs also contain variants of rRNAs and similar sequences, but I don't think that is a serious issue.
            Last edited by ishmael; 05-31-2011, 09:09 PM. Reason: typo revising
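
The naming scheme described in this post can be turned into a small classifier. This is a rough sketch, not an official parser: the function name `classify` is hypothetical, the labels "unlocalized" and "unplaced" are the usual assembly terms rather than words used in this thread, and the patterns assume hg19-style UCSC sequence names.

```python
import re


def classify(name):
    """Sort an hg19-style sequence name into a broad category."""
    if "_hap" in name:
        # Alternate assemblies of certain regions (the *hap* files).
        return "alternate haplotype"
    if name.startswith("chrUn_"):
        # Chromosome of origin unknown.
        return "unplaced"
    if name.endswith("_random"):
        # Chromosome known, exact position unknown.
        return "unlocalized"
    if re.fullmatch(r"chr(\d+|X|Y|M)", name):
        return "primary"
    return "other"


for seq in ["chr1", "chrM", "chr1_gl000191_random", "chrUn_gl000211", "chr6_cox_hap2"]:
    print(seq, "->", classify(seq))
```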


            • #7
              I'm so grateful for your reply. Because I only need the count data, not the variants, deleting those random contigs might be better to avoid confusion.


              • #8
                more questions

                What is a hap region? Thanks.
