  • GRCh37: how to use latest patches

    I would like to know how to assemble the human reference genome GRCh37 from the individual chromosome files and the latest patches.

    This is Ensembl's FTP site, which lists more than 300 FASTA files.
    ftp://ftp.ensembl.org/pub/release-67...o_sapiens/dna/

    For the primary assembly, one might simply concatenate chromosomes 1, 2, ..., X, and Y. However, the X and Y chromosomes share a pseudoautosomal region (PAR), as the README points out. I could just leave out the Y chromosome since I work on K562 cells, but otherwise what am I supposed to do? There is a big file called toplevel.fa which appears to have the PAR sequences masked, but does it contain all chromosomes and patches? The README does not say anything about its content.
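
For illustration, the naive concatenation described above could be sketched in Python as follows. This is only a sketch under assumptions: the function name `concat_chromosomes`, the `template` filename pattern, and the chromosome order are hypothetical and are not taken from the Ensembl README. It does nothing about the PAR; it only lets a chromosome such as Y be skipped.

```python
import gzip
from pathlib import Path

# Assumed order: autosomes 1-22 followed by X and Y.
CHROM_ORDER = [str(i) for i in range(1, 23)] + ["X", "Y"]


def concat_chromosomes(src_dir, out_path, template, chroms=CHROM_ORDER, skip=()):
    """Concatenate per-chromosome FASTA files in the given order.

    `template` is a hypothetical filename pattern containing a {chrom}
    placeholder, e.g. "chr{chrom}.fa". Files ending in .gz are decompressed.
    Returns the list of chromosome names that were written.
    """
    written = []
    with open(out_path, "wb") as out:
        for chrom in chroms:
            if chrom in skip:
                continue
            path = Path(src_dir) / template.format(chrom=chrom)
            opener = gzip.open if path.suffix == ".gz" else open
            last = b"\n"
            with opener(path, "rb") as fh:
                # Copy in 1 MiB chunks so whole chromosomes are never held in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    out.write(chunk)
                    last = chunk[-1:]
            # Make sure the next header starts on its own line.
            if last != b"\n":
                out.write(b"\n")
            written.append(chrom)
    return written
```

A call such as `concat_chromosomes("dna", "primary.fa", "chr{chrom}.fa", skip=("Y",))` would write chromosomes 1 to 22 and X, leaving out Y; the directory and file names here are placeholders.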

    There seem to be two kinds of patches: fixes and novel additions. How are these patches correctly incorporated into the primary assembly? Is there a utility to handle this? Or are patches treated as separate entities (e.g. as PATCH_xxx records rather than being integrated into the chromosome proper)?
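
One way to see how sequences are named in a given FASTA file (for example, whether patches show up as separate records) is to list its headers. The snippet below is a minimal sketch; the helper name `fasta_headers` is hypothetical, and it makes no claim about what any particular Ensembl file actually contains.

```python
import gzip


def fasta_headers(path):
    """Yield the sequence name (first token after '>') of each FASTA record."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as fh:
        for line in fh:
            if line.startswith(">"):
                parts = line[1:].split()
                # An empty header line yields an empty name.
                yield parts[0] if parts else ""
```

Printing `list(fasta_headers("toplevel.fa"))` would then show every record name in that file, which can be compared against the list of chromosomes and patches one expects to find.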

    Thank you for your very kind help.
    Last edited by yujiro; 07-13-2012, 03:20 AM.
