
  • Find a unique sequence in a range of chromosome coordinates

    Hi All,

    I have a range of chromosome coordinates, for example:
    Code:
    HERC6 4:89,280,000-89,350,000
    From this range, I need to pull out a subsequence of 200-400 nucleotides that does not appear anywhere else in the human genome, i.e. it must be a unique sequence. (In the best case the subsequence would be an exon of HERC6, but that is not required.)

    Do you have any idea how to automate this process? Which tools or services should I use?


    I know I can do it manually like this:
    HTML Code:
    Ensembl -> type "4:89,280,000-89,350,000" in the search box -> export data -> cut out a subsequence -> paste it into BLAST.
    But this manual approach is not satisfactory.
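
The first two manual steps above (reading the coordinate range and cutting out subsequences) can be scripted. The following is only a minimal sketch: the function names `parse_region` and `candidate_windows`, the 1-based inclusive coordinate convention, and the 100 nt step are assumptions made for illustration, not part of any tool mentioned in this thread. Fetching the sequence itself and checking each window for uniqueness are left out.

```python
import re
from typing import Iterator, Tuple


def parse_region(region: str) -> Tuple[str, int, int]:
    """Parse 'chrom:start-end' (thousands separators allowed) into a tuple."""
    match = re.fullmatch(r"\s*(\w+):([\d,]+)-([\d,]+)\s*", region)
    if match is None:
        raise ValueError(f"unrecognised region: {region!r}")
    chrom = match.group(1)
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def candidate_windows(
    start: int, end: int, length: int = 200, step: int = 100
) -> Iterator[Tuple[int, int]]:
    """Yield (win_start, win_end) windows, assumed 1-based and inclusive."""
    pos = start
    while pos + length - 1 <= end:
        yield pos, pos + length - 1
        pos += step


chrom, start, end = parse_region("4:89,280,000-89,350,000")
windows = list(candidate_windows(start, end, length=200, step=100))
print(chrom, start, end)
print(windows[0], windows[-1])
```

Each yielded window is a candidate that would still have to be tested against the rest of the genome, for instance with an aligner or a precomputed uniqueness track.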

  • #2
    Just use mapability tracks at CGWB or UCSC ...

    No need to brute force this when many folks have done the work already.

    Just use any of the mapability tracks available at UCSC or CGWB:

    See CBIIT Mapability track at CGWB:


    Note that the top track is based on 100 bp. Set your window to 200 bp and look for regions that are all white in the Mapability track.
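
Scanning a track by eye for "all white" stretches of at least 200 bp can be approximated in code once per-base scores for the region have been exported. The sketch below assumes a plain list of scores in which 1.0 means "unique"; that encoding, the function name `unique_runs`, and the way scores are obtained are all assumptions, and how a score relates to the read starting at that position depends on the specific track.

```python
from typing import List, Sequence, Tuple


def unique_runs(
    scores: Sequence[float],
    region_start: int,
    min_len: int = 200,
    unique: float = 1.0,
) -> List[Tuple[int, int]]:
    """Return (start, end) genome coordinates of runs where every score is
    at least `unique` and the run is at least `min_len` positions long."""
    runs: List[Tuple[int, int]] = []
    run_begin = None  # index where the current all-unique run began
    for i, score in enumerate(scores):
        if score >= unique:
            if run_begin is None:
                run_begin = i
        else:
            if run_begin is not None and i - run_begin >= min_len:
                runs.append((region_start + run_begin, region_start + i - 1))
            run_begin = None
    # Close a run that extends to the end of the region.
    if run_begin is not None and len(scores) - run_begin >= min_len:
        runs.append((region_start + run_begin, region_start + len(scores) - 1))
    return runs


# Toy data only: five unique positions, one repeat, three unique positions.
toy_scores = [1.0] * 5 + [0.5] + [1.0] * 3
print(unique_runs(toy_scores, region_start=100, min_len=3))
```

Any run returned this way is long enough to hold a candidate subsequence, but it is worth confirming the final pick with an independent search.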

    There are similar mapability tracks at UCSC:



    • #3
      Originally posted by thedamian
      Hi All,

      From this range, I need to pull out a subsequence of 200-400 nucleotides that does not appear anywhere else in the human genome, i.e. it must be a unique sequence. (In the best case the subsequence would be an exon of HERC6, but that is not required.)

      Do you have any idea how to automate this process? Which tools or services should I use?
      Hello,

      I think vmatch can do what you want; it's a very flexible tool. In particular, look at examples 9.9.1 and 9.9.5 of the manual.

      All the best
      Dario
