  • searching for short, nearly identical peptide matches between two genomes

    I want to find all of the short, nearly identical peptide matches between two genomes. These matches are ~10 aa in length, allowing for 1 or 2 mismatches but no gaps.

    When I run the BLAST search, I find many such matches. However, I also find longer matches that fall below my identity cutoff (e.g. 30 aa matches with only 70% identity).

    From what I know about BLAST, it seems possible that there are perfect 10 aa matches buried in these 30 aa, 70% identity matches. Is this correct? If so, can anyone recommend a way to solve this problem, or point me in the right direction?

    I will even code this up in Python if that is the best way. (I guess I would split my database into words of length 3, then look at every word in my query and count the mismatches between all such strings, since no gaps are allowed... doesn't sound very fun...)
    Last edited by green tree; 05-13-2016, 09:47 PM.
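
For what it's worth, the Python route described above can be sketched fairly compactly. This is only a rough sketch under some assumptions: both inputs are plain protein strings (already translated), positions are 0-based, and the names `hamming` and `near_matches` are made up for illustration. Instead of fixed words of length 3, it splits each 10-residue window into three blocks (3, 3 and 4 residues); with at most 2 mismatches, at least one block must match exactly, so only windows sharing a block need a full comparison.

```python
from collections import defaultdict


def hamming(a, b):
    """Count mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def near_matches(query, subject, length=10, max_mismatches=2):
    """Yield (query_pos, subject_pos, mismatches) for every ungapped pair of
    windows of `length` residues that differ at no more than
    `max_mismatches` positions. Positions are 0-based."""
    # Pigeonhole: with k mismatches spread over k + 1 blocks, at least one
    # block has to be an exact match.
    n_blocks = max_mismatches + 1
    bounds = [(i * length // n_blocks, (i + 1) * length // n_blocks)
              for i in range(n_blocks)]

    # Index every subject window by each of its blocks.
    index = defaultdict(list)
    for s in range(len(subject) - length + 1):
        window = subject[s:s + length]
        for b, (lo, hi) in enumerate(bounds):
            index[(b, window[lo:hi])].append(s)

    # For each query window, verify only the subject windows sharing a block.
    for q in range(len(query) - length + 1):
        window = query[q:q + length]
        candidates = set()
        for b, (lo, hi) in enumerate(bounds):
            candidates.update(index.get((b, window[lo:hi]), ()))
        for s in sorted(candidates):
            d = hamming(window, subject[s:s + length])
            if d <= max_mismatches:
                yield q, s, d
```

For example, `list(near_matches("MKTAYIAKQR", "AAMKTAYLAKQRAA"))` gives `[(0, 2, 1)]`: one window pair, starting at query position 0 and subject position 2, with a single mismatch. Whether this scales to whole translated genomes would need testing; the index holds three entries per subject window.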

  • #2
    One option is to get the full alignments (instead of tab-delimited format) and search for identical sub-alignments... doesn't sound very fun...
    Last edited by green tree; 05-13-2016, 09:54 PM.
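
The sub-alignment idea is also not too bad to script. A minimal sketch, assuming the aligned query and subject strings for each hit are already in hand (for instance from the `qseq` and `sseq` columns of a custom BLAST+ tabular output, rather than by parsing the pairwise text) and that gaps are written as `-`. The function name `ungapped_windows` is hypothetical. It slides a 10-column window along the alignment, skips any window containing a gap, and keeps windows with at most 2 mismatches.

```python
def ungapped_windows(aligned_query, aligned_subject, length=10, max_mismatches=2):
    """Yield (column, query_window, subject_window, mismatches) for every
    gap-free window of `length` alignment columns that stays within the
    mismatch budget. `column` is the 0-based alignment column, not a
    sequence coordinate."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned strings must have equal length")
    for col in range(len(aligned_query) - length + 1):
        q = aligned_query[col:col + length]
        s = aligned_subject[col:col + length]
        # No gaps are allowed inside a reported match.
        if "-" in q or "-" in s:
            continue
        d = sum(a != b for a, b in zip(q, s))
        if d <= max_mismatches:
            yield col, q, s, d
```

This only sees windows that lie on the alignment BLAST chose to report, so a near-identical 10-mer on a different diagonal would be missed; the direct window search is the safer check if completeness matters.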
