  • Similarity programs for SNP files

    Hi,

    I'm a newbie to this field so please forgive the use of very basic language.

    I've got two files. Each contains position and nucleotide information on each line.

    Each file holds just the SNP information for one person. For example, the file for person A would look like this:
    Chromosome 6: 133,088,927, G
    Chromosome 6: 133,088,928, A
    and so on.

    The second file too has nucleotide information for the exact same locations.

    Is there a utility somewhere that will show me the similarity between the two files? Something along the lines of BLAST, which of course requires the full sequence information and not just SNPs.

    Prompt help will be much appreciated.

    Thanks.

    PS. the location information might not be exactly as detailed, I've oversimplified it for the sake of clarity.
    Last edited by leofixings; 05-03-2011, 07:38 AM. Reason: Missed out some information.

  • #2
    comm

    Type "man comm" at the command line and see whether the "comm" command does what you need. Simply piping its output to "wc" might be a crude but effective measure.

    COMM(1) User Commands COMM(1)

    NAME
    comm - compare two sorted files line by line

    SYNOPSIS
    comm [OPTION]... FILE1 FILE2

    DESCRIPTION
    Compare sorted files FILE1 and FILE2 line by line.

    With no options, produce three-column output. Column one contains lines unique to FILE1, column two contains
    lines unique to FILE2, and column three contains lines common to both files.

    -1 suppress lines unique to FILE1

    -2 suppress lines unique to FILE2

    -3 suppress lines that appear in both files
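
A minimal sketch of the "comm piped to wc" idea, under some assumptions: both files write each position and nucleotide in exactly the same text format, one SNP per line, so that an identical line means an identical call. The function name `snp_similarity` and the person B data are made up for illustration; the person A lines are the ones from the question. Because `comm` expects sorted input, each file is sorted first.

```shell
# Print "<shared> <total>": the number of lines the two files have in common,
# and the number of distinct lines in the first file.
# Hypothetical helper, not an existing tool.
snp_similarity() {
    tmp_a=$(mktemp)
    tmp_b=$(mktemp)
    # comm needs sorted input; LC_ALL=C keeps sort and comm on the same ordering.
    LC_ALL=C sort -u "$1" > "$tmp_a"
    LC_ALL=C sort -u "$2" > "$tmp_b"
    # -12 suppresses columns one and two, leaving only lines common to both files.
    shared=$(LC_ALL=C comm -12 "$tmp_a" "$tmp_b" | wc -l | tr -d ' ')
    total=$(wc -l < "$tmp_a" | tr -d ' ')
    rm -f "$tmp_a" "$tmp_b"
    echo "$shared $total"
}

# Demo: person A's lines come from the question; person B's are invented.
demo_dir=$(mktemp -d)
printf '%s\n' 'Chromosome 6: 133,088,927, G' 'Chromosome 6: 133,088,928, A' > "$demo_dir/person_a.txt"
printf '%s\n' 'Chromosome 6: 133,088,927, G' 'Chromosome 6: 133,088,928, C' > "$demo_dir/person_b.txt"
snp_similarity "$demo_dir/person_a.txt" "$demo_dir/person_b.txt"   # prints: 1 2
rm -rf "$demo_dir"
```

Dividing the first number by the second gives a rough fraction of positions where the two people carry the same nucleotide. This only compares lines as text, so any difference in spacing or position formatting between the files would count as a mismatch.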
