  • merging and de-duplicating structural variant calls (bedpe)

    (modified from the initial BioStar post)

    I am looking for code or tools to merge and de-duplicate long lists of structural variant calls in BEDPE or a related format.

    My calls come from merging numerous genome datasets, and I want to identify redundant structural events. These often overlap at both ends, but sometimes at only one end while still leading to a similar gene-disruption effect.

    The author of Hydra (Aaron Q.) pointed me to a Python script that ships with his tool (dedupDiscordants.py). Thanks to Aaron's help I finally got it to work on my data, but the result is not de-duplicated enough for my needs.

    I would like independent control over each overlapping end, and over the operation applied to the overlapping calls that are found (for instance, merging to the shortest gap or to the shortest common flanking pieces).

    Anyone with code to find paired intersections between double-coordinate calls (left arm / right arm of the junction) is welcome to comment and, hopefully, help.
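
    To make the request concrete, here is a minimal Python sketch of the kind of paired intersection described above. It is not taken from Hydra or dedupDiscordants.py. All names (`Call`, `parse_bedpe`, `arms_overlap`, `match_arms`, `merge_arm`, `merge_calls`, `dedup`) and parameters (`slop_left`, `slop_right`, `require_both`, `mode`) are hypothetical. It assumes standard BEDPE column order with 0-based, half-open coordinates, assumes the left arm is always stored in columns 1-3, and ignores strands. It also assumes that "shortest gap" means taking the union of each arm and that "shortest common flank" means taking the intersection of each arm.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Call:
    """One BEDPE record reduced to its two arms (0-based, half-open)."""
    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int
    name: str = "."


def parse_bedpe(line):
    """Parse the first seven tab-separated BEDPE columns; the rest are ignored."""
    f = line.rstrip("\n").split("\t")
    name = f[6] if len(f) > 6 else "."
    return Call(f[0], int(f[1]), int(f[2]), f[3], int(f[4]), int(f[5]), name)


def arms_overlap(chrom_a, start_a, end_a, chrom_b, start_b, end_b, slop=0):
    """True if the arms share a chromosome and overlap, or are separated by
    a gap smaller than `slop` bp."""
    return chrom_a == chrom_b and start_a < end_b + slop and start_b < end_a + slop


def match_arms(a, b, slop_left=0, slop_right=0):
    """Return (left_matches, right_matches), each end with its own tolerance."""
    left = arms_overlap(a.chrom1, a.start1, a.end1,
                        b.chrom1, b.start1, b.end1, slop_left)
    right = arms_overlap(a.chrom2, a.start2, a.end2,
                         b.chrom2, b.start2, b.end2, slop_right)
    return left, right


def merge_arm(start_a, end_a, start_b, end_b, mode):
    """'intersection' keeps the shared piece; 'union' keeps the widest span.
    Falls back to the union when the arms only match through slop."""
    if mode == "intersection" and max(start_a, start_b) < min(end_a, end_b):
        return max(start_a, start_b), min(end_a, end_b)
    return min(start_a, start_b), max(end_a, end_b)


def merge_calls(a, b, left, right, mode="union"):
    """Merge b into a. An arm that did not match keeps a's coordinates."""
    s1, e1 = (merge_arm(a.start1, a.end1, b.start1, b.end1, mode)
              if left else (a.start1, a.end1))
    s2, e2 = (merge_arm(a.start2, a.end2, b.start2, b.end2, mode)
              if right else (a.start2, a.end2))
    return replace(a, start1=s1, end1=e1, start2=s2, end2=e2,
                   name=f"{a.name},{b.name}")


def dedup(calls, slop_left=0, slop_right=0, require_both=True, mode="union"):
    """Greedy single pass: fold each call into the first merged call it matches.
    Quadratic in the worst case and dependent on sort order."""
    merged = []
    ordered = sorted(calls, key=lambda c: (c.chrom1, c.start1, c.chrom2, c.start2))
    for call in ordered:
        for i, rep in enumerate(merged):
            left, right = match_arms(rep, call, slop_left, slop_right)
            if (left and right) if require_both else (left or right):
                merged[i] = merge_calls(rep, call, left, right, mode)
                break
        else:
            merged.append(call)
    return merged
```

    With `require_both=True`, two calls collapse only when both arms match; with `require_both=False`, a match at a single end is enough, and the non-matching arm keeps the coordinates of the call seen first. Because the pass is greedy, results can depend on input order, so a proper clustering step (or an interval index for large lists) would likely be needed on real data.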

    Thanks in advance

    Stephane
    http://www.bits.vib.be/index.php
