  • Repeat masking

    Hi everyone,

    We have whole genome data from a number of tumour-normal pairs, and are looking at our score distributions for somatic changes. We find that some of the highest scores look like errors due to read pile-up in repetitive regions when viewed in IGV. We are therefore starting to think about masking repetitive parts of the genome using RepeatMasker or something similar.

    Does anyone have any tips on how to go about this? From a brief look, it seems I could mask up to 50% of the genome using the least stringent criteria. Does anyone have experience that could get me started? Which programs and repeat libraries are best to use?

    Thank you for your help,
    Amy.

  • #2
    For homology-based repeat masking I would use RepeatMasker (www.repeatmasker.org/) together with Repbase (www.girinst.org/repbase/). It's easy to use and usually gives very good results.

    If you want to uncover de novo repeats there are many programs you could choose from, e.g., LTRfinder, PILER, RepeatScout.
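
To get a feel for how much of a sequence a given masking run would remove (the "up to 50%" concern above), here is a minimal sketch. It assumes soft-masked output, where repeats are written in lowercase and everything else in uppercase (RepeatMasker can produce this with its -xsmall option). The function names `soft_masked_intervals` and `masked_fraction` are made up for illustration and are not part of any tool.

```python
from typing import List, Tuple


def soft_masked_intervals(seq: str) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) runs of lowercase bases.

    Assumption: lowercase means "masked as repeat", as in soft-masked FASTA.
    """
    intervals: List[Tuple[int, int]] = []
    start = None
    for i, base in enumerate(seq):
        if base.islower():
            if start is None:
                start = i  # a masked run begins here
        elif start is not None:
            intervals.append((start, i))  # the run ended at the previous base
            start = None
    if start is not None:
        intervals.append((start, len(seq)))  # run extends to the sequence end
    return intervals


def masked_fraction(seq: str) -> float:
    """Fraction of the sequence covered by lowercase (masked) bases."""
    if not seq:
        return 0.0
    masked = sum(end - start for start, end in soft_masked_intervals(seq))
    return masked / len(seq)


if __name__ == "__main__":
    example = "ACGTacgtACgt"  # toy sequence, not real data
    print(soft_masked_intervals(example))
    print(masked_fraction(example))
```

The intervals use BED-style coordinates, so they could be written out as a BED file and compared between masking settings before deciding how stringent to be.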



    • #3
      When you do your alignment, you could keep only uniquely mapping reads (reads that match only one place in the genome). This can be done in bowtie by adding -m 1.

      Also, have a look at The Uniqueome: A mappability resource for short-tag sequencing, http://bioinformatics.oxfordjournals...s.btq640.short and here for a tutorial: http://grimmond.imb.uq.edu.au/unique...ary_File_2.pdf
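
A related option, offered only as a sketch and not something suggested in the thread itself, is to leave the alignment alone and drop candidate somatic calls that fall inside repeat or low-mappability intervals afterwards. The example below assumes the intervals are available as 0-based, half-open (start, end) pairs per chromosome (BED-style) and that each call is a (chromosome, position, score) tuple. All names here (`build_index`, `in_region`, `filter_calls`) are hypothetical.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple

Interval = Tuple[int, int]          # 0-based, half-open (start, end)
Call = Tuple[str, int, float]       # (chromosome, 0-based position, score)
Index = Dict[str, Tuple[List[int], List[int]]]


def build_index(regions: Dict[str, List[Interval]]) -> Index:
    """Sort and merge overlapping intervals per chromosome.

    Returns, for each chromosome, parallel lists of merged starts and ends.
    """
    index: Index = {}
    for chrom, intervals in regions.items():
        merged: List[List[int]] = []
        for start, end in sorted(intervals):
            if merged and start <= merged[-1][1]:
                # Overlaps or touches the previous interval: extend it.
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        index[chrom] = ([s for s, _ in merged], [e for _, e in merged])
    return index


def in_region(index: Index, chrom: str, pos: int) -> bool:
    """True if pos lies inside any merged interval on chrom."""
    if chrom not in index:
        return False
    starts, ends = index[chrom]
    i = bisect_right(starts, pos) - 1  # last interval starting at or before pos
    return i >= 0 and pos < ends[i]


def filter_calls(calls: List[Call], index: Index) -> List[Call]:
    """Keep only calls that fall outside the excluded regions."""
    return [c for c in calls if not in_region(index, c[0], c[1])]


if __name__ == "__main__":
    # Toy intervals and calls, not real data.
    regions = {"chr1": [(100, 200), (150, 300)], "chr2": [(0, 50)]}
    calls = [("chr1", 99, 50.0), ("chr1", 100, 90.0), ("chr1", 300, 10.0)]
    print(filter_calls(calls, build_index(regions)))
```

Filtering after calling keeps the original alignments intact, so the excluded calls can still be inspected in IGV if the interval set turns out to be too aggressive.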
