  • Targeted Genome Assembly for region poorly represented in reference genome?

    Hello,

    I am not sure if this is the best place to put this, but any help is appreciated.

    I am working on identifying candidate genes for a mutation in a subtelomeric region of zebrafish. The Zv9 assembly is not great, and it is particularly bad in this region. The region contains an improperly placed clone surrounded by WGS contigs; those contigs contain genes that are known to be deleted in one of the mutant alleles but are not causative. The reference in this region is so poor that my best bet is to attempt to rebuild it. I am already in contact with the people at Sanger about this, but I also have quite a bit of differential RNA-seq data, as well as WGS data from a different mutant allele (one that causes the same phenotype but was generated via ENU mutagenesis, and so is likely a point mutation).

    Does anyone have any ideas for the best way to assemble this region in a targeted fashion? I have access to a fairly powerful computer with 32 GB of RAM, but that is obviously not enough to assemble the large amount of WGS data I have with Velvet. Even if it were, I suspect a Velvet assembly would give me contigs of 5 kb at best, which would not be much more useful than what is already available. I have considered aligning reads to the region as it currently stands with Bowtie and then assembling only those reads, but the reference is so poor that I don't think this will be effective either: multiple genes we know to be deleted are not represented in Zv9 at all, are only partially represented, or are split far apart with opposite strandedness.
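For illustration, the align-then-subset idea above could be sketched roughly as follows. This is a minimal sketch, assuming the aligner's output is available as SAM text lines; the function name `pairs_touching_region` and the example coordinates are hypothetical. It keeps a pair whenever either mate lands in the region, so the partner read comes along even if it did not align there itself.

```python
def pairs_touching_region(sam_lines, chrom, start, end):
    """Return names of read pairs with at least one mate aligned inside
    chrom:start-end (1-based, inclusive).

    Assumption: the read's end is approximated as POS + len(SEQ) - 1,
    which ignores indels and clipping recorded in the CIGAR string.
    """
    names = set()
    for line in sam_lines:
        if line.startswith("@"):          # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        qname, flag, rname = fields[0], int(fields[1]), fields[2]
        pos, seq = int(fields[3]), fields[9]
        if flag & 0x4:                    # this read itself is unmapped
            continue
        if rname != chrom:
            continue
        read_end = pos + len(seq) - 1     # approximate alignment end
        if pos <= end and read_end >= start:
            names.add(qname)              # both mates share this name
    return names
```

The returned names can then be used to pull both mates of each pair out of the original FASTQ files before handing them to an assembler.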

    Thanks in advance for any assistance or advice.

  • #2
    Yuck! You certainly have an ugly situation.

    If I were in your shoes, I think I would try as many different approaches as possible to identify paired-end reads where at least one read of the pair can be mapped to the region (using the existing assembly, the RNA-Seq data that you think comes from the region, etc.), then try assembling those reads with Velvet or another assembler. Then use that assembly to identify additional paired-end reads that map to it, and repeat. Keep cycling until things stop getting better.
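The cycle described above can be sketched in a few lines. This is only a toy illustration under stated assumptions: reads are recruited by exact shared k-mers rather than by a real aligner, and `assemble` is a hypothetical callable standing in for a wrapper around Velvet or another assembler (all function names here are made up for the example).

```python
def revcomp(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def kmers(seq, k):
    """All k-mers of seq as a set (empty if seq is shorter than k)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def recruit_pairs(pairs, targets, k):
    """Names of pairs where either mate shares a k-mer with any target,
    on either strand. Stand-in for mapping reads to the current assembly."""
    index = set()
    for t in targets:
        index |= kmers(t, k) | kmers(revcomp(t), k)
    return {name for name, (m1, m2) in pairs.items()
            if kmers(m1, k) & index or kmers(m2, k) & index}

def iterative_recruitment(pairs, seeds, assemble, k=8, max_rounds=10):
    """Recruit pairs against the seeds, assemble them, recruit again against
    the new contigs, and stop once a round adds no new pairs."""
    targets = list(seeds)
    recruited = set()
    for _ in range(max_rounds):
        hits = recruit_pairs(pairs, targets, k)
        if hits <= recruited:             # nothing new: stop cycling
            break
        recruited |= hits
        reads = [m for name in sorted(recruited) for m in pairs[name]]
        # Seeds are kept so the original anchors are never lost.
        targets = list(seeds) + assemble(reads)
    return recruited
```

Here `pairs` maps a pair name to its two mate sequences, and `seeds` would be whatever sequence is trusted for the region (existing contigs, RNA-Seq transcripts). Passing `assemble=lambda reads: reads` simply reuses the recruited reads as the next round's targets, which is enough to see the loop extend outward from the seeds.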

    Is there a publicly available BAC or cosmid library for zebrafish? I doubt you'll get anything like what you want without some long-range sequence information, and pulling out a big clone for the region could be a bunch of work but is one of the more obvious ways to go about it. Alternatively, have you tried making mate pair libraries for the whole genome?
