  • Possible to force contig builds from a selected region?

    Hello SEQers,

    We have a series of 10 bacterial genomes sequenced with Illumina (300-base PE reads, before read processing). We want to find the SNPs/InDels responsible for a phenotype in 7 of them.

    Using different mappers (BWA-mem, bowtie2, cushaw2) with reads processed to different extents (through trim, trimmomatic, BBduk) I can get maps that provide variant calls (mpileup or FreeBayes) for several established/known mutations (relative to the reference), but some mappers give maps that show a change in certain genes and others don't (I can detail the differences in the maps if needed). One simple approach to interrogate the promising variant calls is to compare the same loci to those found in de novo assemblies from the same reads (an unbiased sequence).
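For anyone wanting to line up the disagreements between mappers before going to de novo contigs, here is a minimal sketch in Python. It assumes the calls from each mapper/caller run have already been parsed into `(contig, position, ref, alt)` tuples; the function names and the tuple layout are hypothetical, not part of any of the tools named above.

```python
from collections import defaultdict


def tally_support(callsets):
    """Map each variant to the sorted list of tools that called it.

    callsets: dict of tool name -> set of (contig, pos, ref, alt) tuples.
    """
    support = defaultdict(list)
    for tool in sorted(callsets):
        for variant in callsets[tool]:
            support[variant].append(tool)
    return dict(support)


def discordant(callsets):
    """Return only the variants that at least one tool did not call."""
    n_tools = len(callsets)
    return {
        variant: tools
        for variant, tools in tally_support(callsets).items()
        if len(tools) < n_tools
    }
```

The discordant sites are the ones worth checking against an independent contig, since agreement among mappers that all used the same reference is not independent evidence.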

    Unfortunately, the genome assemblers we have used repeatedly spit out contigs from other regions, not the 2 or 3 specific locations we are interested in.

    I don't need to build a complete genome and general gap-filling strategies do not guarantee a contig build covering the regions of interest.

    Is it possible to "force" a contig builder (Velvet, BBmap, A5, etc.) to build from seeded region(s) adjacent to a locus of interest? Say, maybe start 500 bases to the left or right, then let the contig grow across the mutant locus.
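To make the seeding idea concrete, here is a rough Python sketch of cutting a seed from the reference that sits next to the locus but does not overlap it. The function name, the 0-based coordinates, and the default sizes are assumptions for illustration only; it also treats the reference as linear and ignores wrap-around on a circular chromosome.

```python
def seed_window(reference, locus, offset=500, seed_len=300, side="left"):
    """Return a seed sequence flanking `locus` (0-based) without covering it.

    side="left":  seed starts `offset` bases upstream of the locus.
    side="right": seed ends `offset` bases downstream of the locus.
    """
    if seed_len > offset:
        raise ValueError("seed_len must not exceed offset, or the seed reaches the locus")
    if side == "left":
        start = locus - offset
    elif side == "right":
        start = locus + offset - seed_len
    else:
        raise ValueError("side must be 'left' or 'right'")
    end = start + seed_len
    if start < 0 or end > len(reference):
        raise ValueError("seed window falls outside the reference")
    return reference[start:end]


def to_fasta(name, sequence):
    """Format a single FASTA record as a string."""
    return f">{name}\n{sequence}\n"
```

Because the seed stops short of the locus, whatever sequence an assembler extends across the locus comes from the reads rather than from the reference.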

    All we need is some independent way (not reference-influenced) of looking at the consensus contig that covers that particular locus to strengthen our variant lists before we start Sanger sequencing potential hits.

    Thanks for any help.

  • #2
    I believe that the PRICE assembler (http://derisilab.ucsf.edu/software/price/) is constructed for this use case. You might want to try it. I haven't had time to test it yet, but plan to apply it to a similar scenario where we want to "flesh out" an existing contig.

    • #3
      Thanks kopi-o.
      The description seems right, thanks! It won't compile here at work (and I'm not in the mood to deal with yet another install project), so I'll check it out when I get home.

      S

      • #4
        the PRICE was right

        I wanted to update the thread to say that PRICE did the trick.

        At first, I had trouble getting contigs to extend into certain regions (which were ambiguous in the maps as well). After consulting with the author (who was more than helpful), I ended up combining my reads into a single file and using the input option "-spfp [file.fastq]", which basically cuts each read in half and uses the two halves as a read pair during assembly. This strategy avoids stalling caused by read pairs that overlap substantially with their partners (these were 300-base reads from a ~400-base fragment library before processing, so there was likely to be a lot of overlap).
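To illustrate what "cutting each read in half" means, here is a small Python sketch that splits one FASTQ record into two pseudo-mates. This is only my illustration of the idea, not PRICE's actual code; the `/1` and `/2` suffixes and the optional reverse-complementing of the second half are assumptions, and I don't know how PRICE orients the halves internally.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def split_read(name, seq, qual, revcomp_second=False):
    """Split one read into two half-length pseudo-mates.

    Returns two (name, sequence, quality) tuples. If revcomp_second is True,
    the second half is reverse-complemented (and its qualities reversed) so
    the halves face each other like a conventional paired-end read.
    """
    mid = len(seq) // 2
    first = (f"{name}/1", seq[:mid], qual[:mid])
    second_seq, second_qual = seq[mid:], qual[mid:]
    if revcomp_second:
        second_seq = second_seq.translate(_COMPLEMENT)[::-1]
        second_qual = second_qual[::-1]
    second = (f"{name}/2", second_seq, second_qual)
    return first, second


def to_fastq(record):
    """Format a (name, sequence, quality) tuple as a four-line FASTQ record."""
    name, seq, qual = record
    return f"@{name}\n{seq}\n+\n{qual}\n"
```

The two halves of one read never overlap each other, which is the point: the heavy mate overlap from a short-fragment library no longer gets in the way.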

        Once optimized (adjusting kmer, cut size, percent match, etc.), I could feed in a ~300 nt sequence from a region and let it grow over a region of interest. It resolved several ambiguous sites and also identified an IS insertion in one of the genes of interest (which was missed by BWA-mem, but "noisy" in Bowtie2).

        I also easily assembled a plasmid that was in one of the strains using seed "contigs" from known regions. We had never fully sequenced it and this was bonus data.

        Thanks again for the pointer.
