
  • Assessing a 454 run?

    Hi all,

    We recently got data from 1/8th of a 454 run. The read lengths show
    the typical distribution that I have seen at various meetings (see
    attached image). However, how should I go about assessing the 'overall
    quality' of the run (if such a clear-cut concept exists)?

    So far I have plotted the distribution of quality per base and the
    distribution of mean quality per read. Of course the qualities will
    never be 'perfect', but without any experience or any other reference,
    I don't know what kind of distributions I should be looking for. For
    example, we see about 20% of all bases with a quality score below 20.
    Is that a) as good as we are likely to get, b) not bad, or c) woah! ask
    for 20% of your money back ;-)
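
    For anyone wanting to reproduce these summaries, here is a minimal sketch, assuming the quality scores have already been parsed into one list of integer Phred scores per read (the function names and toy data are made up for illustration). It relies on the standard Phred definition, where a score Q corresponds to an error probability of 10^(-Q/10), so Q20 means a nominal 1% chance that the base call is wrong.

```python
from statistics import mean


def phred_to_error_prob(q):
    """Convert a Phred quality score into the nominal base-call error probability."""
    return 10 ** (-q / 10)


def summarize_qualities(reads_quals, threshold=20):
    """Summarize quality scores.

    reads_quals is a hypothetical input: one list of integer Phred scores per read.
    Returns (fraction of bases below threshold, mean quality per read,
    mean nominal error probability over all bases).
    """
    all_quals = [q for read in reads_quals for q in read]
    frac_below = sum(q < threshold for q in all_quals) / len(all_quals)
    mean_per_read = [mean(read) for read in reads_quals]
    expected_error = mean(phred_to_error_prob(q) for q in all_quals)
    return frac_below, mean_per_read, expected_error


# Toy data for illustration only, not real 454 output.
reads = [[30, 30, 10, 40], [20, 20, 20, 20]]
frac_below, mean_per_read, expected_error = summarize_qualities(reads)
print(frac_below, mean_per_read, expected_error)
```

    The last number is only the error rate the scores themselves claim; whether the scores are well calibrated is a separate question that needs a reference to check.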

    It would be great to get any feedback from the experienced people on the forum.


    Note that we do not have a reference genome to align the reads to, but
    we do have reasonable coverage of the chloroplast DNA, and a
    reference for that (an estimated 2-4% of reads are chloroplast
    contamination, giving approximately 10x coverage). What is a good tool
    to identify SNPs between our read data and that reference? (If I can
    first identify the SNPs, I can then estimate the per-base error rate
    using the reference.)
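
    The idea in the last sentence can be sketched roughly as follows, assuming the alignments have already been reduced to (reference position, reference base, read base) triples and the SNP positions are known. Both the input layout and the function name are hypothetical, not the output format of any particular aligner. Gaps are written as '-' and counted as errors here, which matters for 454 data because homopolymer indels are its dominant error type.

```python
def per_base_error_rate(aligned_pairs, snp_positions):
    """Estimate the per-base error rate from reads aligned to a reference.

    aligned_pairs: iterable of (ref_pos, ref_base, read_base) tuples, with '-'
                   marking a gap (hypothetical input layout).
    snp_positions: set of reference positions called as true SNPs; these are
                   skipped so real variation is not counted as sequencing error.
    """
    mismatches = 0
    compared = 0
    for pos, ref_base, read_base in aligned_pairs:
        if pos in snp_positions:
            continue
        compared += 1
        if read_base != ref_base:
            mismatches += 1
    return mismatches / compared if compared else float("nan")


# Toy alignment for illustration only.
pairs = [
    (1, "A", "A"),
    (2, "C", "T"),  # known SNP, excluded from the estimate
    (3, "G", "G"),
    (4, "T", "-"),  # deletion in the read, counted as an error
    (5, "A", "A"),
    (6, "C", "G"),  # substitution, counted as an error
]
rate = per_base_error_rate(pairs, snp_positions={2})
print(rate)
```

    At around 10x coverage some real SNPs will be missed and some errors miscalled as SNPs, so an estimate like this is only as good as the SNP calls it depends on.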

    (Actually I found I can do this with MAQ, but I'll leave the question in, in case there are alternative suggestions.)


    Thanks very much for any information,
    Dan.

    Homepage: Dan Bolser
    MetaBase the database of biological databases.

  • #2
    You might try the tools from Gabor Marth's lab (http://bioinformatics.bc.edu/marthlab/Main_Page): use Mosaik to align the reads to the reference, and GigaBayes (an evolution of their polyBayes tool) to call SNPs from that alignment. In my recollection, it gives you better control over whether you're looking for SNPs between homozygous or heterozygous individuals, many individuals, etc., and its algorithms have sound statistical underpinnings.

    ~Joe
