  • Short read alignments between species

    Hi,

    I have some Illumina paired-end genomic reads from a plant species without a genome sequence, so I wanted to align them to a related genome. I tried bowtie (--seedmms 3 --maqerr 250) but I am getting very few alignments (<5% of pairs, and ~10% for each end separately). I tried the -v option to allow more mismatches, but the limit seems to be 3 (the same as the seed mismatch limit, according to the manual). I guess my genetic distance is too great...

    Do people have a preferred aligner when aligning to a reference from another species, or would I be better off assembling the reads de novo and aligning them afterwards?

    Thanks, SD

  • #2
    If you want to continue using bowtie you should increase the allowed error to something much higher.

    I have also used mosaik for this kind of thing, as it lets you allow many more mismatches (MM) or specify an allowed percentage. One issue you will face is that, to get enough reads mapping, you will likely have to raise the allowed MM so far that the mapping becomes ambiguous and the whole thing is of questionable value.
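To put rough numbers on that trade-off, here is a minimal Python sketch that is not taken from any aligner. It assumes substitutions are independent and evenly spread along the read (no indels, no sequencing errors) and that the reference behaves like uniform random sequence. The 75 bp read length, the divergence values, and the 500 Mb genome size are placeholders chosen for illustration; none of them comes from the thread.

```python
from math import comb


def binom_cdf(n: int, p: float, k: int) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def fraction_alignable(read_len: int, divergence: float, max_mismatches: int) -> float:
    """Fraction of reads expected to differ from the reference at no more
    than max_mismatches positions, given a per-base divergence."""
    return binom_cdf(read_len, divergence, max_mismatches)


def expected_random_hits(genome_size: int, read_len: int, max_mismatches: int) -> float:
    """Expected number of chance placements on both strands of a uniform
    random genome: each base mismatches with probability 3/4."""
    return 2 * genome_size * binom_cdf(read_len, 0.75, max_mismatches)


if __name__ == "__main__":
    read_len = 75             # assumed read length (not given in the thread)
    genome_size = 500_000_000  # assumed reference size (not given in the thread)
    for divergence in (0.02, 0.05, 0.10):
        for max_mm in (3, 8, 15, 30):
            kept = fraction_alignable(read_len, divergence, max_mm)
            noise = expected_random_hits(genome_size, read_len, max_mm)
            print(f"divergence={divergence:.2f} max_mm={max_mm:2d} "
                  f"alignable={kept:.3f} random_hits={noise:.3g}")
```

Real genomes contain repeats and paralogs, so the ambiguity in practice is likely worse than the random-sequence estimate suggests; treat the second number as a lower bound at best.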

    I think I would go with your second option: a de novo assembly, then aligning the assembled contigs. However, that's a whole other world of pain, and your success will depend heavily on how much Illumina data you have, on what combination of library insert sizes you have, and very much on the polymorphism rate of your species. Here are some very brief comments on some of the available assemblers:

    MIRA: Probably not worth trying unless your genome is very small because it has such high memory requirements

    SOAPdenovo: Many people report OK results, but you will likely get a very large number of very short contigs. The documentation is terrible and the mailing list is far from the best because the developers don't seem to read it.

    ABySS: Great mailing list, and it gives about the best results. The developers are really helpful, and users on the mailing list will help with anything from newbie to advanced issues.

    Velvet: OK for small genomes but has really high RAM requirements otherwise (but not as bad as MIRA).

    Celera (CABOG): I can't say because I haven't tried it yet, but they recently added full support for Illumina data.

    clc: Commercial, so maybe not an option. It is also a rather mysterious black box, but it is incredibly fast and has amazingly low RAM requirements (being a black box, who knows how they manage this).



    • #3
      LASTZ has a specific module to perform exomapping, called FEAST.

      See http://www.ncbi.nlm.nih.gov/pubmed/20733242

      Alternatively, use BWA (which accepts more SNPs than bowtie and can handle indels) with relaxed settings.
      Francois Sabot, PhD

      Be realistic. Demand the Impossible.
      www.wikiposon.org



      • #4
        Since you already tried mapping and got few alignments, I don't think assembly will produce a better result. However, as francois.sabot said, you can try relaxing the mapping criteria (more mismatches, larger gaps), and imo you are better off using a hash-based mapping tool (maq, rmap, etc.) for that purpose.
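For readers unfamiliar with the term, the general idea behind hash-based mapping can be sketched as seed-and-verify: index every k-mer of the reference in a hash table, look up exact seeds taken from the read, then count mismatches over the full read at each candidate position. This is a toy illustration only and does not reproduce how maq or rmap work internally; the function names, the seed length, and the example sequences are all hypothetical.

```python
from collections import defaultdict


def build_index(reference: str, k: int) -> dict:
    """Hash table mapping every k-mer in the reference to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def map_read(read: str, reference: str, index: dict, k: int, max_mismatches: int) -> list:
    """Return sorted (position, mismatches) pairs for ungapped placements of
    the read with at most max_mismatches substitutions.

    Seeds are non-overlapping k-mers of the read. A placement is guaranteed
    to be found only if the read contains more than max_mismatches seeds,
    since at least one seed must then be mismatch-free."""
    hits = {}
    for offset in range(0, len(read) - k + 1, k):
        seed = read[offset:offset + k]
        for seed_pos in index.get(seed, ()):
            start = seed_pos - offset
            if start < 0 or start + len(read) > len(reference) or start in hits:
                continue
            window = reference[start:start + len(read)]
            mismatches = sum(a != b for a, b in zip(read, window))
            if mismatches <= max_mismatches:
                hits[start] = mismatches
    return sorted(hits.items())


if __name__ == "__main__":
    reference = "GATTACAGGCTTAACCGGTTACGATCGA"  # made-up sequence
    read = "CTGGCTTAACCG"  # reference[5:17] with one substitution
    index = build_index(reference, k=4)
    print(map_read(read, reference, index, k=4, max_mismatches=2))
```

Shorter seeds tolerate more mismatches but produce more candidate positions to verify, which is the same sensitivity-versus-ambiguity trade-off discussed earlier in the thread.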
