  • Problems in assembling mitochondrial genome

    Hi all,

    I have assembled a genome (from Illumina data) for a non-model species using ALLPATHS, and I'm searching for the mitochondrial genome. I have a list of 13 proteins that should be encoded on the mitochondrial genome, but I have been unable to locate them in any sensible manner in this assembly (I searched with the proteins using exonerate, spaln, BLAST and BLAT).

    For example, blasting these proteins against the assembled genome finds only one protein, and I'm getting similar results with the other approaches. I want to know why they are not showing up. I expect the mitochondrial genome to be assembled in a single contig (about 20kb in length), but it's puzzling that not even fragments are showing up.

    Has anyone ever run into this problem with their assembly? Or does anyone have any idea what is going on here?

    Any help would be appreciated.

    Thanks and regards,
    Ram

  • #2
    Did you do any pre-filtering? The mitochondrial reads might have been lost (e.g. due to much higher coverage, or very different %GC).
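To make the %GC point concrete, here is a minimal Python sketch of how a GC-window filter could silently discard reads whose composition differs from the nuclear genome. The function names and the 0.3 to 0.7 window are hypothetical, chosen for illustration only; they are not taken from any tool mentioned in this thread.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def split_by_gc(reads, low, high):
    """Split reads into (kept, dropped) using an inclusive GC window.

    The window is a hypothetical pre-filter setting, not a recommendation.
    """
    kept, dropped = [], []
    for read in reads:
        if low <= gc_fraction(read) <= high:
            kept.append(read)
        else:
            dropped.append(read)
    return kept, dropped


if __name__ == "__main__":
    reads = ["ATATATAT", "GCGCATAT", "GGGGCCCC"]
    kept, dropped = split_by_gc(reads, 0.3, 0.7)
    print("kept:", kept)
    print("dropped:", dropped)
```

If most reads that look mitochondrial end up in `dropped`, the pre-filter is a plausible culprit.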

    • #3
      Hi,

      I recently had a project like this for some plants.

      Observations
      1) Mitochondria are NOT easy to assemble. Expect 100+ contigs, perhaps ~400, with Illumina data. I had PacBio data which did not assemble to one contig, more like 30-60.

      2) mitochondria are highly variable in size, but as far as I know none are 20kb. See for example
      This resource organizes information on genomes including sequences, maps, chromosomes, assemblies, and annotations.


      3) As maubp suggested, perhaps take these 13 genes as a nucleotide FASTA and map the raw reads against them prior to assembly. Are they covered?

      Maybe the data quality isn't good enough.

      cheers,
      Colin
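One way to answer "are they covered?" in point (3) is to summarize per-position depth for each gene after mapping. The sketch below assumes a three-column, tab-separated depth table (reference name, 1-based position, depth), which is the layout `samtools depth` writes; the gene names `cox1` and `nad1` and the function name are hypothetical placeholders.

```python
from collections import defaultdict


def summarize_coverage(depth_lines, gene_lengths, min_depth=1):
    """Return {gene: (breadth, mean_depth)} from tab-separated depth records.

    breadth    = fraction of the gene's positions with depth >= min_depth
    mean_depth = total depth divided by gene length
    Positions missing from the input are treated as depth 0.
    """
    covered = defaultdict(int)
    total_depth = defaultdict(int)
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue
        name, _pos, depth = line.split("\t")
        depth = int(depth)
        total_depth[name] += depth
        if depth >= min_depth:
            covered[name] += 1
    return {
        name: (covered[name] / length, total_depth[name] / length)
        for name, length in gene_lengths.items()
    }


if __name__ == "__main__":
    lines = ["cox1\t1\t10", "cox1\t2\t0", "cox1\t3\t20", "cox1\t4\t30"]
    lengths = {"cox1": 4, "nad1": 5}  # hypothetical gene lengths
    for gene, (breadth, mean) in summarize_coverage(lines, lengths).items():
        print(f"{gene}: breadth={breadth:.2f} mean_depth={mean:.1f}")
```

Genes with breadth near zero in the raw reads point to a data problem; genes that are well covered in the reads but absent from the assembly point to the assembler or a pre-filter.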

      • #4
        Re (3), I didn't explicitly suggest that, but it's a good idea worth trying. You could also try mapping against mitochondrial sequences from the closest published relatives.

        • #5
          Originally posted by maubp:
          You could also try mapping against mitochondrial sequences from the closest published relatives.
          I second this; you might be able to grab most of the mito reads that way. Alternately, I suggest you try normalizing the data prior to assembly, to drop the mito coverage down to a level similar to the rest of the genome - that makes it much easier to assemble, and typically yields a superior assembly for things with extremely high coverage.
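As a rough illustration of what normalization does, here is a toy Python version of the digital-normalization idea: a read is kept only while the median count of its k-mers (among reads already kept) is below a target. This is a simplified sketch and not the algorithm of any specific normalization tool; `k`, `target`, and the function names are hypothetical, and reverse complements and sequencing errors are ignored.

```python
from collections import Counter
from statistics import median


def kmers(seq, k):
    """All overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def normalize_reads(reads, k=5, target=2):
    """Keep a read only if its median k-mer count so far is below target."""
    counts = Counter()
    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k
        if median(counts[km] for km in read_kmers) < target:
            kept.append(read)
            counts.update(read_kmers)
    return kept


if __name__ == "__main__":
    # Ten copies of a "high-coverage" read plus one "low-coverage" read.
    reads = ["ACGTTGCAAG"] * 10 + ["TTTTTGGGGG"]
    print(normalize_reads(reads))
```

The high-coverage read is capped at roughly `target` copies while the rare read survives, which is the effect described above: the mitochondrial depth is pulled down toward the rest of the genome.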

          • #6
            I also agree, first try to get the mitochondrial reads.

            We had exactly the same problem when assembling our non-model organism (1Gb, mitochondrion ~16kb), and didn't get any mitochondrial contigs at all. It turned out that the tissue we used was so full of mitochondria that the read coverage was just too high for the assembler to handle.

            Instead, we mapped all reads to the closest mitochondrion we could find, and extracted a consensus from our mapped reads. Then we mapped our reads again against the consensus, corrected it, mapped again and continued in an iterative manner until the reads and the consensus matched perfectly. I think the software MITObim (https://github.com/chrishah/MITObim) works in a similar way (it wasn't released when we did this so I haven't tried it myself).

            Good luck!
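The map, call consensus, re-map loop described above can be sketched in a few lines of Python. This is a toy under strong assumptions: reads are placed by ungapped best-Hamming-distance matching, the seed has the same length as the target, and there are no indels. It is not how MITObim or any real mapper works internally, and all names and sequences here are hypothetical.

```python
from collections import Counter


def best_offset(read, ref, max_mismatches):
    """Leftmost ungapped placement with the fewest mismatches, or None."""
    best = None
    for off in range(len(ref) - len(read) + 1):
        mm = sum(a != b for a, b in zip(read, ref[off:off + len(read)]))
        if mm <= max_mismatches and (best is None or mm < best[1]):
            best = (off, mm)
    return None if best is None else best[0]


def consensus_round(reads, ref, max_mismatches=2):
    """Map reads to ref and return the majority-vote consensus.

    Positions with no mapped read keep the reference base.
    """
    piles = [Counter() for _ in ref]
    for read in reads:
        off = best_offset(read, ref, max_mismatches)
        if off is None:
            continue  # read does not map within the mismatch budget
        for i, base in enumerate(read):
            piles[off + i][base] += 1
    return "".join(
        pile.most_common(1)[0][0] if pile else ref_base
        for pile, ref_base in zip(piles, ref)
    )


def iterate_consensus(reads, seed, max_rounds=10, max_mismatches=2):
    """Repeat map-and-consensus until the consensus stops changing."""
    ref = seed
    for _ in range(max_rounds):
        new_ref = consensus_round(reads, ref, max_mismatches)
        if new_ref == ref:
            break
        ref = new_ref
    return ref


if __name__ == "__main__":
    target = "ACGTTGCAAGCTTAGGCATC"   # the unknown sequence we sampled
    seed = "ACGATGCAAGCTCAGGCATC"     # a "close relative" with 2 differences
    reads = [target[i:i + 8] for i in range(len(target) - 7)]
    print(iterate_consensus(reads, seed) == target)
```

In practice each round would use a real read mapper and variant-aware consensus caller; the point of the sketch is only the stopping rule, which is to iterate until the consensus no longer changes.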

            • #7
              Interesting. Good suggestions everybody.

              Just stumbled across this program as well, which looks decent:
