  • Improvements to assembled genome

    Hi all,

    I would like to ask how you would perform the following steps to improve the quality of an assembly:
    • homopolymer correction
    • editing of contigs based on genetic map data

    Please let me know if there is any other approach you would apply, and which steps you would take for it.

    Thanks in advance!

  • #2
    Originally posted by NPalopoli
    Hi all,

    I would like to ask how you would perform the following steps to improve the quality of an assembly:
    • homopolymer correction
    • editing of contigs based on genetic map data

    Please let me know if there is any other approach you would apply, and which steps you would take for it.

    Thanks in advance!
    Homopolymer correction can be done by applying a low-complexity filter to your data before analysis; two such approaches are implemented in prinseq. Another common approach is to combine Illumina and 454 data in a single assembly, which corrects for homopolymer errors, among other things.
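
    To make the idea of a low-complexity filter concrete, here is a minimal Python sketch. It is not prinseq's implementation: the function names, the 3-mer window, and the default run length are all assumptions chosen for illustration. It reports long single-base runs in a read and scores the read by the Shannon entropy of its overlapping 3-mers, where a low score suggests low complexity.

```python
import math
import re
from collections import Counter


def homopolymer_runs(seq, min_len=5):
    """Return (start, length, base) for every run of a single base
    that is at least min_len long. min_len is a hypothetical default."""
    seq = seq.upper()
    # One base followed by at least (min_len - 1) copies of itself.
    pattern = re.compile(r"([ACGTN])\1{%d,}" % (min_len - 1))
    return [(m.start(), len(m.group(0)), m.group(1))
            for m in pattern.finditer(seq)]


def trimer_entropy(seq):
    """Shannon entropy (in bits) of the overlapping 3-mers in seq.
    Values near zero indicate a low-complexity sequence."""
    seq = seq.upper()
    kmers = [seq[i:i + 3] for i in range(len(seq) - 2)]
    if not kmers:
        return 0.0
    counts = Counter(kmers)
    total = len(kmers)
    return sum((c / total) * math.log2(total / c) for c in counts.values())
```

    A read could then be flagged or dropped when `homopolymer_runs(read)` is non-empty or when `trimer_entropy(read)` falls below a cutoff you pick for your own data; any such cutoff is a judgment call, not a standard value.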

    As for the second point, it is not clear what you mean by edit. You can certainly order your contigs using marker data if your contigs are sufficiently large and you have an ample amount of unambiguously mapping markers (good luck without a physical map). You can also identify assembly artifacts using your map, but your power to detect anything would depend on your resources (how complete your genome is, how good your map is, etc.).
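
    As a rough illustration of ordering contigs with marker data, the Python sketch below assumes you already have, for each unambiguously mapping marker, the contig it hits, its linkage group, and its map position in cM. The input layout and the function name are hypothetical, not taken from any particular tool. Each contig is anchored to the linkage group holding most of its markers and placed at the median position of those markers; contigs whose markers fall on more than one linkage group are reported as possible assembly artifacts.

```python
from collections import defaultdict
from statistics import median


def order_contigs(markers):
    """markers: iterable of (contig, linkage_group, position_cm) tuples,
    one per uniquely mapped marker (hypothetical input layout).
    Returns (ordering, conflicts): ordering maps each linkage group to
    its contigs sorted by median marker position; conflicts lists
    contigs with markers on more than one linkage group."""
    by_contig = defaultdict(lambda: defaultdict(list))
    for contig, group, cm in markers:
        by_contig[contig][group].append(cm)

    placed = defaultdict(list)  # linkage group -> [(median cM, contig)]
    conflicts = []
    for contig, groups in by_contig.items():
        if len(groups) > 1:
            conflicts.append(contig)  # candidate misassembly
        # Anchor the contig on the group that holds most of its markers.
        best = max(groups, key=lambda g: len(groups[g]))
        placed[best].append((median(groups[best]), contig))

    ordering = {g: [c for _, c in sorted(v)] for g, v in placed.items()}
    return ordering, sorted(conflicts)


markers = [
    ("ctgB", "LG1", 30.0),
    ("ctgA", "LG1", 5.0), ("ctgA", "LG1", 7.5),
    ("ctgC", "LG1", 12.0), ("ctgC", "LG1", 14.0), ("ctgC", "LG2", 80.0),
]
ordering, conflicts = order_contigs(markers)
```

    With only one or two markers per contig this gives an order but no orientation, and a single conflicting marker may be a mapping error rather than a real misjoin, which is why the power to detect anything depends so heavily on marker density.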



    • #3
      Thank you very much SAS for your answer.

      The program you pointed to, prinseq, seems like an excellent choice not only for homopolymer correction but for general quality control tasks. I will definitely give it a try.

      By editing contigs I was referring to exactly what you describe: taking information from physical data or other experimental analyses and using it to improve the quality of your contigs. Your approach seems good, but could you tell me which programs or procedures you would follow to integrate the information?

      The question should still remain open for everyone to post their best solution for these problems.



      • #4
        Originally posted by NPalopoli
        By editing contigs I was referring to exactly what you describe: taking information from physical data or other experimental analyses and using it to improve the quality of your contigs. Your approach seems good, but could you tell me which programs or procedures you would follow to integrate the information?
        A direct approach would be to use CMap, along with a GBrowse adaptor, to display your markers on your contigs. This is far too broad a subject for me to recommend a canned approach, and as I stated before, whatever approach you take will depend on the resources you are working with.



        • #5
          Excellent view on the subject, SES, thanks a lot.

          It seems this is not a matter of choosing the right parameters for a program; rather, the whole approach should be tailored to the needs of the project. CPAN is always a good place to look, but my first guess would have been that more direct, easier-to-implement alternatives are out there. I will have to take the time to review more literature and ask for further guidance when the time comes.

          May this thread be open to any other perspectives on the subject.



          • #6
            The approach I mentioned previously was really for displaying markers, but could probably be adapted for your needs. For ordering contigs on a genome scale, the normal approach is to construct a physical map based on restriction patterns (using FPC). On a small scale, you can use Consed to incorporate restriction data to order contigs, but I have not personally used that feature.



            • #7
              A physical map based on restriction patterns (or SNPs, micro/minisatellites, etc.) is always helpful. I didn't know about Consed, though; that's a nice recommendation.
