  • freestile
    Member
    • Aug 2014
    • 11

    Genome assembly and RNA-seq mapping

    Hi everyone, this is my first post here, although I have been following the forum for many months. My question is whether anyone has experience with genome assembly followed by mapping of RNA-seq data.
    I used Velvet, Celera and CLC to make de novo assemblies of a relatively difficult bacterium, and I got relatively good statistics with Velvet and Celera (checked against published draft genomes). I then checked the assemblies by mapping RNA-seq data from the same organism and obtained low mapping percentages (near 50%), but with the CLC assembly this goes up to 90%. I tried to optimise the CLC assembly, but if I reduce the data or apply more stringent trimming, the assembly quality decreases.

    I am using 250 bp paired-end reads from the MiSeq platform.
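A side note for readers comparing assemblies in the way this post describes: the post does not say which statistics were used, but N50 is a common contiguity measure. The following is a minimal Python sketch, assuming a plain FASTA file of contigs; the function names are illustrative and not taken from any of the assemblers mentioned.

```python
def contig_lengths(fasta_path):
    """Return the length of every sequence in a plain-text FASTA file."""
    lengths = []
    current = 0
    seen_header = False
    with open(fasta_path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous contig, if there was one.
                if seen_header:
                    lengths.append(current)
                seen_header = True
                current = 0
            elif seen_header:
                current += len(line)
    if seen_header:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L hold at least half the bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly
```

N50 only describes contiguity, so an assembly can score well on it and still be missing expressed genes, which is consistent with the mismatch between statistics and mapping rate reported here.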

    Best regards !
    Cristian.
  • SylvainL
    Senior Member
    • Feb 2012
    • 180

    #2
    Hi,

    for bacteria, I was using Edena http://www.genomic.ch/edena.php with good results.

    Maybe you could give it a try...
    Last edited by SylvainL; 10-31-2014, 05:38 AM.


    • freestile
      Member
      • Aug 2014
      • 11

      #3
      Originally posted by SylvainL
      Hi,

      for bacteria, I was using Edena http://www.genomic.ch/edena.php with good results.

      Maybe you could give it a try...
      Thanks for the reply. I will try it. I obtain a high RNA-seq read mapping rate with the A5 pipeline, but I can't improve the genome assembly statistics.


      • freestile
        Member
        • Aug 2014
        • 11

        #4
        Originally posted by SylvainL
        Hi,

        for bacteria, I was using Edena http://www.genomic.ch/edena.php with good results.

        Maybe you could give it a try...
        I tried Edena, but I can't get my reads to the same length. If you can help me, I would appreciate it.


        • SylvainL
          Senior Member
          • Feb 2012
          • 180

          #5
          Hi,

          what do you mean by not getting the same length of your reads? In my case, when the Q-scores were bad at the end of the reads, I had to trim them... You can easily do this step with the FASTX-Toolkit (fastx_trimmer in this case) or seqtk (trimfq)...

          Let me know at which step you are stuck...


          • freestile
            Member
            • Aug 2014
            • 11

            #6
            Originally posted by SylvainL
            Hi,

            what do you mean by not getting the same length of your reads? In my case, when the Q-scores were bad at the end of the reads, I had to trim them... You can easily do this step with the FASTX-Toolkit (fastx_trimmer in this case) or seqtk (trimfq)...

            Let me know at which step you are stuck...
            The error occurs when Edena starts:

            Rapid file(s) examination... 158 220
            [err] All reads within a file must be the same length.

            I did the pre-processing with BBMap. Maybe the problem is the paired-end data?
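One way to check whether pre-processing left reads of mixed lengths is to tabulate the read lengths in each FASTQ file. The following is a minimal Python sketch, assuming standard four-line FASTQ records; the function names are illustrative and are not part of Edena or BBMap.

```python
from collections import Counter
import gzip


def open_text(path):
    """Open a plain or gzip-compressed text file for reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def read_length_histogram(fastq_path):
    """Count how many reads have each length (assumes 4 lines per record)."""
    lengths = Counter()
    with open_text(fastq_path) as handle:
        for line_number, line in enumerate(handle):
            # The sequence is the second line of every four-line record.
            if line_number % 4 == 1:
                lengths[len(line.rstrip("\r\n"))] += 1
    return lengths
```

If the histogram for a file has more than one key, that file contains reads of different lengths, which is the condition the error message above complains about.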


            • fahmida
              Member
              • Aug 2010
              • 54

              #7
              I am also getting the same error, "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads (e.g. 100 or 150 nt long) are trimmed for quality, adapters etc., the read length becomes variable.


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                For bacterial assemblies SPAdes (http://bioinf.spbau.ru/spades) should be in your list of programs to try.


                • freestile
                  Member
                  • Aug 2014
                  • 11

                  #9
                  I tried SPAdes, but I really can't obtain a good assembly with this specific data set (I tried it with data from other bacteria and got good results). This is the main reason why I am trying other assemblers.


                  • SylvainL
                    Senior Member
                    • Feb 2012
                    • 180

                    #10
                    Originally posted by fahmida
                    I am also getting the same error, "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads (e.g. 100 or 150 nt long) are trimmed for quality, adapters etc., the read length becomes variable.
                    In this case, I would advise you to look at the minimum length of your reads and trim all of them to this minimum length, or, depending on your FastQC report, I would go without adapter trimming... It is really worth trying Edena. As an example, for a total de novo assembly of Staphylococcus aureus, I got 12 contigs (which stopped because of the rRNA operons).
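The "trim everything to the minimum length" suggestion could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: uncompressed four-line FASTQ input, hypothetical function names, and trimming from the 3' end. Tools such as fastx_trimmer or seqtk trimfq, mentioned earlier in the thread, are the usual way to do this.

```python
def iter_fastq(handle):
    """Yield (header, sequence, plus, quality) from a 4-line-per-record FASTQ."""
    while True:
        header = handle.readline().rstrip("\r\n")
        if not header:
            return  # end of file (or a blank line, treated as the end)
        seq = handle.readline().rstrip("\r\n")
        plus = handle.readline().rstrip("\r\n")
        qual = handle.readline().rstrip("\r\n")
        yield header, seq, plus, qual


def min_read_length(path):
    """Shortest read length in the file (raises ValueError if the file is empty)."""
    with open(path) as handle:
        return min(len(seq) for _, seq, _, _ in iter_fastq(handle))


def trim_to_length(in_path, out_path, length):
    """Cut every read to `length` bases from the 3' end; return reads written.

    Reads shorter than `length` are skipped. With paired-end data that would
    put the two files out of sync, so use the same `length` for both files
    and make sure it does not exceed the minimum of either file.
    """
    kept = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for header, seq, plus, qual in iter_fastq(src):
            if len(seq) < length:
                continue
            dst.write(f"{header}\n{seq[:length]}\n{plus}\n{qual[:length]}\n")
            kept += 1
    return kept
```

A possible call would be `trim_to_length("reads_R1.fastq", "reads_R1.trimmed.fastq", min_read_length("reads_R1.fastq"))`, where the file names are placeholders. If a handful of reads are very short, the minimum can be impractically small and a lot of sequence is thrown away.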

                    s.
                    Last edited by SylvainL; 11-05-2014, 11:41 PM.


                    • freestile
                      Member
                      • Aug 2014
                      • 11

                      #11
                      Thanks for your help. I finally ran Edena, but I can't improve the RNA-seq mapping (~50%) compared with CLC, A5 and SPAdes (~90%). I think the next step is to choose an assembly with relatively good statistics and at least ~90% RNA-seq mapping.

                      Regards.


                      • SylvainL
                        Senior Member
                        • Feb 2012
                        • 180

                        #12
                        Originally posted by freestile
                        Thanks for your help. I finally ran Edena, but I can't improve the RNA-seq mapping (~50%) compared with CLC, A5 and SPAdes (~90%). I think the next step is to choose an assembly with relatively good statistics and at least ~90% RNA-seq mapping.

                        Regards.
                        When you say your RNA-seq mapping is low, do you mean that only 50% of your RNA-seq reads map to your assembly? If yes, did you try a BLAST on some unmapped reads to see if they really come from your bacterium (or a close relative)?
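A rough Python sketch of that check: count mapped and unmapped primary records in a SAM file and write a random sample of the unmapped reads to FASTA for a BLAST search. It assumes an uncompressed, tab-separated SAM file; the function names are hypothetical, and all unmapped reads are held in memory, which may not suit very large files.

```python
import random

# Bits of the SAM FLAG field, as defined in the SAM specification.
UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def summarize_sam(sam_path):
    """Return (primary record count, list of (name, sequence) for unmapped reads)."""
    total = 0
    unmapped = []
    with open(sam_path) as handle:
        for line in handle:
            if line.startswith("@"):
                continue  # header line
            fields = line.rstrip("\n").split("\t")
            flag = int(fields[1])
            if flag & (SECONDARY | SUPPLEMENTARY):
                continue  # count each read once, via its primary record
            total += 1
            if flag & UNMAPPED:
                unmapped.append((fields[0], fields[9]))
    return total, unmapped


def mapped_percentage(total, n_unmapped):
    """Percentage of primary records that are mapped."""
    return 100.0 * (total - n_unmapped) / total if total else 0.0


def write_fasta_sample(unmapped, fasta_path, sample_size=100, seed=0):
    """Write up to `sample_size` randomly chosen unmapped reads as FASTA."""
    rng = random.Random(seed)
    chosen = rng.sample(unmapped, min(sample_size, len(unmapped)))
    with open(fasta_path, "w") as out:
        for name, seq in chosen:
            out.write(f">{name}\n{seq}\n")
    return len(chosen)
```

With paired-end data each mate has its own record, so the percentage here is per read rather than per pair.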

                        Another question: after mapping the RNA-seq reads to your assembly, do you have some regions which are not covered at all?
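Uncovered regions can be listed from a per-base depth table. The sketch below assumes a three-column, tab-separated file (contig, 1-based position, depth) that includes zero-depth positions, such as the output of `samtools depth -a`; the function name and the `min_depth` parameter are illustrative.

```python
def uncovered_regions(depth_path, min_depth=1):
    """Return (contig, start, end) runs where depth is below `min_depth`.

    Coordinates are 1-based and inclusive. Positions missing from the table
    are not reported, so zero-depth rows must be present in the input.
    """
    regions = []
    current = None  # [contig, start, end] of the run being extended
    with open(depth_path) as handle:
        for line in handle:
            if not line.strip():
                continue
            contig, pos, depth = line.rstrip("\n").split("\t")[:3]
            pos, depth = int(pos), int(depth)
            if depth < min_depth:
                if current and current[0] == contig and current[2] + 1 == pos:
                    current[2] = pos  # extend the current run
                else:
                    if current:
                        regions.append(tuple(current))
                    current = [contig, pos, pos]
            elif current:
                regions.append(tuple(current))
                current = None
    if current:
        regions.append(tuple(current))
    return regions
```

Keep in mind that in RNA-seq data a gap in coverage is expected wherever a region is not expressed, so uncovered stretches are a hint to inspect rather than proof of a misassembly.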

                        Were these RNA-seq libraries prepared with rRNA depletion? Do you have rRNA operons in your assembly?


                        • freestile
                          Member
                          • Aug 2014
                          • 11

                          #13
                          Originally posted by SylvainL
                          When you say your RNA-seq mapping is low, do you mean that only 50% of your RNA-seq reads map to your assembly? If yes, did you try a BLAST on some unmapped reads to see if they really come from your bacterium (or a close relative)?

                          Another question: after mapping the RNA-seq reads to your assembly, do you have some regions which are not covered at all?

                          Were these RNA-seq libraries prepared with rRNA depletion? Do you have rRNA operons in your assembly?
                          Yes, only 50%, and I BLASTed some unmapped reads and they correspond to bacterial genes.

                          I have to check the second question.

                          And yes, I did rRNA depletion during library preparation.

                          Regards !

