  • freestile
    Member
    • Aug 2014
    • 11

    Genome assembly and RNA-seq mapping

    Hi everyone, this is my first time posting here, although I have followed the forum for many months. My question is whether anyone has experience with genome assembly followed by mapping of RNA-seq data.
    I used Velvet, Celera and CLC to make de novo assemblies of a relatively difficult bacterium, and I get relatively good statistics with Velvet and Celera (checked against published draft genomes). I then checked the assemblies by mapping RNA-seq data from the same organism and obtained low mapping percentages (near 50%), but with the CLC assembly this goes up to 90%. I tried to optimise the CLC assembly, but if I decrease the data or apply more stringent trimming, the assembly quality decreases.

    I use 250 bp PE reads from the MiSeq platform.

    Best regards !
    Cristian.
  • SylvainL
    Senior Member
    • Feb 2012
    • 180

    #2
    Hi,

    For bacteria, I was using Edena (http://www.genomic.ch/edena.php) with good results.

    Maybe you could give it a try...
    Last edited by SylvainL; 10-31-2014, 05:38 AM.


    • freestile
      Member
      • Aug 2014
      • 11

      #3
      Thanks for the reply. I will try it. I obtain high RNA-seq read mapping with the A5 pipeline, but I can't improve the genome assembly statistics.


      • freestile
        Member
        • Aug 2014
        • 11

        #4
        I tried Edena, but I can't get all my reads to the same length. If you can help me, I would appreciate it.


        • SylvainL
          Senior Member
          • Feb 2012
          • 180

          #5
          Hi,

          What do you mean by not getting the same length for your reads? In my case, when the Q-scores were bad at the end of the reads, I had to trim them... You can easily do this step with the FASTX-Toolkit (fastx_trimmer in this case) or seqtk (trimfq)...

          Let me know at which step you are stuck...


          • freestile
            Member
            • Aug 2014
            • 11

            #6
            The error occurs when Edena starts:

            Rapid file(s) examination... 158 220
            [err] All reads within a file must be the same length.

            I did the pre-processing with BBMap. Maybe the problem is the paired-end data?
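
The Edena message above says that a file contains reads of different lengths. A quick way to confirm this is to tabulate the read lengths in each FASTQ file. The following is a minimal sketch, assuming plain 4-line-per-record FASTQ; the function name and the example reads are made up for illustration and are not the poster's data.

```python
from collections import Counter
from io import StringIO


def read_length_counts(handle):
    """Count read lengths in a 4-line-per-record FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record holds the sequence
            counts[len(line.strip())] += 1
    return counts


# Tiny in-memory example (hypothetical reads)
example = StringIO(
    "@r1\nACGTACGT\n+\nIIIIIIII\n"
    "@r2\nACGT\n+\nIIII\n"
    "@r3\nACGTACGT\n+\nIIIIIIII\n"
)
counts = read_length_counts(example)
print(sorted(counts.items()))  # [(4, 1), (8, 2)]
```

For a real file, pass an open file handle instead of the StringIO object. More than one distinct length in the result would explain the error.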


            • fahmida
              Member
              • Aug 2010
              • 54

              #7
              I am also getting the same error, "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads (e.g. 100 or 150 nt long) are trimmed for quality, adapters, etc., the read length becomes variable.


              • GenoMax
                Senior Member
                • Feb 2008
                • 7142

                #8
                For bacterial assemblies SPAdes (http://bioinf.spbau.ru/spades) should be in your list of programs to try.


                • freestile
                  Member
                  • Aug 2014
                  • 11

                  #9
                  I tried SPAdes, and I really can't obtain a good assembly with my specific data (I tried it with data from other bacteria and got good results). This is the main reason why I am trying other assemblers.


                  • SylvainL
                    Senior Member
                    • Feb 2012
                    • 180

                    #10
                    Originally posted by fahmida
                    I am also getting the same error, "[err] All reads within a file must be the same length". I am not sure I understand the rationale behind this. Once the raw reads (e.g. 100 or 150 nt long) are trimmed for quality, adapters, etc., the read length becomes variable.
                    In this case, I would advise you to look at the minimum length of your reads and trim all of them to this minimum length or, depending on your FastQC report, I would go without adapter trimming... It is really worthwhile trying Edena. As an example, for a total de novo assembly of Staphylococcus aureus, I got 12 contigs (which stopped because of the rRNA operons).

                    s.
                    Last edited by SylvainL; 11-05-2014, 11:41 PM.
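
The suggestion above (find the shortest read, then trim every read to that length) can be sketched as follows. This is only an illustration under stated assumptions: plain 4-line FASTQ, trimming from the 3' end, and all reads held in memory; the function names and the example reads are hypothetical. In practice a tool such as fastx_trimmer or seqtk trimfq, mentioned earlier in the thread, would do the trimming.

```python
from io import StringIO


def parse_fastq(handle):
    """Yield (header, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().strip()
        if not header:  # end of file (or trailing blank line)
            return
        seq = handle.readline().strip()
        handle.readline()  # '+' separator line, not needed here
        qual = handle.readline().strip()
        yield header, seq, qual


def trim_to_min_length(records):
    """Trim every read from the 3' end to the length of the shortest read."""
    records = list(records)
    min_len = min(len(seq) for _, seq, _ in records)
    return [(h, s[:min_len], q[:min_len]) for h, s, q in records]


# Hypothetical example: one 8 bp read and one 5 bp read
example = "@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nACGTA\n+\nIIIII\n"
trimmed = trim_to_min_length(parse_fastq(StringIO(example)))
for header, seq, qual in trimmed:
    print(f"{header}\n{seq}\n+\n{qual}")
```

Note that a single very short read forces every read down to its length, so it may be worth checking the length distribution first and discarding the shortest reads before trimming.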


                    • freestile
                      Member
                      • Aug 2014
                      • 11

                      #11
                      Thanks for your help. I finally ran Edena, but I can't improve the RNA-seq mapping (~50%) compared with CLC, A5 and SPAdes (~90%). I think the next step is to choose an assembly with relatively good statistics and at least ~90% RNA-seq mapping.

                      Regards.
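
The thread does not say which assembly statistics are being compared. Assuming N50 is among them (a common choice, but an assumption here), it can be computed from the contig lengths of each assembly as in this small sketch. The contig lengths below are invented for illustration.

```python
def n50(contig_lengths):
    """Return the N50: the contig length at which the running total of
    contigs, sorted longest first, reaches half of the assembly size."""
    lengths = sorted(contig_lengths, reverse=True)
    half = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half:
            return length
    return 0  # empty input


# Hypothetical contig lengths (total 300, half 150)
print(n50([100, 80, 50, 30, 20, 10, 10]))  # 80
```

Comparing this value side by side with the RNA-seq mapping percentage of each assembly gives the two criteria mentioned in the post above.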


                      • SylvainL
                        Senior Member
                        • Feb 2012
                        • 180

                        #12
                        When you say your RNA-seq mapping is low, do you mean that only 50% of your RNA-seq reads are mapped to your assembly? If yes, did you try a BLAST on some unmapped reads to see if they really come from your bacterium (or something close to it)?

                        Another question: after mapping the RNA-seq reads to your assembly, do you have some regions which are not covered at all?

                        Was the RNA-seq performed with rRNA depletion? Do you have rRNA operons in your assembly?
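
For the first question above, the mapped fraction and the unmapped reads can be pulled from the mapper's SAM output. The sketch below assumes an uncompressed SAM stream and relies only on the SAM columns (QNAME, FLAG, SEQ) and the standard flag bits 0x4 (unmapped), 0x100 (secondary) and 0x800 (supplementary). The function name and the example records are hypothetical.

```python
from io import StringIO

UNMAPPED = 0x4
SECONDARY = 0x100
SUPPLEMENTARY = 0x800


def mapping_summary(sam_handle):
    """Return (mapped_fraction, unmapped_reads) for a SAM stream.

    Secondary and supplementary records are skipped so that each read
    is counted once. unmapped_reads is a list of (name, sequence).
    """
    mapped = 0
    unmapped = []
    for line in sam_handle:
        if line.startswith("@"):  # header line
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue
        if flag & UNMAPPED:
            unmapped.append((name, seq))
        else:
            mapped += 1
    total = mapped + len(unmapped)
    return (mapped / total if total else 0.0), unmapped


# Hypothetical SAM records: two mapped reads, one unmapped, one secondary
example = StringIO(
    "@SQ\tSN:contig1\tLN:1000\n"
    "r1\t0\tcontig1\t10\t60\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n"
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTGGGG\tIIIIIIII\n"
    "r3\t16\tcontig1\t50\t60\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII\n"
    "r3\t256\tcontig1\t90\t0\t8M\t*\t0\t0\tGGGGCCCC\tIIIIIIII\n"
)
fraction, unmapped = mapping_summary(example)
print(f"mapped: {fraction:.1%}")  # mapped: 66.7%
for name, seq in unmapped:
    print(f">{name}\n{seq}")  # FASTA records that can be pasted into BLAST
```

A handful of the FASTA records printed at the end is usually enough for a first BLAST check of where the unmapped reads come from.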


                        • freestile
                          Member
                          • Aug 2014
                          • 11

                          #13
                          Yes, only 50%. I blasted some unmapped reads and they correspond to bacterial genes.

                          I still have to check the second question.

                          And yes, I did rRNA depletion during library preparation.

                          Regards !
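
The still-open second question (regions of the assembly with no RNA-seq coverage) could be checked from a per-base depth table. This sketch assumes a three-column, tab-separated input of contig, 1-based position and depth that includes zero-depth positions, such as the output of `samtools depth -a`. The function name and the example values are hypothetical.

```python
from io import StringIO


def zero_coverage_regions(depth_handle):
    """Return (contig, start, end) intervals, 1-based inclusive, with depth 0."""
    regions = []
    current = None  # [contig, start, end] of the zero-depth run being built
    for line in depth_handle:
        if not line.strip():
            continue
        contig, pos, depth = line.split()[:3]
        pos, depth = int(pos), int(depth)
        if depth == 0:
            if current and current[0] == contig and current[2] == pos - 1:
                current[2] = pos  # extend the open run
            else:
                if current:
                    regions.append(tuple(current))
                current = [contig, pos, pos]  # start a new run
        elif current:
            regions.append(tuple(current))  # covered base closes the run
            current = None
    if current:
        regions.append(tuple(current))
    return regions


# Hypothetical depth table
example = StringIO(
    "contig1\t1\t5\n"
    "contig1\t2\t0\n"
    "contig1\t3\t0\n"
    "contig1\t4\t7\n"
    "contig2\t1\t0\n"
)
regions = zero_coverage_regions(example)
print(regions)  # [('contig1', 2, 3), ('contig2', 1, 1)]
```

Some uncovered regions are expected in any case, because genes that are not expressed under the sampled condition yield no reads.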
