  • cement_head
    Senior Member
    • Mar 2012
    • 264

    SOLiD for Genomes

    Does SOLiD work well for genomes with a lot of repeats? Theoretically it should, but in practice?

    Thanks,
  • Chipper
    Senior Member
    • Mar 2008
    • 323

    #2
    No. Besides being obsolete, it gave reads that were far too short.


    • cement_head
      Senior Member
      • Mar 2012
      • 264

      #3
      Hello,

      It is not obsolete: doesn't Complete Genomics (BGI) use sequencing-by-ligation?

      URL: http://bgi-international.com/service...her-platforms/

      -Andor


      • cmbetts
        Senior Member
        • Jun 2012
        • 120

        #4
        They may both use sequencing by ligation, but SOLiD and Complete Genomics are different technologies. As far as I can tell, SOLiD has been discontinued, having been beaten by Illumina and replaced by Ion Torrent long ago.
        Either would still be inappropriate for de novo genome sequencing. Complete has always been exclusively for human genome resequencing, and the colorspace reads of SOLiD were best when a reference was available, because a single sequencing error shifts the base decoding of everything downstream of it.
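
To make the point about colour-space errors concrete, here is a minimal sketch. It assumes the standard SOLiD two-base encoding, in which each colour is the XOR of the 2-bit codes of two adjacent bases (A=0, C=1, G=2, T=3). The function names and the example read are made up for illustration.

```python
# Assumed 2-bit base codes for SOLiD two-base encoding.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Turn a base sequence into a list of colours (one per adjacent pair)."""
    return [CODE[a] ^ CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base, colours):
    """Rebuild bases from a known first base plus the colour calls."""
    bases = [first_base]
    for colour in colours:
        bases.append(BASE[CODE[bases[-1]] ^ colour])
    return "".join(bases)


true_seq = "ATGGCATTCA"  # hypothetical read
colours = encode(true_seq)
assert decode(true_seq[0], colours) == true_seq

# Introduce a single miscalled colour and decode without a reference.
bad = list(colours)
bad[3] ^= 1
decoded_bad = decode(true_seq[0], bad)
mismatches = sum(x != y for x, y in zip(true_seq, decoded_bad))

print(true_seq)
print(decoded_bad)
print(mismatches, "of", len(true_seq), "bases differ after one colour error")
```

With a reference, the single odd colour can be spotted and corrected before translating; without one, every base after the error is wrong.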


        • colindaven
          Senior Member
          • Oct 2008
          • 417

          #5
          There are still quite a few SOLiDs out there; see, for example, this data set that just went into the SRA:

          http://www.ncbi.nlm.nih.gov/sra/ERX1488475[accn]

          Raw read accuracy is excellent, but keep in mind that paired-end reads do not really work at all (R1 was ~75 bp, 60 bp after trimming, and R2 was pure rubbish).

          A 60 bp SE read is too short to place accurately in many/most genomes. Also, de novo assembly simply does not work, which rules out everything other than resequencing applications (and you need a very good reference genome too).


          • Brian Bushnell
            Super Moderator
            • Jan 2014
            • 2709

            #6
            My experience with SOLiD 4 was that it had terrible accuracy... on both read 1 and read 2.


            • westerman
              Rick Westerman
              • Jun 2008
              • 1104

              #7
              Originally posted by colindaven
              A 60 bp SE read is too short to place accurately in many/most genomes.
              Going off topic here (the topic being that the SOLiD is not good for de novo work), I wonder where you get that statement from. It seems to me that 60 quality bases would be enough to place a read accurately, except in long repeat regions (e.g., LTRs).


              • colindaven
                Senior Member
                • Oct 2008
                • 417

                #8
                @westerman

                It wasn't clear from the start whether the topic was de novo or reference based assembly.

                Have a look at the genome mappability score which came out of Mike Schatz's lab as one example (http://bioinformatics.oxfordjournals...8/16/2097.full).

                Even with 100bp perfect simulated single reads there are regions which cannot be mapped to reliably. Therefore, 60 bp reads containing errors won't be so nice to deal with. I remember working on human twin genomes and getting ~40-50,000 differences in VCF despite various SNP callers and stringent mapping quality filters.
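
As a rough illustration of why read length matters in repeats, here is a toy sketch. It is not the genome mappability score from the paper linked above, only a much cruder proxy: the fraction of k-mer start positions whose exact sequence occurs once in the sequence. The toy genome and the function name are invented for the example.

```python
from collections import Counter


def unique_fraction(genome, k):
    """Fraction of k-mer start positions whose k-mer occurs exactly once."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    unique = sum(1 for kmer in kmers if counts[kmer] == 1)
    return unique / len(kmers)


# Hypothetical toy genome: two flanks around a short tandem repeat.
repeat = "ACGTTGCA" * 3
genome = "GATTACAGGCTTAAC" + repeat + "CCGGATATCGGAATC"

for k in (4, 8, 12):
    print(k, round(unique_fraction(genome, k), 2))
```

Error-free reads and exact matching are the best case; real reads with errors against a repeat-rich genome will do worse than this proxy suggests.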



                By the way, I work on plant genomes, where repetitive regions can make up > 80% of the genome, so I thought the original poster might have similar issues.


                • RickC7
                  Member
                  • Feb 2010
                  • 31

                  #9
                  Reagent support for SOLiD runs until May 2017, or ends sooner depending on demand.

                  We use/used SOLiD for SAGE, great for short reads but more expensive than Illumina runs. Converting everything over to Illumina adapters now...

                  The couple of times we did targeted reseq or whole transcriptome, reverse read quality was bad.


                  • cement_head
                    Senior Member
                    • Mar 2012
                    • 264

                    #10
                    Ok, thanks


                    • gringer
                      David Eccles (gringer)
                      • May 2011
                      • 845

                      #11
                      Originally posted by westerman
                      Going off topic here (the topic being that the SOLiD is not good for de novo work), I wonder where you get that statement from. It seems to me that 60 quality bases would be enough to place a read accurately, except in long repeat regions (e.g., LTRs).
                      I suspect I've discussed this with you previously, but I might as well say things I haven't said before:

                      Homopolymers look identical in colour-space, which causes havoc for transcriptome assemblies (e.g. distinguishing between poly-T and poly-A sequences). Other simple repeats would also cause issues for genomic assembly (e.g. ACACACACAC and GTGTGTGTGT are identical, despite having both a base shift and a complementation). The assemblies are only likely to be useful in colour-space, because colour-space errors propagate through as very different sequences in base-space. Also, every contig has four possible base-space representations, which among other things makes it quite difficult to use other genome assemblies as scaffolds for a colour-space assembly.
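
A small sketch of these ambiguities, assuming the standard SOLiD two-base encoding (colour = XOR of the 2-bit codes of adjacent bases, with A=0, C=1, G=2, T=3). The helper names are made up for illustration.

```python
# Assumed 2-bit base codes for SOLiD two-base encoding.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASE = "ACGT"


def encode(seq):
    """Colour string for a base sequence (one colour per adjacent pair)."""
    return "".join(str(CODE[a] ^ CODE[b]) for a, b in zip(seq, seq[1:]))


def all_decodings(colours):
    """Every base-space sequence consistent with a colour string."""
    decodings = []
    for first in BASE:
        code = CODE[first]
        seq = [first]
        for colour in colours:
            code ^= int(colour)
            seq.append(BASE[code])
        decodings.append("".join(seq))
    return decodings


# Homopolymers collapse to the same colour string.
poly_a = encode("AAAAAAAA")
poly_t = encode("TTTTTTTT")
print(poly_a, poly_t)  # both 0000000

# So do simple repeats that differ by a shift plus complementation.
ac = encode("ACACACACAC")
gt = encode("GTGTGTGTGT")
print(ac, gt)  # both 111111111

# Without a known first base, a colour contig has four translations.
candidates = all_decodings(encode("ATGGCA"))
print(candidates)  # ['ATGGCA', 'CGTTAC', 'GCAATG', 'TACCGT']
```

Only one of the four candidates is the real sequence, and nothing in the colours themselves says which.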


                      • cement_head
                        Senior Member
                        • Mar 2012
                        • 264

                        #12
                        Originally posted by gringer
                        I suspect I've discussed this with you previously, but I might as well say things I haven't said before:

                        Homopolymers look identical in colour-space, which causes havoc for transcriptome assemblies (e.g. distinguishing between poly-T and poly-A sequences). Other simple repeats would also cause issues for genomic assembly (e.g. ACACACACAC and GTGTGTGTGT are identical, despite having both a base shift and a complementation). The assemblies are only likely to be useful in colour-space, because colour-space errors propagate through as very different sequences in base-space. Also, every contig has four possible base-space representations, which among other things makes it quite difficult to use other genome assemblies as scaffolds for a colour-space assembly.
                        I guess I still don't understand the "issues" with deconvoluting colour-space. It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly (attached).

                        • gringer
                          David Eccles (gringer)
                          • May 2011
                          • 845

                          #13
                          Originally posted by cement_head
                          It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly.
                          If our preferred model of DNA were colour-space, then it might have been more accurate with sufficient technology development. As it is, Illumina has had plenty of opportunity to improve the accuracy of their technology, and benefits from their chemical model being almost a direct representation of the DNA model that we use for sequencing.


                          • Chipper
                            Senior Member
                            • Mar 2008
                            • 323

                            #14
                            Originally posted by cement_head
                            I guess I still don't understand the "issues" with deconvoluting colour-space. It seems as though it would be much more accurate than sequencing in basespace (e.g. Illumina). That's if I'm reading this paper correctly (attached).
                            The quoted error rate (<0.1%) must be after reference-based correction. The problem with SOLiD was the high raw error rate of the ligation-based chemistry (compared to Illumina) and the short read lengths, which make it essentially useless for de novo assembly.

                            I think the best option today for a large genome and a low budget would be to use the 10x Chromium with HiSeq X (~$2000 for one lane of PE150 linked reads from long fragments).


                            • cement_head
                              Senior Member
                              • Mar 2012
                              • 264

                              #15
                              Originally posted by Chipper
                              The quoted error rate (<0.1%) must be after reference-based correction. The problem with SOLiD was the high raw error rate of the ligation-based chemistry (compared to Illumina) and the short read lengths, which make it essentially useless for de novo assembly.

                              I think the best option today for a large genome and a low budget would be to use the 10x Chromium with HiSeq X (~$2000 for one lane of PE150 linked reads from long fragments).
                              So I took another look at this, and it strikes me that the whole problem is the use of only four fluors for 16 combinations. (It seems odd that solving this, i.e., generating 16 distinct fluors, wasn't the primary goal.) Once I got that part, it became obvious why there's an issue with colourspace. Curiously, I just found out that MiniSeq and NextSeq from Illumina use only two fluors, which seems like a huge potential issue if one isn't resequencing a human genome...
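
The four-fluors-for-16-dinucleotides mapping can be tabulated in a few lines. This is a sketch that assumes the standard SOLiD two-base encoding (colour = XOR of the 2-bit codes of the two bases, with A=0, C=1, G=2, T=3); the variable names are made up for illustration.

```python
from collections import defaultdict
from itertools import product

# Assumed 2-bit base codes for SOLiD two-base encoding.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}

# Group all 16 dinucleotides by the colour they produce.
by_colour = defaultdict(list)
for a, b in product("ACGT", repeat=2):
    by_colour[CODE[a] ^ CODE[b]].append(a + b)

for colour in sorted(by_colour):
    print(colour, by_colour[colour])
# 0 ['AA', 'CC', 'GG', 'TT']
# 1 ['AC', 'CA', 'GT', 'TG']
# 2 ['AG', 'CT', 'GA', 'TC']
# 3 ['AT', 'CG', 'GC', 'TA']

# Given the previous base, each colour points to exactly one next base.
resolved = {a: {CODE[a] ^ CODE[b]: b for b in "ACGT"} for a in "ACGT"}
print(resolved["A"])  # {0: 'A', 1: 'C', 2: 'G', 3: 'T'}
```

Each colour on its own is four-way ambiguous, but it is unambiguous once the preceding base is known, which is why everything hinges on the first base and on every colour call before the one being decoded.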

