  • #16
    Originally posted by rskr
    So, what is the difference between that and looking at a pileup of paired-end Illumina reads? They will get the linkage just as well.
    Well, except that 454 would still fail to find linkage over 400 bp consistently, since the median read is much shorter than that. With sufficient coverage, paired-end data is likely to find linkage up to 800 bp, which is the maximum length of the PCR product.



    • #17
      Originally posted by rskr
      And you can do that with the 454 error model? Last I checked, a mean quality of 30 would guarantee several errors in a 200-400 bp read? Might be better off with a 250-base insert size and 150 bp paired-end reads, with an overlapper that finds the intersection.
      It is hard to estimate the total number of miscalls in a read from a mean quality value. But Q30 is one error per 1000 bases. So, as long as you don't have crazy high quality values offsetting really low values, I would not expect several errors in a 200-400 bp read.

      Also, to the extent that the quality values are accurate, software could use them to weight the likelihood of a given base being a true variation or not. Or, trivially, you could mask out bases that had quality values lower than 30.
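To put rough numbers on the Q30 point, here is a minimal sketch that uses only the standard Phred definition (Q = -10 * log10(p)). The function names and the example quality profiles are hypothetical, made up for illustration; they are not taken from any 454 or Illumina software. It shows why a mean quality alone is hard to interpret: two 400-base reads with the same mean of 30 can have very different expected error counts. It also shows the trivial masking idea.

```python
def phred_to_error_prob(q):
    """Standard Phred relation: Q = -10 * log10(p), so p = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def expected_errors(quals):
    """Expected number of miscalls in a read: the sum of per-base error probabilities."""
    return sum(phred_to_error_prob(q) for q in quals)


def mask_low_quality(seq, quals, min_q=30, mask_char="N"):
    """Replace every base whose quality is below min_q with mask_char."""
    return "".join(b if q >= min_q else mask_char for b, q in zip(seq, quals))


# Two hypothetical 400-base quality profiles with the same mean of 30.
uniform = [30] * 400               # every base at Q30
mixed = [40] * 200 + [20] * 200    # high values offsetting low values

print(round(expected_errors(uniform), 2))  # about 0.4 expected errors
print(round(expected_errors(mixed), 2))    # about 2.02 expected errors
print(mask_low_quality("ACGT", [35, 12, 30, 29]))
```

All of this assumes the quality values are accurate, which is the same caveat as above.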

      Let's not ignore the elephant here: Illumina is producing hundreds of gigabases of sequence per flow cell, whereas a 454 run produces hundreds of megabases. Illumina chemistry has a higher per-run cost than 454, but we are still looking at something approaching a 100x price-per-base differential.

      But the same logic applies to Sanger sequencing, which is at least 100x more expensive per base.

      --
      Phillip



      • #18
        Originally posted by rskr
        Well, except that 454 would still fail to find linkage over 400 bp consistently, since the median read is much shorter than that. With sufficient coverage, paired-end data is likely to find linkage up to 800 bp, which is the maximum length of the PCR product.
        Hmm, I think you may just be trolling here.
        If all goes well, a 454 run will have median read lengths >400 bases.
        --
        Phillip
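The linkage argument in the last few posts comes down to simple geometry, which can be sketched as below. This is an illustration under stated assumptions, not anyone's actual pipeline: the function names are hypothetical, "insert size" is taken to mean the full fragment length spanned by both mates, every read is assumed to be exactly `read_len` bases, and error rates and coverage are ignored. Two variant sites can be linked by one sequenced molecule if both fall inside a single read, or if one falls in each mate of a pair.

```python
def linkable_by_single_read(distance, read_len):
    """Two sites `distance` bases apart fit in one read if distance < read_len."""
    return 0 < distance < read_len


def linkable_by_pair(distance, read_len, insert_size):
    """True if some placement of a read pair covers both sites.

    Assumes a fixed fragment (insert) length and read_len <= insert_size.
    Mate 1 covers fragment offsets 0..read_len-1 and mate 2 covers
    insert_size-read_len..insert_size-1.
    """
    if read_len > insert_size:
        raise ValueError("this sketch assumes read_len <= insert_size")
    if linkable_by_single_read(distance, read_len):
        return True  # both sites inside the same mate
    # One site in each mate: the smallest and largest possible separations.
    shortest = insert_size - 2 * read_len + 1
    longest = insert_size - 1
    return shortest <= distance <= longest


print(linkable_by_single_read(450, 400))   # sites 450 bp apart, one 400-base read
print(linkable_by_pair(200, 150, 250))     # overlapping 150 bp mates, 250 bp insert
print(linkable_by_pair(700, 100, 800))     # 100 bp mates, 800 bp insert
print(linkable_by_pair(600, 100, 800))     # same library, sites 600 bp apart
```

Under these assumptions, a single fixed insert size with non-overlapping mates leaves a band of distances that no pair can link, so in practice the reachable range depends on the spread of insert sizes in the library as well as the maximum.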



        • #19
          Originally posted by pmiguel
          It is hard to estimate the total number of miscalls in a read from a mean quality value. But Q30 is one error per 1000 bases. So, as long as you don't have crazy high quality values offsetting really low values, I would not expect several errors in a 200-400 bp read.

          Also, to the extent that the quality values are accurate, software could use them to weight the likelihood of a given base being a true variation or not. Or, trivially, you could mask out bases that had quality values lower than 30.

          Let's not ignore the elephant here: Illumina is producing hundreds of gigabases of sequence per flow cell, whereas a 454 run produces hundreds of megabases. Illumina chemistry has a higher per-run cost than 454, but we are still looking at something approaching a 100x price-per-base differential.

          But the same logic applies to Sanger sequencing, which is at least 100x more expensive per base.

          --
          Phillip
          Actually, in my experience it does seem to be that most of the errors are concentrated in a few of the reads. I can't explain why that is, but nevertheless that seems to be the case. So, most of the reads are perfect and a few are riddled with errors. It's not usually difficult to find those with the errors and discard them, either.
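One way to find and discard such reads, sketched here as an assumption rather than a description of what was actually done, is to rank reads by their expected number of errors computed from the quality values. The helper names, the toy reads, and the threshold of 1.0 are all hypothetical, and the approach only works to the extent that the quality values flag the bad reads.

```python
def expected_errors(quals):
    """Sum of per-base error probabilities, using p = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in quals)


def split_by_expected_errors(reads, max_expected_errors=1.0):
    """Split (read_id, quals) pairs into kept and discarded read IDs.

    max_expected_errors is an arbitrary illustrative cutoff, not a recommended value.
    """
    kept, discarded = [], []
    for read_id, quals in reads:
        if expected_errors(quals) <= max_expected_errors:
            kept.append(read_id)
        else:
            discarded.append(read_id)
    return kept, discarded


# Toy data: one uniformly good read and one with a stretch of poor bases.
reads = [
    ("read_a", [35] * 300),
    ("read_b", [35] * 250 + [10] * 50),
]
kept, discarded = split_by_expected_errors(reads)
print(kept, discarded)
```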

          It's true that 454 is less cost-effective than Illumina. Most applications can use the shorter read lengths obtained from Illumina, SOLiD, etc., and for those applications it makes a lot more sense to use those technologies. One thing to keep in mind, however, when comparing the amount of data produced: 454 doesn't produce as much data, but in many cases you don't need as much data with 454, either. Simply comparing numbers doesn't tell the whole story. RNA-seq provides an excellent example of where one technology might be better than the other, depending on your experiment. If you're trying to quantify gene expression, Illumina is definitely the way to go. In that case, you're just trying to identify transcripts and count them, and the high number of reads is a boon to your experiment. However, if you're looking for splice variants and don't care so much to quantify expression, 454 is probably a better technology. There will always be a niche for 454, although it's never going to be large.



          • #20
            Originally posted by ajthomas
            However, if you're looking for splice variants and don't care so much to quantify expression, 454 is probably a better technology. There will always be a niche for 454, although it's never going to be large.
            Possible statistical fallacy there, since it doesn't sound like you have actually done any Illumina assemblies (you are generalizing from your one 454 machine). As it turns out, paired-end data is pretty good at finding splice variants. I wasn't expecting that, but we did a transcriptome assembly and then ran PASA on it, and it found plenty of legitimate splice variants (with high alignment coverage). There again, the length of the read doesn't matter; what matters is the insert size during PCR.

            I guess what annoys me is when scientists do 454 transcriptome assemblies and then try to correct errors with Illumina data, when the Illumina paired-end data does a superior job of transcriptome assembly in the first place. I have had this happen a number of times and compared the builds (454 vs. Illumina vs. 454+Illumina), and 454 was not as good; even collaborators remarked on it. I wish it weren't this way, but 454 really isn't competitive at that. The problem is that, to get the raw number of transcripts up, you end up with a great many contigs that consist of only one 454 read, and since 454 reads aren't very accurate in and of themselves, the result is a fairly inaccurate assembly.

            Newbler does report a bunch of isotigs, which essentially amounts to adding duplicates of highly covered genes back into the assembly. I wish they wouldn't do this to pad their N50, but they do. I don't think every isotig is a legitimate splice variant; some are just bifurcations in the graph that aren't resolved. So if you do some sort of clustering analysis, Newbler actually turns out to generate many fewer contigs than similar Illumina-based assemblies.
