After getting the license problems fixed (see my other post) and learning about the new Lifescope CLI and GUI (more on this later), I was finally able to process two recent Bioscope (BS) runs via Lifescope (LS). The results are mixed; however, this may be due to the types of runs -- neither is 'perfect' -- which makes the "truth" a slippery subject.
The first project is an E. coli-based paired-end (50/35 bp) whole-transcriptome project with filtering against an rRNA database. The sample had a lot of rRNA in it and consequently a low mapping rate to the transcriptome.
The second is a honeybee-based 50 bp fragment sequencing project without filtering.
For mapping in the E. coli project, Lifescope mapped about 5% more reads than Bioscope but reported about 20% fewer 'properly paired' reads and many more singletons. I am not sure why this is so. Perhaps the rRNA filtering?
For mapping in the honeybee project, the total numbers of mapped reads for Lifescope and Bioscope were within 0.1% of each other.
As for the interesting-to-the-customer results, for the E. coli WT project these are the tag counts of the exons. Lifescope consistently had the same or lower counts than Bioscope. Part of this may be due to the 20% fewer properly paired reads and part may be due to Lifescope taking mapping qualities into account. In any case the consistency of the numbers was good -- I would not have liked to see wild swings between LS and BS.
For the honeybee project, the most interesting results are SNPs and small InDels. Unfortunately our customer wanted SNPs called at the 'low stringency' cutoff which, in my opinion, causes a lot of false positives. And, of course, the honeybee genome is not 100% finished, which can cause additional problems. Bioscope came up with ~60,000 SNPs while Lifescope came up with only ~30,000. Comparing the two, about 22,000 SNPs were called by both. So it is hard to say which program is better.
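To put those SNP numbers in perspective, here is a minimal Python sketch of the overlap arithmetic. It uses only the rounded counts quoted above (60,000, 30,000 and 22,000 shared); the function names `overlap_summary` and `compare_call_sets` are my own, hypothetical helpers, not part of Bioscope or Lifescope, and the `(chromosome, position)` keying is an assumption about how one might match calls between the two outputs.

```python
def overlap_summary(n_a, n_b, n_shared):
    """Summarize agreement between two call sets from their sizes alone.

    n_a, n_b  -- number of calls in set A and set B (must be positive)
    n_shared  -- number of calls present in both sets
    """
    if n_a <= 0 or n_b <= 0:
        raise ValueError("both call sets must be non-empty")
    if n_shared < 0 or n_shared > min(n_a, n_b):
        raise ValueError("shared count must be between 0 and the smaller set size")
    union = n_a + n_b - n_shared
    return {
        "frac_a_shared": n_shared / n_a,  # fraction of A's calls also in B
        "frac_b_shared": n_shared / n_b,  # fraction of B's calls also in A
        "jaccard": n_shared / union,      # shared calls / all distinct calls
        "only_a": n_a - n_shared,         # calls unique to A
        "only_b": n_b - n_shared,         # calls unique to B
    }


def compare_call_sets(calls_a, calls_b):
    """Same summary, starting from iterables of hashable call keys.

    Keys are assumed to be (chromosome, position) tuples; that is an
    illustrative choice, not something either program dictates.
    """
    a, b = set(calls_a), set(calls_b)
    return overlap_summary(len(a), len(b), len(a & b))


# Rounded counts from the honeybee run: A = Bioscope, B = Lifescope.
summary = overlap_summary(60_000, 30_000, 22_000)
for name, value in summary.items():
    print(name, value)
```

With these rounded counts, roughly 37% of the Bioscope calls and 73% of the Lifescope calls are shared, leaving about 38,000 Bioscope-only and 8,000 Lifescope-only SNPs. That says the Lifescope set is mostly a subset of the Bioscope set, but without a truth set it does not say which of the unshared calls are real.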
Overall, as I said, the results are mixed. It is hard to say conclusively that Lifescope provides more accurate results than Bioscope. I really need to use projects that utilize model organisms with really clean data sets. Alas, in our line of work, these are few and far between. I will post more if and when I get back to Lifescope in a non-production-let's-play-with-it moment. At the moment the new Casava (1.8) is calling to me. After that our SOLiD 5500 will be online and it will be back to production work for me.