I've used Bioscope 1.1 and now have Bioscope 1.2. They did away with a lot of the temporary files. I haven't noticed much improvement otherwise. I got a few of my RNA samples to run but half of them crashed. When I restarted the pipeline it finished, so I'm not exactly sure why, but I suspect NFS delays.
I have a ChIP dataset that I tried to run through Bioscope and it flat-out failed. ABI recommended I continue using the old version of Bioscope until they have a fix; that was over a week ago now.
At this point, I'm not using Bioscope anymore. It looks like BWA or BFAST for color space reads.
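If anyone else is heading that way, color-space mapping with BWA looks roughly like this (a sketch only, not something I have run yet -- the -c options exist only in the older 0.5.x releases, and the file names are placeholders):
# convert the csfasta/qual files to fastq first (BWA ships a solid2fastq.pl for this)
bwa index -c reference.fa                               # -c builds the color-space index
bwa aln -c -t 4 reference.fa sample.fastq > sample.sai  # -c tells aln the reads are color space
bwa samse reference.fa sample.sai sample.fastq > sample.sam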
-
Bioscope sounds like a complex system that is memory-hungry and CPU-intensive. That said, I've used corona-lite in the past and it seemed a lot more difficult to work with, especially given the amount of computational time required for the alignment stage.
We've been developing a new aligner for AB color space, novoalignCS, featuring:
1. Mate-pair alignment (F3 & R3) of csfasta/csfastq. If reads are in bead order for F3/R3 mates, then pairs are identified and mapped accordingly.
2. Gapped alignment with mismatches, by default.
3. SAM output (with read-group/RG support). We've been using samtools and Picard to validate our SAM records (a rough sketch of that step follows at the end of this post).
4. Requires < 10 GB of RAM for matching against human/mouse/chimp, etc.
5. Multithreaded (and MPI cluster-aware in the near future).
6. Polyclonal and color-error filtering based on the SOPRA method (Sasson & Michael, 2010).
7. Calculates the mate-pair fragment-length distribution given an initial estimate, e.g. a 5 kb library with SD = 500.
We are still busy with testing and comparison against other aligners such as BFAST and BWA, but at this point we do welcome feedback from beta testers. If anybody is interested in obtaining a version, please PM me or visit our site.
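For the curious, the validation mentioned in point 3 is nothing exotic; roughly this (file names are placeholders, and the Picard invocation assumes the older per-tool jars):
samtools view -bS novocs_out.sam > novocs_out.bam   # check that the header and every record parse
samtools flagstat novocs_out.bam                    # sanity-check the mapping and pairing counts
java -jar ValidateSamFile.jar INPUT=novocs_out.bam MODE=SUMMARY
ValidateSamFile is the stricter of the two and tends to flag things like missing read-group headers that samtools lets through.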
-
Originally posted by clariet: Just saw this post. We were able to run the whole-transcriptome pipeline of BioScope (1.0.1-42) on an RNA-seq dataset. A note about its mapping statistics: I confirmed with their specialists that the current version of BS has a bug in those numbers, so it will be fixed in the next release, hopefully very soon.
We have a feeling that a large proportion of reads are wasted for SOLiD data compared to Solexa. For example, for a current ChIP-seq dataset we have seen an average of 80M reads generated per sample (quad). However, after filtering out low-quality alignments and non-unique hits, only ~4% of reads could be used for further peak detection. Has anyone had a similar experience? Does this sound normal?
I'm currently using BioScope v1.0.1. May I know what the bug in the statistics is?
-
Haven't had success with the 8 GB nodes yet. Will keep trying as time permits.
I'll agree that Bioscope does not handle temporary network glitches. While these should not occur, I find that my disk appliance and network do get overwhelmed -- rarely, but certainly -- with lots of SOLiD processes hitting them, to the point where a request gets shunted off to the side and Bioscope goes belly up. :-( It is not that hard to write software that can handle temporary glitches. One retry is all I am asking for.
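Even a shell-level retry around the launch would cover it; a sketch (run_bioscope.sh and analysis.plan are placeholders, not ABI's actual launcher or plan file):
run_bioscope.sh analysis.plan || { sleep 60; run_bioscope.sh analysis.plan; }   # wait a minute, then try exactly once more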
-
Originally posted by westerman: I have two clusters. One has 8 machines, 16 CPUs each, 128 GB of memory each, all connected to fast disk. However, I can only run the command-line Bioscope on it. With that much machine power I do not worry about running out of memory.
My other cluster also has 8 machines: 4 with 4 CPUs and 8 GB of memory each, and the other 4 with 8 CPUs and 32 GB of memory each. I have been trying to run WT-Bioscope on these machines with less success: I am running out of memory and sometimes getting kernel warnings. My current parameters are:
mapping.np.per.node=4
mapping.number.of.nodes=10
mapping.memory.size=3
In other words, 4 CPUs per node and 10 nodes (I am treating my 8-CPU machines as 2 nodes each, so in theory I should have 12 nodes, but I wanted to leave some processing power free).
The memory parameter is 3 GB, but I am unsure what this really means. Does Bioscope start one 4-CPU job per node using only 3 GB in total? Or does it start four 1-CPU jobs per node, using 4 x 3 GB of memory? It appears to do the latter, since my 8 GB machines have to use virtual memory at times.
I really hesitate to go below 3 GB since my genome reference is ~2 Gbases. As far as I can tell, Bioscope is chopping the matching portion of its pipeline into many small chunks in order to accommodate this small memory allocation.
Anyway, I would say that the more memory you have, the better off you are. It makes more sense to run a few jobs with lots of memory than many jobs each starved for memory.
Once I get Bioscope running on my small cluster using all 8 machines, I will try it using just the 4 large-memory machines. Our small cluster is a sort of 'recycled' cluster (some of the machines were given to us) and we would like to use it if possible. I hate to think that a 4-CPU, 8 GB machine is just so much junk that we should re-gift it, but for Bioscope at least it appears those machines may indeed be worthless.
I have been wrestling with ABI to try to make it work, but they are less responsive once told I am working with 8 GB machines.
I am just trying to map mouse transcriptome reads at this time, and so far the 'big' memory jobs complete.
It's the small 2 GB jobs that fail, possibly because of temporary network glitches that Bioscope isn't written to handle; I was advised to restart the job.
Do drop me a PM or a reply here if you get the 8 GB machines working.
Otherwise I think they would be good enough for BWA or Bowtie mapping.
-
I have two clusters. One has 8 machines, 16 CPUs each, 128 GB of memory each, all connected to fast disk. However, I can only run the command-line Bioscope on it. With that much machine power I do not worry about running out of memory.
My other cluster also has 8 machines: 4 with 4 CPUs and 8 GB of memory each, and the other 4 with 8 CPUs and 32 GB of memory each. I have been trying to run WT-Bioscope on these machines with less success: I am running out of memory and sometimes getting kernel warnings. My current parameters are:
mapping.np.per.node=4
mapping.number.of.nodes=10
mapping.memory.size=3
In other words, 4 CPUs per node and 10 nodes (I am treating my 8-CPU machines as 2 nodes each, so in theory I should have 12 nodes, but I wanted to leave some processing power free).
The memory parameter is 3 GB, but I am unsure what this really means. Does Bioscope start one 4-CPU job per node using only 3 GB in total? Or does it start four 1-CPU jobs per node, using 4 x 3 GB of memory? It appears to do the latter, since my 8 GB machines have to use virtual memory at times.
I really hesitate to go below 3 GB since my genome reference is ~2 Gbases. As far as I can tell, Bioscope is chopping the matching portion of its pipeline into many small chunks in order to accommodate this small memory allocation.
Anyway, I would say that the more memory you have, the better off you are. It makes more sense to run a few jobs with lots of memory than many jobs each starved for memory.
Once I get Bioscope running on my small cluster using all 8 machines, I will try it using just the 4 large-memory machines. Our small cluster is a sort of 'recycled' cluster (some of the machines were given to us) and we would like to use it if possible. I hate to think that a 4-CPU, 8 GB machine is just so much junk that we should re-gift it, but for Bioscope at least it appears those machines may indeed be worthless.
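If the memory parameter really is per process, the arithmetic on an 8 GB node works out to 4 x 3 GB = 12 GB of heap before the OS gets anything, which would explain the swapping. Under that (unconfirmed) interpretation, something like this should at least fit in 8 GB -- the values are my own guesses, not anything ABI has told me:
mapping.np.per.node=2
mapping.number.of.nodes=10
mapping.memory.size=3
That caps the mapping jobs at roughly 2 x 3 GB = 6 GB per node and leaves ~2 GB for the OS and page cache, at the cost of longer run times.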
-
Originally posted by westerman: Yes. I can run fragment mapping through to diBayes calls, pairing through to diBayes calls, and most recently the whole-transcriptome calling, all from the command line. It has been a bear to get running smoothly, since the assumptions the various pipelines run under seem to be different.
-
Just saw this post. We were able to run the whole-transcriptome pipeline of BioScope (1.0.1-42) on an RNA-seq dataset. A note about its mapping statistics: I confirmed with their specialists that the current version of BS has a bug in those numbers, so it will be fixed in the next release, hopefully very soon.
We have a feeling that a large proportion of reads are wasted for SOLiD data compared to Solexa. For example, for a current ChIP-seq dataset we have seen an average of 80M reads generated per sample (quad). However, after filtering out low-quality alignments and non-unique hits, only ~4% of reads could be used for further peak detection. Has anyone had a similar experience? Does this sound normal?
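For concreteness, the filter is along the lines of a mapping-quality cutoff with samtools (a sketch only -- the file name and the q20 threshold are placeholders, not exactly what our pipeline uses):
samtools view -b -q 20 sample.bam > sample.q20.bam   # drop multi-mappers and low-confidence alignments
samtools flagstat sample.q20.bam                     # count what survives for peak detection
Most aligners give multi-mapping reads a MAPQ of 0 or close to it, so the -q cutoff removes the non-unique hits along with the poor alignments.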
-
That is great news that you were able to get it running on your own data. Thanks for the quick post. I am in the process of getting it up and running on our data too; I just wanted to know whether someone had been successful.
-
Yes. I can run fragment mapping through to diBayes calls, pairing through to diBayes calls, and most recently the whole-transcriptome calling, all from the command line. It has been a bear to get running smoothly, since the assumptions the various pipelines run under seem to be different.
-
Sorry, a little late to this discussion, but has anyone been able to get Bioscope to run on their own data? (That is, the command-line version, BioScope-1.0.1-42.)
-
I think that ABI/LifeTech are still just releasing the Bioscope software on an 'as-needed' basis until such time as they have it in releasable form. The last public link they have at the web site is for SAET; I do not see a public mention of Bioscope.
-
Originally posted by skblazer: Excuse me, where can I download Bioscope?
-
Originally posted by drio: I have been using it for re-sequencing (still testing). BS bundles a bunch of different experiment types, WT among them. I would say download the software and start by running the examples that come with it. Once you have it up and running, modify it to work with your data and test it out.
ABI supports SGE and PBS. The installation in non-root mode is not very invasive, so I would suggest you start with that.
Let us know how it goes.