I've used Bioscope 1.1 and now have Bioscope 1.2. They did away with a lot of the temporary files. I haven't noticed much improvement otherwise. I got a few of my RNA samples to run but half of them crashed. When I restarted the pipeline it finished, so I'm not exactly sure why, but I suspect NFS delays.
I have a ChIP dataset that I tried to run through Bioscope and it flat-out failed. ABI recommended I continue using the old version of Bioscope until they have a fix; that was over a week ago now.
At this point, I'm not using Bioscope anymore. It looks like BWA or BFAST for color space reads.
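If anyone else is heading that way, color-space mapping with BWA looks roughly like this (a sketch only, not something I have run yet -- the -c options exist only in the older 0.5.x releases, and the file names are placeholders):
# convert the csfasta/qual files to fastq first (BWA ships a solid2fastq.pl for this)
bwa index -c reference.fa                               # -c builds the color-space index
bwa aln -c -t 4 reference.fa sample.fastq > sample.sai  # -c tells aln the reads are color space
bwa samse reference.fa sample.sai sample.fastq > sample.sam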
-
Bioscope sounds like a complex system that is memory-hungry and CPU-intensive. That said, I've used corona-lite in the past and it seemed a lot more difficult to work with, especially given the amount of computational time required for the alignment stage.
We've been developing a new aligner for AB color space, novoalignCS, featuring:
1. Mate-pair alignment (F3 & R3) of csfasta/csfastq. If reads are in bead order for F3/R3 mates, then pairs are identified and mapped accordingly.
2. Gapped alignment with mismatches, by default.
3. SAM output (with read-group/RG support). We've been using samtools and Picard to validate our SAM records (a rough sketch of that step follows at the end of this post).
4. Requires < 10 GB of RAM for matching against human/mouse/chimp, etc.
5. Multithreaded (and MPI cluster-aware in the near future).
6. Polyclonal and color-error filtering based on the SOPRA method (Sasson & Michael, 2010).
7. Calculates the mate-pair fragment-length distribution given an initial estimate, e.g. a 5 kb library with SD = 500.
We are still busy with testing and comparison against other aligners such as BFAST and BWA, but at this point we do welcome feedback from beta testers. If anybody is interested in obtaining a version, please PM me or visit our site.
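For the curious, the validation mentioned in point 3 is nothing exotic; roughly this (file names are placeholders, and the Picard invocation assumes the older per-tool jars):
samtools view -bS novocs_out.sam > novocs_out.bam   # check that the header and every record parse
samtools flagstat novocs_out.bam                    # sanity-check the mapping and pairing counts
java -jar ValidateSamFile.jar INPUT=novocs_out.bam MODE=SUMMARY
ValidateSamFile is the stricter of the two and tends to flag things like missing read-group headers that samtools lets through.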
-
Originally posted by clariet: Just saw this post. We were able to run the whole-transcriptome pipeline of BioScope (1.0.1-42) on an RNA-seq dataset. A note about its mapping statistics: I confirmed with their specialists that the current version of BS has a bug in those numbers, so it will be fixed in the next release, hopefully very soon.
We have a feeling that a large proportion of reads are wasted for SOLiD data compared to Solexa. For example, for a current ChIP-seq dataset we have seen an average of 80M reads generated per sample (quad). However, after filtering out low-quality alignments and non-unique hits, only ~4% of reads could be used for further peak detection. Has anyone had a similar experience? Does this sound normal?
I'm currently using BioScope v1.0.1. May I know what the bug in the statistics is?
-
Haven't had success with the 8 GB nodes yet. Will keep trying as time permits.
I'll agree that Bioscope does not handle temporary network glitches. While these should not occur, I find that my disk appliance and network do get overwhelmed -- rarely, but certainly -- with lots of SOLiD processes hitting them, to the point where a request gets shunted off to the side and Bioscope goes belly up. :-( It is not that hard to write software that can handle temporary glitches. One retry is all I am asking for.
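Even a shell-level retry around the launch would cover it; a sketch (run_bioscope.sh and analysis.plan are placeholders, not ABI's actual launcher or plan file):
run_bioscope.sh analysis.plan || { sleep 60; run_bioscope.sh analysis.plan; }   # wait a minute, then try exactly once more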
-
Originally posted by westerman: I have two clusters. One has 8 machines, 16 CPUs each, 128 GB of memory each, all connected to fast disk. However, I can only run the command-line Bioscope on it. With that much machine power I do not worry about running out of memory.
My other cluster also has 8 machines: 4 with 4 CPUs and 8 GB of memory each, and the other 4 with 8 CPUs and 32 GB of memory each. I have been trying to run WT-Bioscope on these machines with less success: I am running out of memory and sometimes getting kernel warnings. My current parameters are:
mapping.np.per.node=4
mapping.number.of.nodes=10
mapping.memory.size=3
In other words, 4 CPUs per node and 10 nodes (I am treating my 8-CPU machines as 2 nodes each, so in theory I should have 12 nodes, but I wanted to leave some processing power free).
The memory parameter is 3 GB, but I am unsure what this really means. Does Bioscope start one 4-CPU job per node using only 3 GB in total? Or does it start four 1-CPU jobs per node, using 4 x 3 GB of memory? It appears to do the latter, since my 8 GB machines have to use virtual memory at times.
I really hesitate to go below 3 GB since my genome reference is ~2 Gbases. As far as I can tell, Bioscope is chopping the matching portion of its pipeline into many small chunks in order to accommodate this small memory allocation.
Anyway, I would say that the more memory you have, the better off you are. It makes more sense to run a few jobs with lots of memory than many jobs each starved for memory.
Once I get Bioscope running on my small cluster using all 8 machines, I will try it using just the 4 large-memory machines. Our small cluster is a sort of 'recycled' cluster (some of the machines were given to us) and we would like to use it if possible. I hate to think that a 4-CPU, 8 GB machine is just so much junk that we should re-gift it, but for Bioscope at least it appears those machines may indeed be worthless.
I have been wrestling with ABI to try to make it work, but they are less responsive once told I am working with 8 GB machines.
I am just trying to map mouse transcriptome reads at this time, and so far the 'big' memory jobs complete.
It's the small 2 GB jobs that fail, possibly because of temporary network glitches that Bioscope isn't written to handle; I was advised to restart the job.
Do drop me a PM or a reply here if you get the 8 GB machines working.
Otherwise I think they would be good enough for BWA or Bowtie mapping.
-
I have two clusters. One has 8 machines, 16 CPUs each, 128 GB of memory each, all connected to fast disk. However, I can only run the command-line Bioscope on it. With that much machine power I do not worry about running out of memory.
My other cluster also has 8 machines: 4 with 4 CPUs and 8 GB of memory each, and the other 4 with 8 CPUs and 32 GB of memory each. I have been trying to run WT-Bioscope on these machines with less success: I am running out of memory and sometimes getting kernel warnings. My current parameters are:
mapping.np.per.node=4
mapping.number.of.nodes=10
mapping.memory.size=3
In other words, 4 CPUs per node and 10 nodes (I am treating my 8-CPU machines as 2 nodes each, so in theory I should have 12 nodes, but I wanted to leave some processing power free).
The memory parameter is 3 GB, but I am unsure what this really means. Does Bioscope start one 4-CPU job per node using only 3 GB in total? Or does it start four 1-CPU jobs per node, using 4 x 3 GB of memory? It appears to do the latter, since my 8 GB machines have to use virtual memory at times.
I really hesitate to go below 3 GB since my genome reference is ~2 Gbases. As far as I can tell, Bioscope is chopping the matching portion of its pipeline into many small chunks in order to accommodate this small memory allocation.
Anyway, I would say that the more memory you have, the better off you are. It makes more sense to run a few jobs with lots of memory than many jobs each starved for memory.
Once I get Bioscope running on my small cluster using all 8 machines, I will try it using just the 4 large-memory machines. Our small cluster is a sort of 'recycled' cluster (some of the machines were given to us) and we would like to use it if possible. I hate to think that a 4-CPU, 8 GB machine is just so much junk that we should re-gift it, but for Bioscope at least it appears those machines may indeed be worthless.
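If the memory parameter really is per process, the arithmetic on an 8 GB node works out to 4 x 3 GB = 12 GB of heap before the OS gets anything, which would explain the swapping. Under that (unconfirmed) interpretation, something like this should at least fit in 8 GB -- the values are my own guesses, not anything ABI has told me:
mapping.np.per.node=2
mapping.number.of.nodes=10
mapping.memory.size=3
That caps the mapping jobs at roughly 2 x 3 GB = 6 GB per node and leaves ~2 GB for the OS and page cache, at the cost of longer run times.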
-
Originally posted by westerman: Yes. I can run fragment mapping through to diBayes calls, pairing through to diBayes calls, and most recently the whole-transcriptome calling, all from the command line. It has been a bear to get running smoothly, since the assumptions the various pipelines run under seem to be different.
-
Just saw this post. We were able to run the whole-transcriptome pipeline of BioScope (1.0.1-42) on an RNA-seq dataset. A note about its mapping statistics: I confirmed with their specialists that the current version of BS has a bug in those numbers, so it will be fixed in the next release, hopefully very soon.
We have a feeling that a large proportion of reads are wasted for SOLiD data compared to Solexa. For example, for a current ChIP-seq dataset we have seen an average of 80M reads generated per sample (quad). However, after filtering out low-quality alignments and non-unique hits, only ~4% of reads could be used for further peak detection. Has anyone had a similar experience? Does this sound normal?
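For concreteness, the filter is along the lines of a mapping-quality cutoff with samtools (a sketch only -- the file name and the q20 threshold are placeholders, not exactly what our pipeline uses):
samtools view -b -q 20 sample.bam > sample.q20.bam   # drop multi-mappers and low-confidence alignments
samtools flagstat sample.q20.bam                     # count what survives for peak detection
Most aligners give multi-mapping reads a MAPQ of 0 or close to it, so the -q cutoff removes the non-unique hits along with the poor alignments.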
-
That is great news that you were able to get it running on your own data. Thanks for the quick post. I am in the process of getting it up and running on our data too; I just wanted to know whether someone had been successful.
-
Yes. I can run fragment mapping through to diBayes calls, pairing through to diBayes calls, and most recently the whole-transcriptome calling, all from the command line. It has been a bear to get running smoothly, since the assumptions the various pipelines run under seem to be different.
-
Sorry, a little late to this discussion, but has anyone been able to get Bioscope to run on their own data? (That is, the command-line version, BioScope-1.0.1-42.)
-
I think that ABI/LifeTech are still just releasing the Bioscope software on an 'as-needed' basis until such time as they have it in releasable form. The last public link they have at the web site is for SAET; I do not see a public mention of Bioscope.
-
Originally posted by skblazer: Excuse me, where can I download Bioscope?
-
Originally posted by drio: I have been using it for re-sequencing (still testing). BS bundles a bunch of different experiment types, WT among them. I would say download the software and start by running the examples that come with it. Once you have it up and running, modify it to work with your data and test it out.
ABI supports SGE and PBS. The installation in non-root mode is not very invasive, so I would suggest you start with that.
Let us know how it goes.