**westerman** · 09-23-2009, 07:43 AM

You can run the corona-lite generated scripts without a queuing system. I have done it many times. If the program says it can't find your 'scratch folder' ... well, that means you do not have a scratch folder. You can assign one via the '--scratch=' command line parameter. Type in '--help' to see the possible parameters.

**nedoluzhko** · 09-23-2009, 11:17 PM

Thank you! I should add the '--scratch=' parameter to pairing_by_group.pl, shouldn't I? I think I have another problem: when I started pairing_by_group.pl, Corona Lite requested some variables such as f3_length and r3_length (see the attached file). I cannot find these variables for the pairing_by_group.pl script in the manual...

Attached Files

bug_23.09.09.jpg (91.6 KB, 32 views)

**westerman** · 09-24-2009, 06:07 AM

The f3_length, etc. parameters are optional. The message you see is merely informational. The manual does not explain them but if you do a 'pairing_by_group.pl --help' then all of the parameters are explained.

One example:

-f3l, --f3_length <arg> Match length of F3 tag. Use if match length is different than the tag length

I do note that you are getting a lot of permission errors. You should correct these.

**nedoluzhko** · 09-28-2009, 01:27 AM

Thank you very much! May I ask whether there is a more detailed manual for Corona Lite? I do not understand much of this program because I am a beginner in bioinformatics... This pdf (http://solidsoftwaretools.com/gf/dow...ion_v4.2.1.pdf) does not give me much information.

**westerman** · 09-28-2009, 07:36 AM

Corona Lite documentation is in a variety of places and it is often frustrating to find everything. The pdf that you quote is the manual for SNP discovery. There is also a 42-page guide titled "SOLiD Data Analysis Pipeline". I am not sure where I got my copy -- perhaps from ABI instead of solidsoftwaretools? -- but it is a useful resource. If I get the chance I will look for it.

**nedoluzhko** · 09-30-2009, 04:41 AM

Dear Westerman! Many thanks to you! Please answer several questions for me. I have statistics after mapping and I don't understand some terms. Are beads the same as reads or not? Please see the attached file: what is "number 1"? Is "number 2" (divergence from the reference sequence) SNPs or sequencing errors? Is 83% of the reference genome not covered (see number 5)? What are numbers 3 and 4?

Attached Files

questions.jpg (88.4 KB, 31 views)

**westerman** · 09-30-2009, 11:19 AM

#1) refers to how the beads were mapped. 19.7% of them had zero colorspace mismatches that were next to any other colorspace mismatch. These beads may have other mismatches but none of them are adjacent to each other. Or the beads may not have had any mismatches. #2 below covers this in more detail.

0.51% of the beads had mismatches that were adjacent to other mismatches; however, these mismatches are considered to be 'valid' (and thus probably SNPs) -- as you may know, of the 16 possible dual mismatch combinations only 4 are considered to be valid transitions.

#2) This sort of repeats #1 but in more detail. We can tell there were 16.7M beads with an 'error' (i.e., a mismatch) and 14.7M of these had a single mismatch. By inference we can then say that of the 27.4M beads with zero adjacent mismatches (from #1), 10.7M had zero mismatches (27.4M minus the 16.7M beads with an error).

We then look at the beads with adjacent errors and see that 701K of them had valid adjacent errors (mismatches) -- these are likely to be SNPs -- while 286K of them had invalid errors; these latter beads are considered 'errors of sequencing' and are discarded and not used.
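The bead accounting above can be checked with a few lines of Python. This is only a sketch using the rounded figures quoted in this post; the variable names are made up, and the subtraction assumes that the 10.7M figure is 27.4M minus the 16.7M beads with an error.

```python
# Rounded figures quoted in the post (bead counts).
zero_adjacent = 27_400_000    # beads with zero adjacent mismatches (#1)
with_error = 16_700_000       # beads with at least one mismatch (#2)
single_mismatch = 14_700_000  # beads with exactly one mismatch (#2)
valid_adjacent = 701_000      # beads with valid adjacent mismatches
invalid_adjacent = 286_000    # beads with invalid adjacent mismatches

# Assumption: zero-mismatch beads = zero-adjacent beads minus beads with an error.
zero_mismatch = zero_adjacent - with_error

# Share of beads with adjacent mismatches that pass the 'valid' rule.
valid_fraction = valid_adjacent / (valid_adjacent + invalid_adjacent)

print(f"zero-mismatch beads: {zero_mismatch / 1e6:.1f}M")
print(f"valid share of adjacent-mismatch beads: {valid_fraction:.1%}")
print(f"single-mismatch share of beads with an error: {single_mismatch / with_error:.1%}")
```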

---> To actually see the SNPs you need to continue the pipeline and run the SNP calling portion. The above only gives you a quick indication of how the beads are mapping.

#3 and #4) There were 26M points (bases) on your reference where the first 'base' of a bead was placed (or mapped). On average, 2.46 beads were placed (mapped) at each of these 26M points.

--> It looks like you have a ~2 GBase reference sequence and 139M beads. Since you have a much larger reference than the number of beads, in an ideal situation each bead should be able to be placed down in its own unique point (base). Of course real life is never that ideal but, still, it seems like you have managed to amplify your starting material in such a way that too few points (bases) are covered and that those which are covered have too many beads covering them. The sequencing itself seems to be reasonable (46% of the beads matching is a bit low but I have seen even lower) and the number of sequencing errors is very low.
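A rough Python check of the numbers in #3 and #4, using only the rounded figures from this post (the variable names are made up and the ~2 GBase reference size is approximate):

```python
# Rounded figures quoted in the post.
total_beads = 139_000_000         # beads in the run
start_points = 26_000_000         # reference positions where a bead starts
beads_per_start = 2.46            # average beads placed per start position
reference_size = 2_000_000_000    # approximate reference length in bases

# Mapped beads implied by start points times the average beads per start point.
mapped_beads = start_points * beads_per_start
mapped_fraction = mapped_beads / total_beads

# Fraction of reference positions that serve as a start point for any bead.
start_point_fraction = start_points / reference_size

print(f"mapped beads: {mapped_beads / 1e6:.1f}M ({mapped_fraction:.0%} of all beads)")
print(f"reference positions with a bead start: {start_point_fraction:.1%}")
```

The product of start points and beads per start point is consistent with the 46% matching figure, and it shows how few reference positions have any bead starting on them.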

Corona Lite 4.0 Pairing Pipeline problem
