Old 07-16-2015, 01:39 PM   #1
Len Trigg
Registered Vendor
 
Location: New Zealand

Join Date: Jun 2011
Posts: 29
RTG 3.5: Somatic calling / metagenomics / variant comparison / BSD Licensing

Real Time Genomics are pleased to announce the availability of new releases of our full analysis suite, RTG Core (commercial / free for non-commercial use), and our utility package, RTG Tools (free for any use). This release includes new features and performance improvements. Some of the highlights of this release:

* Several improvements to somatic variant calling, including the ability to specify site-specific somatic priors, control of output for gain-of-reference and loss-of-heterozygosity events, and changes to the VCF output to align with the TCGA VCF specification.

* Improvements to metagenomic species reference database management. Several new options allow better customization of a species reference, and extraction of genomic information for individual species contained within the reference database.

* Improvements to our sophisticated variant comparison tool vcfeval, primarily the ability to restrict evaluation to individual regions or sets of regions (for example GIAB high-confidence intervals or exome target regions), and the inclusion of more accuracy metrics, both in a new summary file and in the weighted ROC data file.

* We are also pleased to make the source code to RTG Tools available under the Simplified BSD License, on github. (Source code for RTG Core remains available for non-commercial use.)

* Many other minor improvements (full release notes for this version are detailed below).

If you haven't used RTG Core before (or maybe even if you have), it includes a nice new demo script that runs through an end-to-end demonstration of sex-aware and pedigree-aware family variant calling, including de novo variant detection and variant evaluation with vcfeval. (It also makes a nice demo of our comprehensive simulation tools.)

Commercial users of RTG Core may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the source on github (note the updated build instructions).

Users of RTG Tools, which is made freely available for non-commercial or commercial use alike, can download the new version from our website at http://realtimegenomics.com/products/rtg-tools or build from the source code on github.


Detailed changes are listed below by area. Please read these through fully, as some command-line flags have changed, so updates to your pipeline scripts may be required. For more information on new features, see the RTG Operations Manual.

RTG Core 3.5 (2015-07-16)
-------------------------

### Basic Formatting and Mapping

* format/map: When formatting or mapping reads supplied as SAM/BAM
input data, any alignments marked as supplementary are ignored.
Note that if the input data has already been aligned, it is
recommended that the BAM file be shuffled to avoid biases during
mapping arising from the data being presented in chromosomal
order. See the user manual for more information.

* sdf2fasta/sdf2fastq: These commands have new flags --names and
--id-file that operate the same as their counterparts in sdfsubset.

* sdfsubset: This command has new flags --start-id and --end-id that
allow specifying a range of sequences by ID.

* sdf2sam: This new command allows the extraction of reads from SDF
in the form of unaligned SAM/BAM. This has a benefit over
extraction as FASTQ in that some metadata (such as read group
information) is preserved, paired end data is stored in a single
file, and quality encoding is inherent in the format.

* chrstats: Reduce false positives in sex inconsistency detection that
were due to applying the (tighter) sex-chromosome threshold also to
autosomes. This threshold is now applied to sex-chromosomes only.
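
As an illustration of the format/map note above about supplementary alignments, here is a minimal Python sketch of how such records can be recognised in SAM text. It relies only on the FLAG bit 0x800, which the SAM specification defines for supplementary alignments. The function name and the example records are invented for illustration, and this is not RTG's own implementation.

```python
# SAM FLAG bit for supplementary alignments, as defined by the SAM specification.
SUPPLEMENTARY = 0x800


def non_supplementary_records(sam_lines):
    """Yield SAM alignment lines that are not flagged as supplementary.

    Header lines (starting with '@') are skipped, since they carry no FLAG.
    """
    for line in sam_lines:
        if line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second mandatory SAM column
        if flag & SUPPLEMENTARY:
            continue
        yield line


# Hypothetical records, for illustration only.
example = [
    "@HD\tVN:1.5\tSO:unsorted",
    "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "read1\t2048\tchr2\t500\t60\t4M\t*\t0\t0\tACGT\tIIII",
]
kept = list(non_supplementary_records(example))
```

Here only the first alignment of read1 is kept, because the second carries FLAG 2048 (0x800).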

### Variant Calling and Analysis

* somatic: Now allows the user to specify a BED file containing
per-site somatic priors, which can be used (for example) to reduce
the somatic prior at sites typical of false positives (e.g. presence
in dbSNP) or increase the somatic prior at sites known to harbour
somatic variants (e.g. presence in COSMIC). For more information
see the user manual.

* somatic: At the end of variant calling, the somatic caller produces
an estimate of somatic sample contamination. Previously this
estimate was only available in the log file, but in this release
this computation has been greatly improved, and the contamination
estimate is now included in the standard summary statistics.

* somatic: "Gain of reference" calls are now disabled by default.
These can be included by specifying the new flag
--include-gain-of-reference.

* somatic: Calls that are indicative of loss of heterozygosity (LOH)
are not produced by default (since loss of heterozygosity
analysis is most useful in conjunction with additional data such as
germline variant calls or CNV data). These calls can be produced if
desired by specifying --loh with a prior greater than 0.

* somatic: When LOH calls are enabled, they were previously output in
haploid GT representation. They now use the ploidy appropriate for
the chromosome (according to the reference), for compatibility with
downstream processing tools.

* somatic: VCF output changes to bring the somatic representation in
line with the TCGA 1.2 VCF specification. In particular:

* Calls include a new FORMAT field SS that indicates the somatic
status for the derived (tumor) sample. This field replaces the
previous SOMATIC INFO field.

* Calls include a new FORMAT field SSC which contains the somatic
score for the derived (tumor) sample. This field replaces the
previous RSS INFO field.

* lineage: Supports the input of pedigree in the form of VCF header
annotations as output by the somatic caller, in the form:

##PEDIGREE=<Derived=TUMORSAMPLENAME,Original=NORMALSAMPLENAME>

* population: Fixed a rare case where, after complex call
simplification, the only sample genotype containing a non-ref allele
belonged to a member of the pedigree not being output. In this case
the QUAL score was -10log10 prob(no variant) rather than -10log10
prob(variant) as required by the VCF specification.

* vcfmerge: Added a new flag --force-merge-all to always attempt to
merge headers containing conflicting descriptions.

* vcfmerge: Previously vcfmerge would not process records containing
symbolic alleles. These are now accepted.

* vcfmerge: More graceful handling when encountering records with a GT
that refers to a non-existent ALT.

* vcfeval: Now outputs a summary containing various accuracy
metrics. A first set of statistics is computed from the full set of
variants evaluated (these will typically have highest sensitivity
but potentially poor precision if the input call set has not been
filtered). A second set of statistics is computed based on the ROC
curve information, selected at a threshold which maximises the
F-measure statistic (this provides some balance between sensitivity
and precision, so may be a fairer point to gather statistics for
cross-caller comparison).

* vcfeval: The weighted_roc.tsv file now includes columns containing
additional accuracy metrics.

* vcfeval: Improved the detection that alerts the user when chromosome
names are incompatible between reference, baseline, calls, and bed
regions (if used). Improvements to other error and warning messages.

* vcfeval: Added a new flag --bed-regions to supply a BED file
containing a list of regions that the VCF records must overlap with
in order to be included in analysis. For example, a common use case
is to restrict to only evaluating calls contained within the GIAB
high-confidence regions, or only within regions corresponding to
exome target regions.

* vcfeval: Added a new flag --region to specify a single region to
evaluate variants within. This is useful when evaluating calls on a
single chromosome or within a small region of interest.

* vcfeval: Fixed a case where a ref-only call (i.e. containing no
alts) could get output instead of an indel with a padding base at
the same position.

* vcfeval: Disabled the output of slope analysis data files by default,
as these are fairly special purpose (primary ROC files are still
output). They can be re-enabled if desired by using the new
expert/experimental flag --Xslope-files.

* vcffilter: The --remove-all-same-as-ref flag now does not consider a
sample with missing GT as being variant, since the intent of this
flag is to retain only records where at least one sample is called
as variant.

* vcfannotate: Added two new flags --info-id and --info-description to
allow specifying the name of the INFO ID and Description fields
added to the header during annotation. These flags only take effect
if the VCF header does not already contain an INFO declaration with
that ID.
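
To make the vcfeval summary described above more concrete, here is a rough Python sketch of picking the ROC threshold that maximises the F-measure. It assumes each ROC point gives cumulative true-positive and false-positive counts for calls at or above a score, and uses the usual definitions: precision = TP / (TP + FP), sensitivity = TP / (number of baseline variants), and F-measure as their harmonic mean. The function name and the counts are made up, and vcfeval's own computation (which works from weighted counts) may differ in detail.

```python
def best_f_measure(roc_points, baseline_total):
    """Return (score, precision, sensitivity, f_measure) maximising F-measure.

    roc_points: iterable of (score, true_positives, false_positives), where
    the counts are cumulative over all calls at or above that score.
    Returns None if no point yields a defined F-measure.
    """
    best = None
    for score, tp, fp in roc_points:
        if baseline_total == 0 or tp + fp == 0:
            continue  # precision or sensitivity undefined
        precision = tp / (tp + fp)
        sensitivity = tp / baseline_total
        if precision + sensitivity == 0:
            continue
        f_measure = 2 * precision * sensitivity / (precision + sensitivity)
        if best is None or f_measure > best[3]:
            best = (score, precision, sensitivity, f_measure)
    return best


# Made-up cumulative counts, not real vcfeval output.
points = [(50.0, 80, 2), (30.0, 90, 10), (10.0, 95, 60)]
result = best_f_measure(points, baseline_total=100)
```

With these invented numbers the middle threshold wins: the strictest one gives up too much sensitivity, and the loosest one admits too many false positives.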

### Metagenomics

* taxfilter: Added a new flag --subtree which allows selecting entire
taxonomic subtrees for inclusion in the output taxonomy.

* taxfilter: Added a new flag --remove-sequences to allow the removal
of sequence data associated with specific taxon ids.

* sdf2fasta: Added a new flag --taxons to interpret each supplied ID
as a taxon ID, so that all sequences assigned to that taxon ID are
output. This provides an easy way to extract genomic sequence for
any species from the reference SDF.
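
The idea behind selecting a whole taxonomic subtree, as taxfilter --subtree does above, can be sketched in a few lines of Python. This assumes the taxonomy is available as a simple child-to-parent mapping of taxon IDs. The mapping, the IDs, and the function name are all hypothetical, and this does not reflect how RTG stores or traverses its taxonomy.

```python
from collections import defaultdict


def subtree_ids(parent_of, root):
    """Return the set of taxon IDs in the subtree rooted at `root`.

    parent_of maps each taxon ID to the ID of its parent.
    """
    children = defaultdict(list)
    for taxon, parent in parent_of.items():
        children[parent].append(taxon)

    selected, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in selected:
            continue  # guards against a root that lists itself as parent
        selected.add(node)
        stack.extend(children[node])
    return selected


# Toy taxonomy with invented IDs; 1 is the root.
parent_of = {2: 1, 3: 1, 4: 2, 5: 2, 6: 3}
selected = subtree_ids(parent_of, 2)
```

Selecting taxon 2 here picks up its descendants 4 and 5 but leaves the sibling branch (3 and 6) out.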

### Other

* genomesim: Added a new flag --prefix to specify a prefix for
generated sequence names.

* many: Update the base library used for SAM/BAM input and output to
htsjdk 1.128.

* many: VCF reading now detects cases where a header specifies a field
declaration using an ID that is already in use, preventing duplicate
header declarations.

* extract: Fix a regression where extracting from VCF without any
region specified would include the VCF header.
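
For readers who want to check their own VCF files for the duplicate header declarations mentioned above, here is a small Python sketch. It scans structured INFO, FORMAT and FILTER header lines and reports any ID declared more than once within the same field type. The function name and the example header are invented, and this is only an approximation of the kind of check described, not RTG's code.

```python
import re

# Matches structured header lines such as ##INFO=<ID=DP,...>
DECLARATION = re.compile(r"^##(INFO|FORMAT|FILTER)=<ID=([^,>]+)")


def duplicate_declarations(header_lines):
    """Return (field_type, id) pairs that are declared more than once."""
    seen, duplicates = set(), []
    for line in header_lines:
        match = DECLARATION.match(line)
        if not match:
            continue
        key = (match.group(1), match.group(2))
        if key in seen and key not in duplicates:
            duplicates.append(key)
        seen.add(key)
    return duplicates


# Hypothetical header, for illustration only.
header = [
    "##fileformat=VCFv4.1",
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Depth">',
    '##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Sample depth">',
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Read depth">',
]
dups = duplicate_declarations(header)
```

Note that INFO and FORMAT are separate namespaces, so the FORMAT DP line above is not counted as a clash with INFO DP.
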
__________________
Len Trigg, Ph.D.
Real Time Genomics
www.realtimegenomics.com
Old 09-07-2015, 02:53 PM   #2
Len Trigg

New stable releases are now available which include minor improvements and bug fixes.

The first of these is our full analysis suite, RTG Core 3.5.1. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

We have also produced updated builds of our utilities package, RTG Tools 3.5.1, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.


RTG Core 3.5.1 (2015-09-07)
---------------------------

This release primarily includes bugfixes and minor improvements:

* coverage: Fix an exception that could occur when running with a
reference SDF whose chromosomes were in a different order from the
BAM sequence dictionary (typically this could occur when running
coverage on third-party BAMs).

* extract: When extracting multiple regions these regions are now
sorted.

* vcfeval: When an entire chromosome contained only baseline or only
called variants, the summary statistics for FP/FN were not being
incremented correctly.

* vcfeval: Fixed a case where path-finding could get confused and drop
variants.

* vcfeval: Speed improvement in post-processing.

* many: Improved error reporting for commands that involve processing
multiple BAM files, so that the name of the particular file causing
the problem is included.

* wrapper: Fixed the Java version number check so that it works
correctly with OpenJDK 1.8.
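
As background to the wrapper fix above: `java -version` reports a line such as `java version "1.8.0_45"` on Oracle builds and `openjdk version "1.8.0_45"` on OpenJDK 1.8 builds, so a version check needs to accept both prefixes. The Python sketch below shows one way to parse either form. It is illustrative only: the release note does not say what the actual bug was, the real wrapper is not written in Python, and the minimum version used here is an assumed value.

```python
import re

# Accept both the Oracle ("java") and OpenJDK ("openjdk") prefixes.
VERSION_LINE = re.compile(r'^(?:java|openjdk) version "(\d+)\.(\d+)')


def java_major_minor(version_output):
    """Extract (major, minor) from the first line of `java -version` output."""
    lines = version_output.splitlines()
    match = VERSION_LINE.match(lines[0]) if lines else None
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(version_output, minimum=(1, 7)):
    """True if the reported version is at least `minimum`.

    The default minimum is an assumption made for this example.
    """
    version = java_major_minor(version_output)
    return version is not None and version >= minimum


openjdk_ok = is_supported('openjdk version "1.8.0_45-internal"\nOpenJDK Runtime Environment')
oracle_old = is_supported('java version "1.6.0_45"')
```
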
__________________
Len Trigg, Ph.D.
Real Time Genomics
www.realtimegenomics.com
Old 10-15-2015, 05:33 PM   #3
Len Trigg

New stable releases are now available which include minor improvements and bug fixes.

The first of these is our full analysis suite, RTG Core 3.5.2. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

We have also produced updated builds of our utilities package, RTG Tools 3.5.2, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.


RTG Core 3.5.2 (2015-10-15)
---------------------------

This release primarily includes bugfixes and minor improvements:

* many: When piping results from one command to another, and a later
command closes the pipe (e.g. head), this scenario no longer
produces a "Broken pipe" error message. This is consistent with the
behaviour of commonly used command-line tools.

* rocplot: Updated to handle ROC data files that contain lines with
a non-numeric score field. (In particular, future versions of vcfeval
will include additional data-points corresponding to variants with
no score provided.)

* rocplot: (GUI) Improvement to usability for curve renaming. Now a
single-click in the curve title area enters edit mode, with
RETURN/TAB to accept, ESC to cancel.

* rocplot: (GUI) Add a button that prints an equivalent command line
to the terminal, for easy restarting with similar state,
particularly if curve files have been added interactively.

* cgmap: Fix for sample sex being ignored when supplied via a pedigree
file rather than via the explicit sex flag.

* misc: Removed vestigial (and in RTG Tools' case, incorrect)
"Licensed to:" line from the version command output.

* misc: Add BSD license text to the RTG Tools distributable zip.
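
For anyone writing their own scripts around these tools, the "Broken pipe" behaviour described above can be mirrored with a pattern like the following Python sketch: stop writing quietly when the downstream reader has closed the pipe. The `write_lines` function and the `ClosesAfter` stand-in stream are hypothetical names used only for this example, and the approach shown is a common convention rather than a description of how RTG implements it.

```python
import io


def write_lines(lines, stream):
    """Write lines to `stream`, stopping quietly if the reader has gone away.

    Returns the number of lines written before the pipe closed.
    """
    written = 0
    try:
        for line in lines:
            stream.write(line + "\n")
            written += 1
        stream.flush()
    except BrokenPipeError:
        # The downstream consumer (e.g. `head`) closed the pipe: not an error.
        pass
    return written


class ClosesAfter(io.StringIO):
    """Stand-in for a pipe whose reader exits after `limit` writes."""

    def __init__(self, limit):
        super().__init__()
        self.limit = limit
        self.writes = 0

    def write(self, text):
        if self.writes >= self.limit:
            raise BrokenPipeError("reader closed the pipe")
        self.writes += 1
        return super().write(text)


stream = ClosesAfter(limit=2)
count = write_lines(["a", "b", "c", "d"], stream)
```

In a real script the stream would typically be `sys.stdout`, so that piping the output into `head` ends the script without an error message.
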
__________________
Len Trigg, Ph.D.
Real Time Genomics
www.realtimegenomics.com