SEQanswers

05-15-2017, 09:38 PM   #1
Len Trigg
Registered Vendor
 
Location: New Zealand

Join Date: Jun 2011
Posts: 29
RTG 3.8: New QC tools / improved machine learning / free simulation tools

Real Time Genomics are pleased to announce the availability of new releases of our full analysis suite, RTG Core, and our utility package, RTG Tools. This release includes new features and performance improvements. Some of the highlights of this release:

* Improvements aimed at preprocessing and QC. In particular, RTG includes two new commands, fastqtrim and petrim, for preprocessing FASTQ files to apply various kinds of trimming before entering the NGS pipeline. These commands greatly expand what was previously available during data formatting.

* The suite of simulation commands that was previously only available as part of RTG Core has been included in the RTG Tools package. These commands encompass simulation of reference genomes (genomesim), simulation of population-level variants (popsim), simulation of individual sample genomes using population variants (samplesim), simulation of samples as members of a pedigree obeying inheritance rules (childsim), simulation of de novo variants (denovosim), generation of a genome given a VCF of sample variants (samplereplay), and read simulation according to a range of sequencer parameters (readsim/cgsim).

* Initial support for accepting CRAM files as input to variant calling commands and most other commands that accept alignments as input. For some commands this may now require specifying a reference SDF in order to decode the CRAM files.

* Improvements to the prebuilt AVR models that perform variant scoring. These models have been rebuilt using training data incorporating the latest truth sets produced by the GIAB initiative as well as improvements to the underlying machine learning algorithms.

* User manual improvements. In particular, the baseline progressions section has been rearranged to better illustrate how to run end-to-end RTG calling pipelines that make best use of RTG features such as sex-aware and pedigree-aware variant calling.

If you haven't used RTG Core before (or maybe even if you have), we suggest you run the demo-family.sh script that runs through a short end-to-end demonstration of sex-aware and pedigree-aware family variant calling, including de novo variant detection and variant evaluation with vcfeval. (It also makes a nice demo of our comprehensive simulation tools.)

Commercial users of RTG Core may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the source on github at https://github.com/RealTimeGenomics/rtg-core.

Users of RTG Tools, which is made freely available for non-commercial or commercial use alike, can download the new version from our website at http://realtimegenomics.com/products/rtg-tools or build from the source code on github at https://github.com/RealTimeGenomics/rtg-tools.


Detailed changes are listed below by area. For more information on new features, see the RTG Operations Manual which is included within the distribution as HTML and PDF.

### Basic Formatting and Mapping

* fastqtrim: This new command allows trimming of FASTQ files with much
more flexibility and control than is available directly from
format. See the user manual for more information and examples.

* petrim: This new command allows trimming of read bases in paired-end
data where read-through has occurred, as determined by alignment
overlap. See the user manual for more information and examples.

* format: Support for reading interleaved paired-end FASTQ added. This
is useful for formatting directly from streamed output of the petrim
command, avoiding additional disk I/O.

* format/map: The quality encoding for FASTQ input files now defaults to
the sanger encoding used by the majority of modern FASTQ files, and so
the --quality-format flag typically only needs to be specified when
processing older FASTQ files employing an alternative encoding.

* many: When outputting FASTA/FASTQ, ensure consistent use of unix line
endings across the various commands.

* calibrate: When calibrating multiple BAM files, each is calibrated in
an independent thread, obeying the --threads flag.

* sammerge: New flag --subsample that permits a fraction of the
alignments through to the output. In addition, the new flag --seed
lets you control which seed is used for this filtering.

* coverage: Computes additional QC metrics fold-80 penalty and median
coverage.

* coverage: New flag --per-region, which changes when BED/BEDGRAPH
coverage records are emitted: only when the region changes, rather
than whenever the coverage level changes.

* sammerge: Will now create output files in CRAM format if the output
filename ends with ".cram". This requires the user to specify the
reference SDF via the new --template flag.

* index: Now allows creating indexes for CRAM files. These are the
`.bai` indexes currently supported by htsjdk, rather than `.crai`
indexes.
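
For readers unfamiliar with the two QC metrics named in the coverage item above, here is a minimal Python sketch. It assumes the commonly used definition of fold-80 penalty (mean depth divided by the depth that 80% of bases meet or exceed, i.e. the 20th percentile). The announcement does not state the exact formula used by the coverage command, and the function names below are hypothetical.

```python
from statistics import mean, median


def percentile_floor(sorted_values, fraction):
    """Return the value at the given fraction of a sorted list (lower rank)."""
    index = int(fraction * (len(sorted_values) - 1))
    return sorted_values[index]


def coverage_qc(depths):
    """Compute illustrative QC metrics from a list of per-base depths.

    Assumption: fold-80 penalty = mean depth / 20th-percentile depth.
    """
    if not depths:
        raise ValueError("no depth values supplied")
    ordered = sorted(depths)
    p20 = percentile_floor(ordered, 0.20)
    # A zero 20th percentile means the penalty is unbounded.
    fold80 = mean(ordered) / p20 if p20 > 0 else float("inf")
    return {"mean": mean(ordered), "median": median(ordered), "fold80": fold80}


example = coverage_qc([10, 20, 30, 30, 30, 30, 40, 40, 50, 20])
print(example)
```

A fold-80 value close to 1 indicates even coverage; larger values mean more extra sequencing would be needed to bring most bases up to the mean depth.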

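As background to the format/map item on quality encodings above, the sketch below shows how the same Phred scores appear under the Sanger encoding (ASCII offset 33) and the older Illumina 1.3+ encoding (ASCII offset 64). It is plain Python, not RTG code; the dictionary keys and the function name are illustrative labels only.

```python
# ASCII offsets for two common FASTQ quality encodings (standard values).
OFFSETS = {"sanger": 33, "illumina": 64}


def decode_qualities(quality_string, encoding="sanger"):
    """Convert a FASTQ quality string into a list of Phred scores."""
    offset = OFFSETS[encoding]
    scores = [ord(ch) - offset for ch in quality_string]
    # A negative score means the string cannot be in this encoding.
    if min(scores, default=0) < 0:
        raise ValueError("quality characters below the %s offset" % encoding)
    return scores


sanger_scores = decode_qualities("II5+!")
illumina_scores = decode_qualities("hhTJ@", encoding="illumina")
print(sanger_scores, illumina_scores)
```

Decoding a file with the wrong offset shifts every score by 31, which is why older files with an alternative encoding still need the encoding stated explicitly.
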
### Variant Calling

* snp: Includes INFO.DP annotations in output VCF, for consistency with
the existing multi-sample caller output.

* family/population/somatic: New VCF annotations (OCOC/OCOF/DCOC/DCOF)
that indicate the count/fraction of contrary evidence observed in the
original (parent) vs derived (child) samples.

* snp/family/population/somatic: These commands now support SAM/BAM
files that make use of the '=' character in the SEQ field (such as can
be created by BamUtil:convert).

* snp/family/population/somatic: These commands now support CRAM files
as input.

* family/population: Improved error reporting for semantically incorrect
user-supplied pedigree information.

* snp/family/population/somatic: Improvements to the accuracy of the
pre-built AVR models. These models have been rebuilt using training
data incorporating the latest truth sets produced by the GIAB
initiative as well as improvements to the underlying machine learning
algorithm.

* snp/family/population: The default AVR model is now illumina-wgs.avr
(previously the default was illumina-exome.avr). For exome calling,
the illumina-exome.avr model provides an advantage over
illumina-wgs.avr only when the primary interest is maximising the
scoring of variants called outside of exome target regions.

* many: For compatibility with non-human species, sex handling of PAR
regions has been extended to allow a PAR region to have a different
length in each member of an allosome pair.

* svprep: Add the ability to run on merged alignment files rather than
requiring alignment files to be separated into mated vs unmated vs
unmapped.

* svprep: New flag --no-augment permits the computation of read
group statistics files only, for use when collecting statistics from
third party alignment files.

* avrpredict: New flag --sample to allow AVR scoring of only the
specified sample names.

* avrpredict: New flag --vcf-score-field to allow storing the AVR score
into a format field with a different name, useful when comparing
multiple scoring models.

* avrbuild: Improvements to the quality of models built in the presence
of missing annotations.
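
The item above on '=' in the SEQ field refers to the SAM convention in which '=' stands for a base identical to the reference. The following Python sketch shows how such a read could be expanded back to explicit bases. It illustrates the convention only and is not RTG's implementation; the function name and the 0-based `ref_start` parameter are assumptions made for the example.

```python
import re

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def restore_seq(seq, cigar, ref, ref_start):
    """Replace '=' in a SAM SEQ string with the matching reference base.

    ref_start is the 0-based reference offset of the first aligned base.
    """
    out = []
    read_pos = 0
    ref_pos = ref_start
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        if op in "M=X":  # consumes both read and reference
            for i in range(n):
                base = seq[read_pos + i]
                out.append(ref[ref_pos + i] if base == "=" else base)
            read_pos += n
            ref_pos += n
        elif op in "IS":  # consumes read bases only
            out.append(seq[read_pos:read_pos + n])
            read_pos += n
        elif op in "DN":  # consumes reference bases only
            ref_pos += n
        # H and P consume neither read nor reference
    return "".join(out)


restored = restore_seq("==T==AC==", "5M2I2M", "GGACGTAGC", 2)
print(restored)
```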

### Variant Processing and Analysis

* vcfmerge: When combining records at the same position, vcfmerge will
now not combine records at a site where some records use a VCF padding
base (as required by the VCF specification to prevent REF or ALT being
zero-length) and some records do not. This is because a record which
utilizes a padding base is not making an assertion about the genotype
of the padding base itself, and merging these records loses this
semantic distinction. (The old behaviour can be obtained via
--Xnon-padding-aware.)

* vcfannotate: New flag --no-header to suppress output of the VCF header.

* vcfsubset: New flag --remove-ids to allow clearing the ID column.

* rocplot: New flag --zoom which allows the specification of an initial
zoom to display. See the user manual for a description of the
coordinate syntax.

* rocplot: (GUI) Add ability to remove a curve via per-curve pop-up menu
in the side-pane.

* rocplot: (GUI) Prevent loading the same ROC data file multiple times,
and improve error handling on invalid files.

* rocplot: (GUI) Improvements to the open file dialog. Now defaults to
displaying ROC data files only, permits opening multiple ROC data
files at once via multi-select, and other minor changes.

* rocplot: (GUI) The "Cmd" button now shows the command in a pop-up
dialog rather than sending it to the terminal, which eliminates the
need to search through multiple tmux windows to find where rocplot was
started from.

* many: Invalid VCF header contig length specifications are now reported
gracefully.

* many: Improved error reporting of general VCF header parsing errors,
which now includes the problematic line where possible.

* many: Improved error reporting of malformed GT fields.
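
To illustrate the vcfmerge item above on padding bases, here is a rough Python sketch of the idea: records at the same position are merged only if they agree on whether a padding base is used. The padding test is a simple heuristic chosen for this example (REF and every ALT share the first base, and at least one ALT changes length); it is an assumption, not the rule vcfmerge actually applies, and both function names are hypothetical.

```python
def uses_padding_base(ref, alts):
    """Heuristic: True when an indel record carries a shared leading base.

    Assumption: a padding base is present when REF and every ALT begin with
    the same base and at least one ALT differs in length from REF.
    """
    alleles = [ref] + list(alts)
    same_first_base = len({allele[0] for allele in alleles}) == 1
    length_change = any(len(alt) != len(ref) for alt in alts)
    return same_first_base and length_change


def can_merge(records):
    """Allow merging only when all records at a site agree on padding use."""
    flags = {uses_padding_base(ref, alts) for ref, alts in records}
    return len(flags) <= 1


snp = ("A", ["G"])          # makes an assertion about the base at this position
deletion = ("AT", ["A"])    # leading A is padding only
insertion = ("A", ["ACC"])  # leading A is padding only

print(can_merge([snp, deletion]))
print(can_merge([deletion, insertion]))
```

Here the SNP and the deletion would be kept as separate records, since only the SNP says anything about the genotype of the first base.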

### Metagenomics

* species: Fix the handling of mappings that contain non-unique
read-names (as could arise when mapping directly from FASTQ files as
separate mapping runs and passing the resulting alignments to
species).

* species: Accuracy improvements when using paired-end data as the
underlying data source.

### Other

* pedstats: Improved the GraphViz pedigree visualization layout for
normal pedigree structures. The old layout is available with the new
``--simple-dot`` flag.

* many: The following simulation commands are now included as part of
RTG Tools: genomesim, cgsim, readsim, popsim, samplesim, childsim,
denovosim, samplereplay.

* readsim: When using --taxonomy-distribution and --distribution, one of
--abundance or --dna-fraction must be supplied in order to indicate
the desired interpretation.

* index: The -f flag is now optional and by default index will attempt to
determine the file format by the extension.

* many: Most commands accept the advanced flag --Xforce that allows them
to continue in the case of pre-existing output files or
directories. Be aware that particularly in the case of output
directories the final directory contents may include files from
previous runs (or even other commands), so this option should not be
used in production scenarios.

* many: Fixed an exception that could occur when performing multiple
region based querying of SAM/BED/VCF records, where the regions were
densely packed near the ends of chromosomes.

* many: Almost all commands that take SAM/BAM as input now support CRAM
files as input. Some of these commands have a new flag used to supply
the reference SDF which is required when decoding CRAM.

* misc: The rtg bash command completion has been improved to be more
portable and no longer caches completion data on disk.

* many: Linux and Windows packages have updated the bundled JRE to the
latest from Oracle.
__________________
Len Trigg, Ph.D.
Real Time Genomics
www.realtimegenomics.com
06-01-2017, 07:59 PM   #2
Len Trigg

New stable releases are now available which include minor improvements and bug fixes.

The first of these is our full analysis suite, RTG Core 3.8.1. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

We have also produced updated builds of our utilities package, RTG Tools 3.8.1, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.


RTG Core 3.8.1 (2017-05-29)
---------------------------

This release primarily includes bugfixes and minor improvements:

* rocplot: (GUI) The right hand panel now includes a visual indication
of the color for each curve.

* rocplot: (GUI) The color for a curve can now be set via color picker
available from the per-curve context menu.

* rocplot: (GUI) Reordering the curves is now achieved by drag and drop
rather than the (now removed) reorder buttons.

* misc: The RTG Tools release includes a scripts/demo-tools.sh script
that gives a quick end-to-end demonstration of simulation and VCF
manipulation commands. This is similar in nature to the
scripts/demo-family.sh script that is included in RTG Core.

* vcfeval: Fix an exception caused by the skipping of heterozygous
structural variants being dependent on the GT field allele
ordering. These variants are now correctly skipped. In previous
releases the cases that slipped through would enter matching with a
stub allele representing the SV allele.

* vcfeval: When running a sample-free comparison via the option
`--sample ALT`, ignore records/alleles corresponding to structural
variants. In 3.8 these could produce an exception, and in previous
releases any SV alleles present were included as a generic token
during matching.

* vcfeval: Improve the handling of non-user exceptions encountered
during VCF loading. Previously these would produce an often
inscrutable message.

* version: Update copyright year and include an alternative citation
more appropriate for those using RTG Tools.

* popsim: Now includes the random number seed in the VCF header for
consistency with other simulation commands.
06-19-2017, 07:35 PM   #3
Len Trigg

New stable releases are now available which include minor improvements and bug fixes.

The first of these is our full analysis suite, RTG Core 3.8.2. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

We have also produced updated builds of our utilities package, RTG Tools 3.8.2, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.

RTG Core 3.8.2 (2017-06-20)
---------------------------

This release primarily includes bugfixes and minor improvements:

* vcfeval: Records where the REF/ALT contain bases not permitted by the
VCF specification are now skipped (and reported in the log) rather
than terminating execution.

* vcfeval: (`combine` and `ga4gh` output modes only) These modes were
inserting a redundant VCF header entry containing the command line,
which has been removed.

* vcfeval: GA4GH output mode now supports loose positional matching of
variants (within +/-30bp by default, and adjustable via
--Xloose-match-distance).

* many: Prevent number formatting issues in non-English locales. The
locale is now forced to US.

* many: Some commands were not appending gzip termination blocks to VCF
outputs, which could result in subsequent warning messages being
produced by some third party tools.

* many: Improve the consistency of exception handling in cases where the
exception is thrown in a worker thread.

* many: Attempting to supply file lists via shell process redirection
would fail in non-obvious ways. File lists from process redirection
are not currently supported and are now checked for up-front.

* minor: When setting up rtg bash tab completion, issue a warning if an
incompatible completion function has already been installed. (This can
happen on some Linux distros if you have the system `bash-completion`
package installed and attempt to tab-complete rtg before installing
rtg bash completion.)

* minor: Fix a typo in the example configuration settings in rtg.cfg
(specifically, RTG_JAVA_OPTS was incorrectly listed as
RTG_JAVA_OPTIONS).
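
The item above about gzip termination blocks presumably refers to the empty end-of-file block that block-compressed (BGZF) files are expected to end with; that reading is an assumption, as the announcement does not name the format. Under that assumption, the Python sketch below checks whether a compressed output ends with the 28-byte marker defined in the SAM specification. The function name and file names are hypothetical.

```python
import os
import tempfile

# The 28-byte empty BGZF block that marks end-of-file (SAM specification).
BGZF_EOF = bytes.fromhex(
    "1f8b0804" "00000000" "00ff0600" "42430200"
    "1b000300" "00000000" "00000000"
)


def has_bgzf_eof(path):
    """Return True if the file ends with the BGZF end-of-file marker."""
    size = os.path.getsize(path)
    if size < len(BGZF_EOF):
        return False
    with open(path, "rb") as handle:
        handle.seek(size - len(BGZF_EOF))
        return handle.read() == BGZF_EOF


# Small demonstration using throwaway files.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.vcf.gz")
    with open(good, "wb") as handle:
        handle.write(BGZF_EOF)
    truncated = os.path.join(tmp, "truncated.vcf.gz")
    with open(truncated, "wb") as handle:
        handle.write(BGZF_EOF[:-1])
    good_ok = has_bgzf_eof(good)
    truncated_ok = has_bgzf_eof(truncated)

print(good_ok, truncated_ok)
```

A check like this can help tell whether a warning from a third-party tool stems from a missing end-of-file block or from a genuinely truncated file.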