  • RTG 3.5: Somatic calling / metagenomics / variant comparison / BSD Licensing

    Real Time Genomics are pleased to announce the availability of new releases of our full analysis suite, RTG Core (commercial / free for non-commercial use), and our utility package, RTG Tools (free for any use). This release includes new features and performance improvements. Some of the highlights of this release:

    * Several improvements to somatic variant calling, including the ability to specify site-specific somatic priors, control of output for gain-of-reference and loss-of-heterozygosity events, and changes to the VCF to align with TCGA VCF specification.

    * Improvements to metagenomic species reference database management. Several new options allow better customization of a species reference, and extraction of genomic information for individual species contained within the reference database.

    * Improvements to our sophisticated variant comparison tool vcfeval, primarily the ability to perform evaluation restricted to individual regions or sets of regions (for example GIAB high-confidence intervals or exome target regions), and the inclusion of more accuracy metrics, both as a new summary file and included in the weighted ROC data file.

    * We are also pleased to make the source code to RTG Tools available under the Simplified BSD License, on github. (Source code for RTG Core remains available for non-commercial use).

    * Many other minor improvements (full release notes for this version are detailed below.)

    If you haven't used RTG Core before (or maybe even if you have), it includes a nice new demo script that runs through an end-to-end demonstration of sex-aware and pedigree-aware family variant calling, including de novo variant detection and variant evaluation with vcfeval. (It also makes a nice demo of our comprehensive simulation tools.)

    Commercial users of RTG Core may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the source on github (note the updated build instructions).

    Users of RTG Tools, which is made freely available for non-commercial or commercial use alike, can download the new version from our website at http://realtimegenomics.com/products/rtg-tools or build from the source code on github.


    Detailed changes are listed below by area. Please read these through fully, as some command-line flags have changed, so updates to your pipeline scripts may be required. For more information on new features, see the RTG Operations Manual.

    RTG Core 3.5 (2015-07-16)
    -------------------------

    ### Basic Formatting and Mapping

    * format/map: When formatting or mapping reads supplied as SAM/BAM
    input data, any alignments marked as supplementary are ignored.
    Note that if the input data has already been aligned, it is
    recommended that the BAM file be shuffled to avoid biases during
    mapping arising from the data being presented in chromosomal
    order. See the user manual for more information.

    * sdf2fasta/sdf2fastq: These commands have new flags --names and
    --id-file that operate the same as their counterparts in sdfsubset.

    * sdfsubset: This command has new flags --start-id and --end-id that
    allow specifying a range of sequences by ID.

    * sdf2sam: This new command allows the extraction of reads from SDF
    in the form of unaligned SAM/BAM. This has a benefit over
    extraction as FASTQ in that some metadata (such as read group
    information) is preserved, paired end data is stored in a single
    file, and quality encoding is inherent in the format.

    * chrstats: Reduce false positives in sex inconsistency detection that
    were due to applying the (tighter) sex-chromosome threshold also to
    autosomes. This threshold is now applied to sex-chromosomes only.

    ### Variant Calling and Analysis

    * somatic: Now allows the user to specify a BED file containing
    per-site somatic priors, which can be used (for example) to reduce
    the somatic prior at sites typical of false positives (e.g. presence
    in dbSNP) or increase the somatic prior at sites known to harbour
    somatic variants (e.g. presence in COSMIC). For more information
    see the user manual.

    * somatic: At the end of variant calling, the somatic caller produces
    an estimate of somatic sample contamination. Previously this
    estimate was only available in the log file, but in this release
    this computation has been greatly improved, and the contamination
    estimate is now included in the standard summary statistics.

    * somatic: "Gain of reference" calls are now disabled by default.
    These can be included by specifying the new flag
    --include-gain-of-reference.

    * somatic: Calls that are indicative of loss of heterozygosity (LOH)
    are not produced by default (since loss of heterozygosity
    analysis is most useful in conjunction with additional data such as
    germline variant calls or CNV data). These calls can be produced if
    desired by specifying --loh with a prior greater than 0.

    * somatic: Previously, when LOH calls were enabled, they were output
    in haploid GT representation. They now use the ploidy appropriate
    for the chromosome (according to the reference), for compatibility
    with downstream processing tools.

    * somatic: VCF output changes to bring the somatic representation in
    line with TCGA 1.2 VCF specification. In particular:

        * Calls include a new FORMAT field SS that indicates the somatic
        status for the derived (tumor) sample. This field replaces the
        previous SOMATIC INFO field.

        * Calls include a new FORMAT field SSC which contains the somatic
        score for the derived (tumor) sample. This field replaces the
        previous RSS INFO field.

    * lineage: Supports the input of pedigree in the form of VCF header
    annotations as output by the somatic caller, in the form:

    ##PEDIGREE=<Derived=TUMORSAMPLENAME,Original=NORMALSAMPLENAME>

    * population: Fixed a rare case where, after complex call
    simplification, the only sample genotype containing a non-ref allele
    belonged to a member of the pedigree not being output. In this case
    the QUAL score was -10log10 prob(no variant) rather than -10log10
    prob(variant) as required by the VCF specification.

    * vcfmerge: Added a new flag --force-merge-all to always attempt to
    merge headers containing conflicting descriptions.

    * vcfmerge: Previously vcfmerge would not process records containing
    symbolic alleles. These are now accepted.

    * vcfmerge: More graceful handling when encountering records with a GT
    that refers to a non-existent ALT.

    * vcfeval: Now outputs a summary containing various accuracy
    metrics. A first set of statistics is computed from the full set of
    variants evaluated (these will typically have highest sensitivity
    but potentially poor precision if the input call set has not been
    filtered). A second set of statistics is computed based on the ROC
    curve information, selected at a threshold which maximises the
    F-measure statistic (this provides some balance between sensitivity
    and precision, so may be a fairer point to gather statistics for
    cross-caller comparison).

    * vcfeval: The weighted_roc.tsv file now includes columns containing
    additional accuracy metrics.

    * vcfeval: Improved the detection that alerts the user when chromosome
    names are incompatible between reference, baseline, calls, and BED
    regions (if used). Also improved other error and warning messages.

    * vcfeval: Added a new flag --bed-regions to supply a BED file
    containing a list of regions that the VCF records must overlap with
    in order to be included in analysis. For example, a common use case
    is to restrict to only evaluating calls contained within the GIAB
    high-confidence regions, or only within regions corresponding to
    exome target regions.

    * vcfeval: Added a new flag --region to specify a single region to
    evaluate variants within. This is useful when evaluating calls on a
    single chromosome or within a small region of interest.

    * vcfeval: Fixed a case where a ref-only call (i.e. containing no
    alts) could get output instead of an indel with a padding base at
    the same position.

    * vcfeval: Disabled the output of slope analysis data files by default,
    as these are fairly special purpose (primary ROC files are still
    output). They can be re-enabled if desired by using the new
    expert/experimental flag --Xslope-files.

    * vcffilter: The --remove-all-same-as-ref flag now does not consider a
    sample with missing GT as being variant, since the intent of this
    flag is to retain only records where at least one sample is called
    as variant.

    * vcfannotate: Added two new flags --info-id and --info-description to
    allow specifying the name of the INFO ID and Description fields
    added to the header during annotation. These flags only take effect
    if the VCF header does not already contain an INFO declaration with
    that ID.
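The F-measure threshold selection described for the vcfeval summary above can be sketched as follows. This is a minimal illustration, not the vcfeval implementation: the `RocPoint` type, the field names, and the example numbers are all hypothetical, and it assumes each ROC row holds cumulative true-positive and false-positive counts for calls scoring at or above a threshold. It uses the standard definitions precision = TP / (TP + FP) and sensitivity = TP / (number of baseline variants), with the F-measure as their harmonic mean.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class RocPoint:
    """Hypothetical ROC row: cumulative counts for calls scoring >= threshold."""
    threshold: float
    true_positives: float
    false_positives: float


def accuracy_metrics(point: RocPoint, baseline_total: float) -> Dict[str, float]:
    """Compute precision, sensitivity and F-measure at one threshold."""
    called = point.true_positives + point.false_positives
    precision = point.true_positives / called if called > 0 else 0.0
    sensitivity = point.true_positives / baseline_total if baseline_total > 0 else 0.0
    denominator = precision + sensitivity
    # F-measure is the harmonic mean of precision and sensitivity.
    f_measure = 2 * precision * sensitivity / denominator if denominator > 0 else 0.0
    return {
        "threshold": point.threshold,
        "precision": precision,
        "sensitivity": sensitivity,
        "f_measure": f_measure,
    }


def best_f_measure(points: Iterable[RocPoint], baseline_total: float) -> Dict[str, float]:
    """Return the metrics at the threshold with the highest F-measure.

    Raises ValueError if `points` is empty.
    """
    return max(
        (accuracy_metrics(p, baseline_total) for p in points),
        key=lambda m: m["f_measure"],
    )


# Made-up example: 100 baseline variants, three candidate score thresholds.
roc = [
    RocPoint(threshold=30.0, true_positives=60, false_positives=2),
    RocPoint(threshold=20.0, true_positives=85, false_positives=10),
    RocPoint(threshold=10.0, true_positives=95, false_positives=60),
]
best = best_f_measure(roc, baseline_total=100)
print(best["threshold"], round(best["f_measure"], 3))
```

In this made-up data the strictest threshold has high precision but misses 40 baseline variants, and the loosest one recovers most variants at the cost of many false positives. The middle threshold balances the two, which is why a point chosen this way may be a fairer basis for cross-caller comparison than the unfiltered totals.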

    ### Metagenomics

    * taxfilter: Added a new flag --subtree which allows selecting entire
    taxonomic subtrees for inclusion in the output taxonomy.

    * taxfilter: Added a new flag --remove-sequences to allow the removal
    of sequence data associated with specific taxon ids.

    * sdf2fasta: Added a new flag --taxons that interprets any supplied
    ID as a taxon ID and outputs all sequences assigned to that taxon
    ID. This provides an easy way to extract genomic sequence for any
    species from the reference SDF.
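The idea of selecting whole taxonomic subtrees, as the taxfilter --subtree flag above does, can be pictured with a short sketch. This is an illustration only and not RTG's implementation: it assumes the taxonomy is available as a hypothetical in-memory mapping from each taxon ID to its parent taxon ID, and the IDs below are made up.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set


def select_subtrees(parent_of: Dict[int, int], roots: Iterable[int]) -> Set[int]:
    """Return the given root taxon IDs plus all of their descendants.

    `parent_of` maps taxon ID -> parent taxon ID (a hypothetical structure).
    Root IDs are not validated against the taxonomy.
    """
    # Invert the child -> parent mapping into parent -> children.
    children: Dict[int, List[int]] = defaultdict(list)
    for taxon, parent in parent_of.items():
        if taxon != parent:  # skip a top-level node listed as its own parent
            children[parent].append(taxon)

    # Iterative depth-first walk from each requested root.
    selected: Set[int] = set()
    stack = list(roots)
    while stack:
        taxon = stack.pop()
        if taxon in selected:
            continue
        selected.add(taxon)
        stack.extend(children.get(taxon, ()))
    return selected


# Made-up taxonomy: 1 is the top-level node, 10 and 20 are its children.
taxonomy = {1: 1, 10: 1, 20: 1, 11: 10, 12: 10, 21: 20}
subtree = select_subtrees(taxonomy, [10])
print(sorted(subtree))
```

Selecting taxon 10 here keeps 10, 11 and 12 and leaves out the sibling branch under 20. The same set of IDs is what one would then use to decide which sequences to keep or extract.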

    ### Other

    * genomesim: Added a new flag --prefix to specify a prefix for
    generated sequence names.

    * many: Update the base library used for SAM/BAM input and output to
    htsjdk 1.128.

    * many: VCF reading now detects cases where a header specifies a field
    declaration using an ID that is already in use, preventing duplicate
    header declarations.

    * extract: Fix a regression where extracting from VCF without any
    region specified would include the VCF header.

    Len Trigg, Ph.D.
    Real Time Genomics
    www.realtimegenomics.com

  • #2
    New stable releases are now available which include minor improvements and bug fixes.

    The first of these is our full analysis suite, RTG Core 3.5.1. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

    We have also produced updated builds of our utilities package, RTG Tools 3.5.1, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.


    RTG Core 3.5.1 (2015-09-07)
    ---------------------------

    This release primarily includes bugfixes and minor improvements:

    * coverage: Fix an exception that could occur if running with a
    reference SDF supplied that had chromosomes in a different order
    compared to the BAM sequence dictionary (typically this could occur
    when running coverage on third-party BAMs).

    * extract: When extracting multiple regions these regions are now
    sorted.

    * vcfeval: Fixed a case where the summary statistics for FP/FN were
    not incremented correctly when an entire chromosome contained only
    baseline or only called variants.

    * vcfeval: Fixed a case where path-finding could get confused and drop
    variants.

    * vcfeval: Speed improvement in post-processing.

    * many: Improved error reporting for commands that involve processing
    multiple BAM files, so that the name of the particular file causing
    the problem is included.

    * wrapper: Fixed the Java version number check so that it works
    correctly with OpenJDK 1.8.

    Len Trigg, Ph.D.
    Real Time Genomics
    www.realtimegenomics.com

    • #3
      New stable releases are now available which include minor improvements and bug fixes.

      The first of these is our full analysis suite, RTG Core 3.5.2. The changes in this version are listed below. Commercial users may download the update from our website at http://realtimegenomics.com/products/rtg-core-downloads. Non-commercial users can download the update from our website at http://realtimegenomics.com/products...non-commercial or build from the updated source code on github at https://github.com/RealTimeGenomics/rtg-core.

      We have also produced updated builds of our utilities package, RTG Tools 3.5.2, which is made freely available for non-commercial or commercial use alike. More information and download links are available from our website at http://realtimegenomics.com/products/rtg-tools, or you can build from the updated source on github at https://github.com/RealTimeGenomics/rtg-tools.


      RTG Core 3.5.2 (2015-10-15)
      ---------------------------

      This release primarily includes bugfixes and minor improvements:

      * many: When piping results from one command to another, and a later
      command closes the pipe (e.g. head), this scenario no longer
      produces a "Broken pipe" error message. This is consistent with the
      behaviour of commonly used command-line tools.

      * rocplot: Updated to handle ROC data files that contain lines with
      a non-numeric score field. (In particular, future versions of vcfeval
      will include additional data-points corresponding to variants with
      no score provided.)

      * rocplot: (GUI) Improvement to usability for curve renaming. Now a
      single-click in the curve title area enters edit mode, with
      RETURN/TAB to accept, ESC to cancel.

      * rocplot: (GUI) Add a button that prints an equivalent command line
      to the terminal, for easy restarting with similar state,
      particularly if curve files have been added interactively.

      * cgmap: Fix for sample sex being ignored when supplied via a pedigree
      file rather than using explicit sex flag.

      * misc: Removed vestigial (and in RTG Tools' case, incorrect)
      "Licensed to:" line from the version command output.

      * misc: Add BSD license text to the RTG Tools distributable zip.

      Len Trigg, Ph.D.
      Real Time Genomics
      www.realtimegenomics.com
