Interface

Here we define the abstract supertypes that all outlier detectors share, the data types used throughout OutlierDetectionJL, and the fit and transform methods that have to be implemented for each detector.

Detectors

Detector

# OutlierDetectionInterface.Detector - Type.

Detector

The union type of all detectors, including supervised, semi-supervised and unsupervised detectors. Note: A semi-supervised detector can be seen as a supervised detector with missing labels denoting unlabeled data.

SupervisedDetector

# OutlierDetectionInterface.SupervisedDetector - Type.

SupervisedDetector

This abstract type forms the basis for all implemented supervised outlier detection algorithms.

UnsupervisedDetector

# OutlierDetectionInterface.UnsupervisedDetector - Type.

UnsupervisedDetector

This abstract type forms the basis for all implemented unsupervised outlier detection algorithms.

Data types

DetectorModel

# OutlierDetectionInterface.DetectorModel - Type.

DetectorModel

A DetectorModel represents the learned behaviour of a specific Detector. This might include parameters in parametric models or other representations of the learned data in nonparametric models. In essence, it includes everything required to transform an instance to an outlier score.

Scores

# OutlierDetectionInterface.Scores - Type.

Scores::AbstractVector{<:Real}

Scores are continuous values whose range depends on the specific detector that yields them. Note: For all detectors, higher scores are associated with higher outlierness.
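Because higher scores mean higher outlierness, ranking observations only takes a descending sort. The following is a minimal sketch in plain Julia; the score values are made up for illustration and do not come from a real detector.

```julia
# Hypothetical scores for five observations (made-up values).
scores = [0.2, 1.7, 0.4, 3.1, 0.3]

# Indices of the observations, ordered from most to least outlying.
ranking = sortperm(scores; rev=true)

# Index of the single most outlying observation.
most_outlying = argmax(scores)

println(ranking)
println(most_outlying)
```

Since the range of the scores depends on the detector, only the ordering is comparable here, not the absolute values.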

Data

# OutlierDetectionInterface.Data - Type.

Data::AbstractArray{<:Real}

The raw input data for every detector is defined as AbstractArray{<:Real} and should be an n-dimensional array with one observation per entry of the last dimension.
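The layout convention can be illustrated with plain Julia arrays. This is only a sketch of the convention described above; the array shapes are arbitrary examples.

```julia
# 100 observations with 10 features each: observations live on the last axis.
X = rand(10, 100)
n_observations = size(X, ndims(X))

# Iterate over observations by slicing along the last dimension.
first_observation = first(eachslice(X; dims=ndims(X)))

# The same convention applies to higher-dimensional data,
# for example 50 arrays of size 28x28 (an illustrative shape).
images = rand(28, 28, 50)
n_images = size(images, ndims(images))
```

Tabular data that stores one observation per row would therefore have to be transposed before it matches this layout.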

Labels

# OutlierDetectionInterface.Labels - Type.

Labels::AbstractVector{<:Union{Missing,CategoricalValue{<:T,<:Integer}}} where {T}

Labels are used for supervision and evaluation and are defined as (categorical) vectors of strings. The convention for labels is that "outlier" indicates outliers, "normal" indicates inliers and missing indicates unlabeled data.

Fit

# OutlierDetectionInterface.Fit - Type.

Fit::Tuple{T, Scores} where {T}

A fit results in a learned model of type T <: Any and the observed training scores of type Scores.

FitResult

# OutlierDetectionInterface.FitResult - Type.

FitResult

A structured fit result, used as the fit return type for MLJ, that bundles a model and the observed training scores of type Scores.

Functions

fit

# OutlierDetectionInterface.fit - Function.

fit(detector,
    X,
    y;
    verbosity)

Fit an unsupervised, supervised or semi-supervised outlier detector. That is, learn a DetectorModel from input data X and, in the supervised and semi-supervised setting, labels y. In a supervised setting, the label "outlier" represents outliers and "normal" inliers. In a semi-supervised setting, missing additionally represents unlabeled data. Note: Unsupervised detectors can be fitted without specifying y.

Parameters

detector::Detector

Any UnsupervisedDetector or SupervisedDetector implementation.

X::AbstractArray{<:Real}

An array of real values with one observation per last axis.

Returns

fit::Fit

The learned model of the given detector, which contains all the necessary information for later prediction, and the achieved outlier scores of the given input data X.

Examples

using OutlierDetection: KNNDetector, fit, transform
detector = KNNDetector()
X = rand(10, 100)  # 100 observations with 10 features each
model, result = fit(detector, X; verbosity=0)  # result holds the training scores
test_scores = transform(detector, model, X)

transform

# OutlierDetectionInterface.transform - Function.

transform(detector,
          model,
          X)

Transform input data X to outlier scores using an UnsupervisedDetector or SupervisedDetector and a corresponding DetectorModel.

Parameters

detector::Detector

Any UnsupervisedDetector or SupervisedDetector implementation.

model::DetectorModel

The model learned by calling fit with a Detector.

X::AbstractArray{<:Real}

An array of real values with one observation per last axis.

Returns

result::Scores

The achieved outlier scores of the given input data X.

Examples

using OutlierDetection: KNNDetector, fit, transform
detector = KNNDetector()
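X = rand(10, 100)  # 100 observations with 10 features each
model, result = fit(detector, X; verbosity=0)  # result holds the training scores
test_scores = transform(detector, model, X)  # one score per observation in X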
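To make the fit/transform contract more concrete, here is a minimal sketch of a toy detector in plain Julia that mirrors the shape of the two functions: fitting returns a (model, training scores) tuple and transforming returns scores for new data. All names (ToyDetector, ToyModel, toy_fit, toy_transform, distances) are hypothetical and not part of OutlierDetectionInterface. A real detector would subtype UnsupervisedDetector or SupervisedDetector and extend fit and transform instead.

```julia
# Hypothetical stand-ins; none of these names belong to OutlierDetectionInterface.
struct ToyDetector end            # a detector without hyperparameters

struct ToyModel                   # what fitting learns: the mean of the training data
    center::Vector{Float64}
end

# Score each observation (column) by its Euclidean distance to `center`.
distances(center, X) = [sqrt(sum(abs2, x .- center)) for x in eachcol(X)]

# Mirrors the shape of `fit`: returns a (model, training scores) tuple.
function toy_fit(::ToyDetector, X::AbstractMatrix{<:Real})
    center = vec(sum(X; dims=2)) ./ size(X, 2)
    return ToyModel(center), distances(center, X)
end

# Mirrors the shape of `transform`: returns scores for new data.
toy_transform(::ToyDetector, model::ToyModel, X::AbstractMatrix{<:Real}) =
    distances(model.center, X)

detector = ToyDetector()
X_train = [0.0 1.0 0.0 1.0;
           0.0 0.0 1.0 1.0]       # four 2-dimensional observations
model, train_scores = toy_fit(detector, X_train)
test_scores = toy_transform(detector, model, [0.5 10.0; 0.5 10.0])
```

The second test observation lies far from the training data and therefore receives the higher score, in line with the convention that higher scores indicate higher outlierness.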

Macros

@detector

# OutlierDetectionInterface.@detector - Macro.

@detector(expr)

An alternative to declaring the detector struct, clean! method and keyword constructor by hand, directly referring to MLJModelInterface.@mlj_model.

Parameters

expr::Expr

An expression of a mutable struct defining a detector's hyperparameters.
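To get a feel for the boilerplate the macro replaces, here is a hand-written sketch in plain Julia without any package. It assumes the usual MLJ convention that clean! resets invalid hyperparameters and returns a warning message. ToyKNNDetector and its field k are hypothetical, and a real detector would additionally subtype UnsupervisedDetector or SupervisedDetector.

```julia
# Hypothetical detector written by hand; no macro and no package involved.
# Base.@kwdef generates the keyword constructor with default values.
Base.@kwdef mutable struct ToyKNNDetector
    k::Int = 5
end

# Hand-written counterpart of a clean! method (assumed convention):
# reset invalid hyperparameters and return a warning message,
# which is empty if nothing had to be changed.
function clean!(detector::ToyKNNDetector)
    warning = ""
    if detector.k < 1
        warning *= "k must be positive; resetting to 5. "
        detector.k = 5
    end
    return warning
end

detector = ToyKNNDetector(k=-1)
message = clean!(detector)
```

After the call, detector.k is back at its default and message describes the correction.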

@default_frontend

# OutlierDetectionInterface.@default_frontend - Macro.

@default_frontend(detector)

Define a data front end for a given detector, which transforms the input data to OutlierDetectionInterface.Data.

Parameters

detector::T where T<:Detector

The detector datatype for which the data frontend should be defined.

@default_metadata

# OutlierDetectionInterface.@default_metadata - Macro.

@default_metadata(detector,
                  uuid)

Define the default metadata for a given detector, which is useful when a detector is exported into MLJModels, such that it can be directly loaded with MLJ. By default, we assume that a detector is exported at a package's top level and we set the load_path accordingly.

Additionally, we assume the following metadata defaults:

  • package_name is equal to the @__MODULE__ in which @default_metadata is used
  • The detector is implemented in Julia, is_pure_julia=true
  • The detector is not a wrapper, is_wrapper=false
  • The package lives in the OutlierDetectionJL GitHub organization

Parameters

detector::T where T<:Detector

The detector datatype for which the default metadata should be defined.

uuid::String

The UUID of the detector's package.