Interface
Here we define the abstract supertypes that all outlier detectors share, the data types used throughout OutlierDetectionJL, and the fit and transform methods that have to be implemented for each detector.
Detectors
Detector
OutlierDetectionInterface.Detector
— Type.
Detector
The union type of all detectors, including supervised, semi-supervised and unsupervised detectors. Note: A semi-supervised detector can be seen as a supervised detector with missing labels denoting unlabeled data.
SupervisedDetector
OutlierDetectionInterface.SupervisedDetector
— Type.
SupervisedDetector
This abstract type forms the basis for all implemented supervised outlier detection algorithms.
UnsupervisedDetector
OutlierDetectionInterface.UnsupervisedDetector
— Type.
UnsupervisedDetector
This abstract type forms the basis for all implemented unsupervised outlier detection algorithms.
Data types
DetectorModel
OutlierDetectionInterface.DetectorModel
— Type.
DetectorModel
A DetectorModel represents the learned behaviour for a specific Detector. This might include parameters in parametric models or other representations of the learned data in nonparametric models. In essence, it includes everything required to transform an instance to an outlier score.
Scores
OutlierDetectionInterface.Scores
— Type.
Scores::AbstractVector{<:Real}
Scores are continuous values, where the range depends on the specific detector yielding the scores. Note: All detectors return increasing scores and higher scores are associated with higher outlierness.
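Because higher scores mean higher outlierness, ranking observations comes down to sorting scores in descending order. A minimal sketch in plain Julia, assuming a hand-written score vector rather than the output of a real detector:

```julia
# Hypothetical scores for four observations (not produced by any detector).
scores = [0.2, 3.1, 0.4, 1.7]

# Indices of the observations, ordered from most to least outlying.
ranking = sortperm(scores; rev=true)

# The two observations with the highest outlier scores.
top2 = ranking[1:2]
```

Only the ordering is comparable across detectors; the absolute range of the scores depends on the detector that produced them.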
Data
OutlierDetectionInterface.Data
— Type.
Data::AbstractArray{<:Real}
The raw input data for every detector is defined as AbstractArray{<:Real} and should be an n-dimensional array with one observation per slice along the last dimension.
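The layout convention can be checked with plain Julia arrays. The following sketch uses made-up array sizes and only Base functions; the variable names are chosen for illustration:

```julia
# A matrix with 3 features and 5 observations: one observation per column.
X = rand(3, 5)
n_obs = size(X, ndims(X))              # number of observations
first_obs = selectdim(X, ndims(X), 1)  # view of the first observation

# The same convention for higher-dimensional data, e.g. ten 28×28 arrays.
imgs = rand(28, 28, 10)
n_imgs = size(imgs, ndims(imgs))

# Data stored with one observation per row has to be permuted first.
rows = rand(5, 3)          # 5 observations, 3 features
X_rows = permutedims(rows) # 3×5, one observation per column
```

Tabular data is often stored with one observation per row, so the final permutation step is the one most likely to be needed in practice.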
Labels
OutlierDetectionInterface.Labels
— Type.
Labels::AbstractVector{<:Union{Missing,CategoricalValue{<:T,<:Integer}}} where {T}
Labels are used for supervision and evaluation and are defined as a (categorical) vector of strings. The convention for labels is that "outlier" indicates outliers, "normal" indicates inliers and missing indicates unlabeled data.
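A label vector following this convention could be built with CategoricalArrays.jl. This is a sketch that assumes the categorical and levels functions from that package and uses made-up labels:

```julia
using CategoricalArrays: categorical, levels

# Two inliers, one outlier and one unlabeled observation.
y = categorical(["normal", "outlier", missing, "normal"])

# Unlabeled observations are represented by missing.
n_unlabeled = count(ismissing, y)

# Count labeled outliers, skipping the unlabeled entries.
n_outliers = count(==("outlier"), skipmissing(y))

# The label levels present in the vector.
label_levels = levels(y)
```

A fully supervised setting would use the same construction without any missing entries.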
Fit
OutlierDetectionInterface.Fit
— Type.
Fit::Tuple{DetectorModel, Scores}
A fit results in a learned model of type DetectorModel and the observed training scores of type Scores.
Functions
fit
OutlierDetectionInterface.fit
— Function.
fit(detector, X, y; verbosity)
Fit an unsupervised, supervised or semi-supervised outlier detector. That is, learn a DetectorModel from input data X and, in the supervised and semi-supervised setting, labels y. In a supervised setting, the label "outlier" represents outliers and "normal" inliers. In a semi-supervised setting, missing additionally represents unlabeled data. Note: Unsupervised detectors can be fitted without specifying y.
Parameters
detector::Detector
Any UnsupervisedDetector or SupervisedDetector implementation.
X::AbstractArray{<:Real}
An array of real values with one observation per last axis.
Returns
fit::Fit
The learned model of the given detector, which contains all the necessary information for later prediction, and the achieved outlier scores of the given input data X.
Examples
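using OutlierDetection: KNNDetector, fit, transform
detector = KNNDetector()
X = rand(10, 100)  # 10 features, 100 observations (one per column)
# fit returns a Fit tuple: the learned model and the training scores
model, train_scores = fit(detector, X; verbosity=0)
test_scores = transform(detector, model, X)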
transform
OutlierDetectionInterface.transform
— Function.
transform(detector, model, X)
Transform input data X to outlier scores using an UnsupervisedDetector or SupervisedDetector and a corresponding DetectorModel.
Parameters
detector::Detector
Any UnsupervisedDetector or SupervisedDetector implementation.
model::DetectorModel
The model learned by calling fit with a Detector.
X::AbstractArray{<:Real}
An array of real values with one observation per last axis.
Returns
result::Scores
The achieved outlier scores of the given input data X.
Examples
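using OutlierDetection: KNNDetector, fit, transform
detector = KNNDetector()
X = rand(10, 100)  # 10 features, 100 observations (one per column)
# fit returns a Fit tuple: the learned model and the training scores
model, train_scores = fit(detector, X; verbosity=0)
test_scores = transform(detector, model, X)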
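To make the contract concrete: fit returns a tuple of a model and the training scores, and transform maps new data to scores using that model. The following sketch mirrors this contract with local stand-in types and functions so that it runs without any package. CentroidDetector, CentroidModel, toy_fit and toy_transform are hypothetical names and not part of OutlierDetectionInterface; a real detector would instead subtype UnsupervisedDetector or SupervisedDetector and add methods to fit and transform.

```julia
using Statistics: mean

# Hypothetical stand-ins for the interface's abstract types.
abstract type ToyDetector end
abstract type ToyModel end

# A detector without hyperparameters and the model it learns.
struct CentroidDetector <: ToyDetector end
struct CentroidModel <: ToyModel
    center::Vector{Float64}
end

# Euclidean distance of every observation (column) to the center.
distances(center, X) = [sqrt(sum(abs2, x .- center)) for x in eachcol(X)]

# Mirrors fit: learn a model and return it with the training scores.
function toy_fit(::CentroidDetector, X::AbstractMatrix{<:Real}; verbosity=0)
    center = vec(mean(X; dims=2))
    return CentroidModel(center), distances(center, X)
end

# Mirrors transform: score new data with a previously learned model.
function toy_transform(::CentroidDetector, model::CentroidModel,
                       X::AbstractMatrix{<:Real})
    return distances(model.center, X)
end

# Usage: three 2-dimensional observations, one per column.
X_train = [0.0 2.0 10.0; 0.0 0.0 0.0]
toy_model, toy_train_scores = toy_fit(CentroidDetector(), X_train)
toy_test_scores = toy_transform(CentroidDetector(), toy_model, reshape([4.0, 0.0], 2, 1))
```

Keeping everything needed for scoring inside the model is what allows transform to be called on new data without access to the training set.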
Macros
@detector
OutlierDetectionInterface.@detector
— Macro.
@detector(expr)
An alternative to declaring the detector struct, clean! method and keyword constructor, directly referring to MLJModelInterface.@mlj_model.
Parameters
expr::Expr
An expression of a mutable struct defining a detector's hyperparameters.
@default_frontend
OutlierDetectionInterface.@default_frontend
— Macro.
@default_frontend(detector)
Define a data front end for a given detector, which transforms the input data to OutlierDetectionInterface.Data.
Parameters
detector::T where T<:Detector
The detector datatype for which the data frontend should be defined.
@default_metadata
OutlierDetectionInterface.@default_metadata
— Macro.
@default_metadata(detector, uuid)
Define the default metadata for a given detector, which is useful when a detector is exported into MLJModels, such that it can be directly loaded with MLJ. By default, we assume that a detector is exported on a package's top-level and we set the load_path accordingly.
Additionally, we assume the following metadata defaults:
- package_name is equal to the @__MODULE__ where @default_metadata is used
- The detector is implemented in Julia, is_pure_julia=true
- The detector is not a wrapper, is_wrapper=false
- The package lives in the OutlierDetectionJL GitHub organization
Parameters
detector::T where T<:Detector
The detector datatype for which the metadata should be defined.
uuid::String
The UUID of the detector's package.