ITMO_FS.filters.multivariate.DISRWithMassive

class ITMO_FS.filters.multivariate.DISRWithMassive(expected_size=None)

Creates a DISR (Double Input Symmetric Relevance) feature selection filter. DISR is based on the kASSI criterion, which aims to maximize mutual information while avoiding large multivariate density estimation. It approximates the information of a set of variables by the average information of its two-feature subsets. This formulation thus handles feature complementarity up to order two while preserving the same computational complexity as the MRMR and CMIM criteria. The DISR calculation is done using a graph-based solution.
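The pairwise averaging described above can be illustrated with a small sketch. It assumes discrete features and uses the symmetric relevance of a feature pair, SR(Xi, Xj; Y) = I(Xi, Xj; Y) / H(Xi, Xj, Y), averaged over all pairs in a candidate subset. The helper names (entropy, symmetric_relevance, subset_score) are hypothetical and are not part of ITMO_FS; the library's own graph-based solver is not reproduced here.

```python
from collections import Counter
from itertools import combinations
from math import log2


def entropy(*columns):
    """Joint Shannon entropy (in bits) of one or more discrete columns."""
    joint = list(zip(*columns))
    n = len(joint)
    return -sum((c / n) * log2(c / n) for c in Counter(joint).values())


def symmetric_relevance(xi, xj, y):
    """Mutual information of the pair (xi, xj) with y, normalized by joint entropy."""
    h_joint = entropy(xi, xj, y)
    if h_joint == 0:
        return 0.0  # all columns are constant, so there is nothing to measure
    mutual_info = entropy(xi, xj) + entropy(y) - h_joint
    return mutual_info / h_joint


def subset_score(columns, y):
    """Average symmetric relevance over all two-feature combinations (illustrative only)."""
    pairs = list(combinations(columns, 2))
    if not pairs:
        raise ValueError("at least two feature columns are required")
    return sum(symmetric_relevance(a, b, y) for a, b in pairs) / len(pairs)


# XOR target: neither feature is informative alone, but the pair determines y.
xi = [0, 0, 1, 1]
xj = [0, 1, 0, 1]
y = [0, 1, 1, 0]
print(symmetric_relevance(xi, xj, y))  # 0.5
```

The XOR case shows why scoring pairs matters: each feature alone carries no information about y, yet the pair scores 0.5, which is the kind of order-two complementarity the criterion is meant to capture.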

Parameters: expected_size (int) – Expected size of the selected feature subset.

Notes

For more details see this paper.

Examples

>>> from ITMO_FS.filters.multivariate import DISRWithMassive
>>> import numpy as np
>>> X = np.array([[1, 2, 3, 3, 1], [2, 2, 3, 3, 2], [1, 3, 3, 1, 3], [3, 1, 3, 1, 4], [4, 4, 3, 1, 5]], dtype=int)
>>> y = np.array([1, 2, 3, 4, 5], dtype=int)
>>> disr = DISRWithMassive(3)
>>> print(disr.fit_transform(X, y))

__init__(expected_size=None)

Initializes the filter with the given expected_size.

fit(X, y, feature_names=None)

Fits the filter.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples.
  • y (array-like, shape (n_samples, )) – The target values.
  • feature_names (list of strings, optional) – Names of the features, if you want to define them.
Returns: None

fit_transform(X, y, feature_names=None)

Fits the filter and transforms the given dataset X.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples.
  • y (array-like, shape (n_samples, )) – The target values.
  • feature_names (list of strings, optional) – Names of the features, if you want to define them.
Returns: The dataset X sliced to the features selected by the filter.

transform(X)

Transforms the given data by slicing it with the selected features.

Parameters: X (array-like, shape (n_samples, n_features)) – The input samples to transform.
Returns: Transformed 2D numpy array.
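
As a rough illustration of what slicing with the selected features means, the sketch below keeps only chosen columns of a samples-by-features array. The index list selected is a hypothetical stand-in for whatever the filter chose during fitting; it is not an attribute or output taken from the library.

```python
import numpy as np

# Rows are samples, columns are features.
X = np.array([[1, 2, 3, 3, 1],
              [2, 2, 3, 3, 2],
              [1, 3, 3, 1, 3]])

# Hypothetical feature indices, standing in for the result of fitting.
selected = [0, 3, 4]

# Keep every sample but only the selected feature columns.
X_reduced = X[:, selected]
print(X_reduced.shape)  # (3, 3)
```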