ITMO_FS.filters.sparse.UDFS

class ITMO_FS.filters.sparse.UDFS(p, c=5, k=5, gamma=1, l=1e-06, max_iterations=1000, epsilon=1e-05)

Performs the Unsupervised Discriminative Feature Selection algorithm.

Parameters:
  • p (int) – Number of features to select.
  • c (int, optional) – Number of clusters to find.
  • k (int, optional) – Number of nearest neighbors to use when building the graph.
  • gamma (float, optional) – Regularization term in the target function.
  • l (float, optional) – Regularization parameter that keeps the matrix used in computing B invertible.
  • max_iterations (int, optional) – Maximum number of iterations to perform.
  • epsilon (positive float, optional) – Required residual between the target function values of consecutive iterations. If the residual is smaller than epsilon, the algorithm is considered to have converged.
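
The role of l can be illustrated with a small NumPy sketch. It assumes, following the usual formulation of UDFS, that the matrix being inverted is the Gram matrix of a centered local patch (a sample plus its k nearest neighbors). Whether ITMO_FS builds the matrix exactly this way is an assumption, and the names patch, gram and regularized are illustrative only, not part of the library.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 5

# Hypothetical local patch: one sample plus its k nearest neighbors, 20 features.
patch = rng.normal(size=(k + 1, 20))

# Centering makes the rows sum to zero, so the Gram matrix loses one rank.
centered = patch - patch.mean(axis=0)
gram = centered @ centered.T              # shape (k + 1, k + 1)

# Adding l * I shifts every eigenvalue up by l and restores full rank.
l = 1e-6
regularized = gram + l * np.eye(k + 1)

# Singular values below 1e-9 are treated as zero when estimating the rank.
rank_before = np.linalg.matrix_rank(gram, tol=1e-9)
rank_after = np.linalg.matrix_rank(regularized, tol=1e-9)

# The inverse is now well defined (illustrative stand-in for B).
B = np.linalg.inv(regularized)
print(rank_before, rank_after)
```

A larger l makes the inverse better conditioned but moves the result further from the unregularized problem, which is presumably why the default is small.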

Notes

For more details see this paper.

__init__(p, c=5, k=5, gamma=1, l=1e-06, max_iterations=1000, epsilon=1e-05)

Initialize self. See help(type(self)) for accurate signature.

feature_ranking(W)

Calculate the UDFS score for a feature weight matrix.

Parameters:
  • W (array-like, shape (n_features, c)) – Feature weight matrix.
Returns:

indices – Indices of p selected features.

Return type:

array-like, shape (p,)

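
As a rough illustration of how a weight matrix can be turned into a selection, the sketch below scores each feature by the L2 norm of its row in W and keeps the p largest, which is the ranking rule described for UDFS in the literature. The helper rank_by_row_norm is hypothetical and not part of ITMO_FS. The library's own scoring and the order of the returned indices may differ.

```python
import numpy as np

def rank_by_row_norm(W, p):
    """Hypothetical helper: indices of the p rows of W with the largest L2 norm."""
    scores = np.linalg.norm(W, axis=1)       # one score per feature (row of W)
    return np.argsort(scores)[::-1][:p]      # descending order, keep the top p

# Toy weight matrix: 4 features, c = 2 clusters.
W = np.array([[0.1, 0.0],
              [2.0, 1.0],
              [0.0, 0.5],
              [1.0, 1.0]])

selected = rank_by_row_norm(W, p=2)
print(selected)
```

Rows with a norm close to zero contribute little to any cluster indicator, so the corresponding features are the first to be dropped.
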
run(X, y=None)

Fits the filter.

Parameters:
  • X (numpy array, shape (n_samples, n_features)) – The training input samples.
  • y (numpy array, optional) – The target values (ignored).
Returns:

W – Feature weight matrix.

Return type:

array-like, shape (n_features, c)

Examples

>>> from ITMO_FS.filters.sparse import UDFS
>>> from sklearn.datasets import make_classification
>>> import numpy as np
>>> dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
>>> data, target = np.array(dataset[0]), np.array(dataset[1])
>>> model = UDFS(p=5, c=2)
>>> weights = model.run(data)
>>> print(model.feature_ranking(weights))