ITMO_FS.filters.sparse.NDFS

class ITMO_FS.filters.sparse.NDFS(p, c=5, k=5, alpha=1, beta=1, gamma=1000000000.0, sigma=1, max_iterations=1000, epsilon=1e-05)

Performs the Nonnegative Discriminative Feature Selection algorithm.

Parameters:
  • p (int) – Number of features to select.
  • c (int, optional) – Number of clusters to find.
  • k (int, optional) – Number of nearest neighbors to use when building the graph.
  • alpha (float, optional) – Parameter in the objective function.
  • beta (float, optional) – Regularization parameter in the objective function.
  • gamma (float, optional) – Parameter in the objective function that controls the orthogonality condition.
  • sigma (float, optional) – Parameter for the weighting scheme.
  • max_iterations (int, optional) – Maximum number of iterations to perform.
  • epsilon (positive float, optional) – Convergence threshold. If the residual between the objective function values of two consecutive iterations is smaller than epsilon, the algorithm is considered to have converged.
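
The interplay between max_iterations and epsilon can be illustrated with a minimal sketch. The helper iterate_until_converged and its arguments step and objective are hypothetical names introduced here for illustration; this is not the library's internal loop, only a generic version of the stopping rule described above.

```python
def iterate_until_converged(step, objective, state,
                            max_iterations=1000, epsilon=1e-5):
    """Hypothetical helper: apply `step` until the objective stops changing.

    Returns (final_state, iterations_performed, converged_flag).
    """
    previous = objective(state)
    for iteration in range(1, max_iterations + 1):
        state = step(state)
        current = objective(state)
        # Residual between objective values of consecutive iterations.
        if abs(previous - current) < epsilon:
            return state, iteration, True
        previous = current
    # Iteration budget exhausted before the residual dropped below epsilon.
    return state, max_iterations, False


# Toy problem (an assumption for the demo): halve x, objective is x squared.
final_state, n_iter, converged = iterate_until_converged(
    step=lambda x: x / 2, objective=lambda x: x * x, state=1.0
)
short_state, short_iter, short_converged = iterate_until_converged(
    step=lambda x: x / 2, objective=lambda x: x * x, state=1.0,
    max_iterations=3
)
```

With the default budget the toy problem converges well before 1000 iterations; with max_iterations=3 the loop stops on the budget instead, which is why a small epsilon usually needs a generous max_iterations.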

See also

http://www.nlpr.ia.ac.cn/2012papers/gjhy/gh27.pdf

__init__(p, c=5, k=5, alpha=1, beta=1, gamma=1000000000.0, sigma=1, max_iterations=1000, epsilon=1e-05)

Initializes the filter with the parameters described above.

feature_ranking(W)

Calculate the NDFS score for a feature weight matrix.

Parameters:
  • W (array-like, shape (n_features, c)) – Feature weight matrix.
Returns:

indices – Indices of the p selected features.

Return type:

array-like, shape (p)

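
As a rough illustration of how a ranking can be derived from a weight matrix, the sketch below scores each feature by the L2 norm of its row in W and keeps the p highest-scoring indices, which is the ranking criterion described in the NDFS paper. The function rank_by_row_norm is a hypothetical name; that the library's feature_ranking behaves exactly like this is an assumption.

```python
import numpy as np


def rank_by_row_norm(W, p):
    """Hypothetical sketch: pick the p features whose rows in W have the largest L2 norm."""
    W = np.asarray(W, dtype=float)
    # One score per feature: the Euclidean norm of that feature's weight row.
    scores = np.linalg.norm(W, axis=1)
    # Sort by descending score; a stable sort keeps ties in index order.
    order = np.argsort(-scores, kind="stable")
    return order[:p]


# Toy weight matrix with 4 features and c=2 clusters (made-up values).
W = np.array([[0.1, 0.0],
              [2.0, 1.0],
              [0.0, 0.5],
              [1.0, 1.0]])
top = rank_by_row_norm(W, p=2)
```

Here features 1 and 3 have the largest row norms, so they are returned first.
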
run(X, y=None)

Fits the filter.

Parameters:
  • X (numpy array, shape (n_samples, n_features)) – The training input samples.
  • y (numpy array, shape (n_samples) or (n_samples, n_classes), optional) – The target values or their one-hot encoding, used to compute F. If not present, a k-means clustering algorithm is used instead. If present, n_classes should be equal to c.
Returns:

W – Feature weight matrix.

Return type:

array-like, shape (n_features, c)
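
To make the two accepted shapes of y concrete, the sketch below turns a label vector into its one-hot encoding of shape (n_samples, c) and then into a scaled indicator matrix F = Y (YᵀY)^(-1/2), the form used in the NDFS paper. The helpers one_hot and scaled_indicator are hypothetical names, and whether the library builds F in exactly this way is an assumption.

```python
import numpy as np


def one_hot(y, c):
    """Hypothetical helper: encode labels as an (n_samples, c) 0/1 matrix."""
    y = np.asarray(y)
    classes = np.unique(y)
    if len(classes) != c:
        raise ValueError("number of distinct labels must equal c")
    # Compare every label against every class to get the indicator matrix.
    return (y[:, None] == classes[None, :]).astype(float)


def scaled_indicator(Y):
    """Hypothetical helper: F = Y (Y^T Y)^(-1/2) for a one-hot matrix Y."""
    # For one-hot Y, Y^T Y is diagonal with the class sizes on the diagonal,
    # so the inverse square root reduces to scaling each column.
    counts = Y.sum(axis=0)
    return Y / np.sqrt(counts)


labels = np.array([0, 1, 1, 0, 1])   # made-up labels, c=2
Y = one_hot(labels, c=2)
F = scaled_indicator(Y)
```

The columns of F are orthonormal (FᵀF is the identity), which is the orthogonality condition that the gamma parameter is described as controlling.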

Examples

>>> from ITMO_FS.filters.sparse import NDFS
>>> from sklearn.datasets import make_classification
>>> import numpy as np
>>> dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
>>> data, target = np.array(dataset[0]), np.array(dataset[1])
>>> model = NDFS(p=5, c=2)
>>> weights = model.run(data)
>>> print(model.feature_ranking(weights))