ITMO_FS.filters.sparse.MCFS

class ITMO_FS.filters.sparse.MCFS(d, k=5, p=5, scheme='dot', sigma=1)

Performs the Unsupervised Feature Selection for Multi-Cluster Data algorithm.

Parameters:
  • d (int) – Number of features to select.
  • k (int, optional) – Number of clusters to find.
  • p (int, optional) – Number of nearest neighbors to use when building the graph.
  • scheme (str, either '0-1', 'heat' or 'dot', optional) – Weighting scheme to use when building the graph.
  • sigma (float, optional) – Parameter for the heat weighting scheme. Ignored if scheme is not 'heat'.
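To make the p, scheme and sigma parameters more concrete, the sketch below builds a p-nearest-neighbor affinity matrix with the three weighting schemes as they are usually defined in the MCFS literature. The function name knn_graph_weights is hypothetical, and the exact formulas (in particular dividing by sigma rather than by 2*sigma**2 in the heat kernel) are assumptions; the ITMO_FS implementation may differ.

```python
import numpy as np


def knn_graph_weights(X, p=5, scheme="dot", sigma=1.0):
    """Hypothetical helper: symmetric p-nearest-neighbor affinity matrix."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances between samples.
    sq_norms = (X ** 2).sum(axis=1)
    dist2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(dist2, 0.0, out=dist2)  # clip tiny negative rounding errors
    np.fill_diagonal(dist2, np.inf)    # a sample is not its own neighbor

    # Mark the p nearest neighbors of every sample.
    nn = np.argsort(dist2, axis=1)[:, :p]
    mask = np.zeros((n, n), dtype=bool)
    mask[np.repeat(np.arange(n), p), nn.ravel()] = True
    mask |= mask.T  # connect i and j if either one is a neighbor of the other

    if scheme == "0-1":
        weights = np.ones((n, n))
    elif scheme == "heat":
        # Assumed convention: exp(-||x_i - x_j||^2 / sigma).
        weights = np.exp(-dist2 / sigma)
    elif scheme == "dot":
        weights = X @ X.T
    else:
        raise ValueError("scheme must be '0-1', 'heat' or 'dot'")

    # Keep weights only on graph edges.
    return np.where(mask, weights, 0.0)
```

Note that sigma only enters the 'heat' branch, which matches the parameter description above. Dot-product weights are typically used with normalized samples.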

Notes

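For more details, see the paper "Unsupervised Feature Selection for Multi-Cluster Data" by Deng Cai, Chiyuan Zhang and Xiaofei He (KDD 2010).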

__init__(d, k=5, p=5, scheme='dot', sigma=1)

Initialize self. See help(type(self)) for accurate signature.

feature_ranking(W)

Calculate the MCFS score for a feature weight matrix.

Parameters:
  • W (array-like, shape (n_features, k)) – Feature weight matrix.
Returns: indices – Indices of d selected features.
Return type: array-like, shape (d)
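As a rough illustration of the score, the MCFS paper defines the score of feature j as the largest absolute weight in row j of W, and keeps the d features with the highest scores. The sketch below follows that definition; mcfs_ranking is a hypothetical stand-in, and the order in which feature_ranking actually returns the indices is an assumption.

```python
import numpy as np


def mcfs_ranking(W, d):
    """Hypothetical stand-in for feature_ranking: top-d features by MCFS score."""
    W = np.asarray(W, dtype=float)
    # MCFS score of feature j: max over clusters of |W[j, c]|.
    scores = np.abs(W).max(axis=1)
    # Sort features by decreasing score (stable, so ties keep index order).
    order = np.argsort(-scores, kind="stable")
    return order[:d]


# Toy weight matrix with 4 features and k=2 clusters.
W_toy = np.array([[0.1, -0.9],
                  [0.0, 0.0],
                  [0.5, 0.2],
                  [-0.3, 0.05]])
selected = mcfs_ranking(W_toy, d=2)
```

In this toy matrix the scores are 0.9, 0.0, 0.5 and 0.3, so features 0 and 2 are selected.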

run(X, y=None)

Fits the filter.

Parameters:
  • X (numpy array, shape (n_samples, n_features)) – The training input samples.
  • y (numpy array, optional) – The target values (ignored).
Returns: W – Feature weight matrix.
Return type: array-like, shape (n_features, k)
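For orientation, the MCFS paper obtains W in two stages: a spectral embedding of the neighbor graph, followed by one L1-regularized regression of each embedding vector on the features. The sketch below covers only the first stage, under the assumption that the embedding solves the generalized eigenproblem L y = lambda D y for the k smallest eigenvalues. The name spectral_embedding is hypothetical, and this is not necessarily how run is implemented.

```python
import numpy as np


def spectral_embedding(A, k):
    """Hypothetical helper: k smallest generalized eigenpairs of L y = lambda D y.

    A is a symmetric affinity matrix; every node is assumed to have a
    strictly positive degree.
    """
    A = np.asarray(A, dtype=float)
    deg = A.sum(axis=1)
    L = np.diag(deg) - A  # unnormalized graph Laplacian

    # Reduce to a standard symmetric problem: D^{-1/2} L D^{-1/2} v = lambda v.
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order

    # Map back: y = D^{-1/2} v solves the generalized problem.
    Y = d_inv_sqrt[:, None] * eigvecs[:, :k]
    return eigvals[:k], Y
```

Each column of Y would then be regressed on X with a sparsity constraint, and the resulting coefficient vectors stacked as the columns of W, giving the shape (n_features, k). For a connected graph the first eigenvector is constant; whether an implementation discards it is not specified here.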

Examples

from ITMO_FS.filters.sparse import MCFS
from sklearn.datasets import make_classification
import numpy as np

dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
data, target = np.array(dataset[0]), np.array(dataset[1])
model = MCFS(d=5, k=2, scheme='heat')
weights = model.run(data, target)
print(model.feature_ranking(weights))