ITMO_FS.filters.sparse.MCFS
class ITMO_FS.filters.sparse.MCFS(d, k=5, p=5, scheme='dot', sigma=1)
Performs the Unsupervised Feature Selection for Multi-Cluster Data (MCFS) algorithm.
Parameters:
- d (int) – Number of features to select.
- k (int, optional) – Number of clusters to find. Defaults to 5.
- p (int, optional) – Number of nearest neighbors to use while building the graph. Defaults to 5.
- scheme (str, either '0-1', 'heat' or 'dot', optional) – Weighting scheme to use while building the graph. Defaults to 'dot'.
- sigma (float, optional) – Parameter for the heat weighting scheme. Ignored if scheme is not 'heat'. Defaults to 1.
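The three weighting schemes can be illustrated with a minimal sketch. It assumes the usual definitions from the MCFS literature (a constant weight, a heat kernel over squared Euclidean distance, and a dot product); the function name edge_weight is hypothetical, and the exact formulas and normalization used inside ITMO_FS may differ.

```python
import numpy as np


def edge_weight(x_i, x_j, scheme="dot", sigma=1.0):
    """Illustrative weight of a graph edge between two neighboring samples.

    Hypothetical helper, not part of ITMO_FS.
    """
    x_i = np.asarray(x_i, dtype=float)
    x_j = np.asarray(x_j, dtype=float)
    if scheme == "0-1":
        # Every edge between neighbors gets the same weight.
        return 1.0
    if scheme == "heat":
        # Assumed heat kernel: decays with squared distance, scaled by sigma.
        return float(np.exp(-np.sum((x_i - x_j) ** 2) / sigma))
    if scheme == "dot":
        # Dot product of the two sample vectors.
        return float(np.dot(x_i, x_j))
    raise ValueError("scheme must be '0-1', 'heat' or 'dot'")
```

In this sketch sigma only affects the 'heat' branch, which matches the note above that it is ignored for the other schemes.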
Notes
For more details see this paper.
__init__(d, k=5, p=5, scheme='dot', sigma=1)
Initialize self. See help(type(self)) for accurate signature.
feature_ranking(W)
Calculate the MCFS score for a feature weight matrix.
Parameters: W (array-like, shape (n_features, k)) – Feature weight matrix.
Returns: indices – Indices of d selected features.
Return type: array-like, shape (d)
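As a rough illustration of the ranking step, the sketch below assumes the score defined in the original MCFS paper: each feature's score is the largest absolute weight it receives across the k clusters, and the d highest-scoring features are kept. The function name mcfs_ranking_sketch is hypothetical, and the library's own implementation may differ in details such as tie-breaking.

```python
import numpy as np


def mcfs_ranking_sketch(W, d):
    """Return indices of the d features with the highest MCFS score.

    Hypothetical helper, not part of ITMO_FS.
    W has shape (n_features, k).
    """
    W = np.asarray(W, dtype=float)
    # Score of a feature: maximum absolute weight over all clusters.
    scores = np.max(np.abs(W), axis=1)
    # Sort by score in descending order and keep the first d indices.
    return np.argsort(scores)[::-1][:d]


W = np.array([[0.1, -0.9],
              [0.5, 0.2],
              [0.0, 0.05]])
selected = mcfs_ranking_sketch(W, d=2)
```

With this example matrix the per-feature scores are 0.9, 0.5 and 0.05, so features 0 and 1 are selected.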
run(X, y=None)
Fits the filter.
Parameters:
- X (numpy array, shape (n_samples, n_features)) – The training input samples.
- y (numpy array, optional) – The target values (ignored).
Returns: W – Feature weight matrix.
Return type: array-like, shape (n_features, k)
Examples
from ITMO_FS.filters.sparse import MCFS
from sklearn.datasets import make_classification
import numpy as np

# Build a synthetic dataset with 20 features, 4 of them informative.
dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
data, target = np.array(dataset[0]), np.array(dataset[1])
# Select 5 features using 2 clusters and the heat weighting scheme.
model = MCFS(d=5, k=2, scheme='heat')
weights = model.run(data, target)
print(model.feature_ranking(weights))