ITMO_FS.filters.multivariate.DCSF

ITMO_FS.filters.multivariate.DCSF(selected_features, free_features, X, y)

Dynamic change of selected feature with the class scoring criterion. DCSF employs both mutual information and conditional mutual information to find an optimal subset of features. Given a set of already selected features and a set of remaining (free) features on dataset X with labels y, it selects the next feature.

Parameters:
  • selected_features (list of ints) – indices of the already selected features
  • free_features (list of ints) – indices of the free (not yet selected) features
  • X (array-like, shape (n_samples, n_features)) – The training input samples.
  • y (array-like, shape (n_samples, )) – The target values.

Notes

For more details see this paper.
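
The sketch below illustrates how mutual information and conditional mutual information can be combined into a single score for each free feature. It is not the ITMO_FS implementation. It assumes the criterion has the form J(f) = sum over selected s of [I(f; y | s) + I(s; y | f) - I(f; s)], and that X and y hold discrete values. The helper names (entropy, mutual_information, conditional_mutual_information, dcsf_like_scores) are hypothetical and introduced only for this example.

```python
import numpy as np

def entropy(*columns):
    """Shannon entropy (in nats) of the joint distribution of discrete columns."""
    joint = np.column_stack(columns)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))

def mutual_information(a, b):
    # I(a; b) = H(a) + H(b) - H(a, b)
    return entropy(a) + entropy(b) - entropy(a, b)

def conditional_mutual_information(a, b, c):
    # I(a; b | c) = H(a, c) + H(b, c) - H(a, b, c) - H(c)
    return entropy(a, c) + entropy(b, c) - entropy(a, b, c) - entropy(c)

def dcsf_like_scores(selected, free, X, y):
    """Score each free feature against the selected set (assumed criterion form)."""
    scores = []
    for f in free:
        total = 0.0
        for s in selected:
            total += (conditional_mutual_information(X[:, f], y, X[:, s])
                      + conditional_mutual_information(X[:, s], y, X[:, f])
                      - mutual_information(X[:, f], X[:, s]))
        scores.append(total)
    return np.array(scores)

# Toy data: feature 0 is already selected and independent of y,
# feature 1 equals the label, feature 2 duplicates feature 0.
y = np.array([0, 0, 1, 1, 0, 0, 1, 1])
X = np.column_stack([
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 0 (selected)
    y,                         # feature 1 (informative)
    [0, 1, 0, 1, 0, 1, 0, 1],  # feature 2 (redundant copy of feature 0)
])
selected, free = [0], [1, 2]
scores = dcsf_like_scores(selected, free, X, y)
best = free[int(np.argmax(scores))]  # free feature with the highest score
print(scores, best)
```

Under this assumed form, a feature that adds information about y given the selected ones scores high, while a feature that merely repeats a selected feature is penalised by the I(f; s) term.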

Examples

>>> from ITMO_FS.filters.multivariate import DCSF
>>> from sklearn.datasets import make_classification
>>> from sklearn.preprocessing import KBinsDiscretizer
>>> import numpy as np
>>> dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
>>> est = KBinsDiscretizer(n_bins=10, encode='ordinal')
>>> data, target = np.array(dataset[0]), np.array(dataset[1])
>>> data = est.fit_transform(data)
>>> selected_features = [1, 2]
>>> other_features = [i for i in range(0, data.shape[1]) if i not in selected_features]
>>> print(DCSF(np.array(selected_features), np.array(other_features), data, target))