ITMO_FS.filters.multivariate.MIFS
ITMO_FS.filters.multivariate.MIFS(selected_features, free_features, X, y, beta)
Mutual Information Feature Selection (MIFS) feature scoring criterion. The criterion includes the I(X;Y) term to ensure feature relevance, and adds a penalty, weighted by beta, to enforce low correlation with the features already in the selected set. Given the set of already selected features and the set of remaining (free) features on dataset X with labels y, it scores each remaining feature so that the next feature can be selected.
Parameters:
- selected_features (list of ints) – already selected features
- free_features (list of ints) – free features
- X (array-like, shape (n_samples, n_features)) – The training input samples.
- y (array-like, shape (n_samples, )) – The target values.
- beta (float) – coefficient for the redundancy term
Notes
For more details see `this paper <http
Examples
>>> from ITMO_FS.filters.multivariate import MIFS
>>> from sklearn.preprocessing import KBinsDiscretizer
>>> import numpy as np
>>> X = np.array([[1, 2, 3, 3, 1], [2, 2, 3, 3, 2], [1, 3, 3, 1, 3],
...               [3, 1, 3, 1, 4], [4, 4, 3, 1, 5]], dtype=int)
>>> y = np.array([1, 2, 3, 4, 5], dtype=int)
>>> est = KBinsDiscretizer(n_bins=10, encode='ordinal')
>>> est.fit(X)
KBinsDiscretizer(encode='ordinal', n_bins=10)
>>> X = est.transform(X)
>>> selected_features = [1, 2]
>>> other_features = [i for i in range(0, X.shape[1]) if i not in selected_features]
>>> MIFS(np.array(selected_features), np.array(other_features), X, y, 0.4)
array([0.91021097, 0.403807  , 1.0765663 ])
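
To make the relevance-minus-redundancy description concrete, the following is a minimal standalone sketch of the commonly cited MIFS form, J(f) = I(f; y) - beta * sum over selected s of I(f; s). It is an illustration under stated assumptions, not the ITMO_FS implementation: the helper names `entropy`, `mutual_information`, and `mifs_score_sketch` are hypothetical, mutual information is estimated from empirical counts in nats, and the raw integer data from the example is used directly without the KBinsDiscretizer step.

```python
from collections import Counter
from math import log


def entropy(values):
    """Shannon entropy (in nats) of a sequence of discrete values."""
    n = len(values)
    return -sum(c / n * log(c / n) for c in Counter(values).values())


def mutual_information(a, b):
    """I(a; b) = H(a) + H(b) - H(a, b) for two discrete sequences."""
    return entropy(a) + entropy(b) - entropy(list(zip(a, b)))


def mifs_score_sketch(selected, free, X, y, beta):
    """Hypothetical helper: I(f; y) - beta * sum_s I(f; s) per free feature."""
    columns = list(zip(*X))  # columns[j] holds feature j across all samples
    scores = []
    for f in free:
        relevance = mutual_information(columns[f], y)
        redundancy = sum(
            mutual_information(columns[f], columns[s]) for s in selected
        )
        scores.append(relevance - beta * redundancy)
    return scores


# Same data as in the example above, kept as plain Python lists.
X = [[1, 2, 3, 3, 1], [2, 2, 3, 3, 2], [1, 3, 3, 1, 3],
     [3, 1, 3, 1, 4], [4, 4, 3, 1, 5]]
y = [1, 2, 3, 4, 5]

scores = mifs_score_sketch([1, 2], [0, 3, 4], X, y, beta=0.4)
print([round(s, 4) for s in scores])  # [0.9102, 0.4038, 1.0766]
```

On this data the sketch gives the same scores as the example output. Feature 4 scores highest: it takes a distinct value in every sample, so it is maximally informative about y, and it is penalized only for its overlap with feature 1, since the constant feature 2 carries no information.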