ITMO_FS.filters.multivariate.generalizedCriteria
ITMO_FS.filters.multivariate.generalizedCriteria(selected_features, free_features, X, y, beta, gamma)
This feature scoring criterion is a linear combination of relevance, redundancy, and conditional dependency terms. Given a set of already selected features and a set of remaining (free) features on dataset X with labels y, it selects the next feature.
Parameters:
- selected_features (list of ints) – already selected features
- free_features (list of ints) – free features
- X (array-like, shape (n_samples, n_features)) – The training input samples.
- y (array-like, shape (n_samples,)) – The target values.
- beta (float) – coefficient for the redundancy term
- gamma (float) – coefficient for the conditional dependency term
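To make the roles of beta and gamma concrete, the following is a minimal NumPy sketch of the criterion as described above: relevance, minus beta times the summed redundancy, plus gamma times the summed conditional dependency. It is not the library's implementation. The helper names (entropy, mutual_information, conditional_mutual_information, generalized_criteria_sketch) are hypothetical, the inputs are assumed to be already discretized, and returning one score per free feature is an assumption about the output format.

```python
import numpy as np


def entropy(*columns):
    """Shannon entropy (in nats) of the joint distribution of discrete columns."""
    joint = np.column_stack(columns)
    _, counts = np.unique(joint, axis=0, return_counts=True)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p))


def mutual_information(a, b):
    # I(A;B) = H(A) + H(B) - H(A,B)
    return entropy(a) + entropy(b) - entropy(a, b)


def conditional_mutual_information(a, b, c):
    # I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C)
    return entropy(a, c) + entropy(b, c) - entropy(a, b, c) - entropy(c)


def generalized_criteria_sketch(selected_features, free_features, X, y, beta, gamma):
    """Hypothetical re-implementation: one score per free feature (assumed output)."""
    X = np.asarray(X)
    y = np.asarray(y)
    scores = []
    for k in free_features:
        # Relevance of the candidate feature to the labels.
        relevance = mutual_information(X[:, k], y)
        # Redundancy with every already selected feature.
        redundancy = sum(
            mutual_information(X[:, j], X[:, k]) for j in selected_features
        )
        # Dependency with selected features, conditioned on the labels.
        conditional = sum(
            conditional_mutual_information(X[:, j], X[:, k], y)
            for j in selected_features
        )
        scores.append(relevance - beta * redundancy + gamma * conditional)
    return np.array(scores)
```

Under this sketch, the free feature with the highest score would be the natural candidate to select next, for example via `free_features[np.argmax(scores)]`.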
Notes
See the original paper [1] for more details.
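For reference, the generalized criterion in [1] is usually written as below, where S is the set of already selected features, X_k is a candidate feature, Y is the label, and I denotes (conditional) mutual information. Mapping beta and gamma onto this function's parameters is assumed from their descriptions above; it is not confirmed against the library source.

```latex
J(X_k) = I(X_k; Y) \;-\; \beta \sum_{X_j \in S} I(X_j; X_k) \;+\; \gamma \sum_{X_j \in S} I(X_j; X_k \mid Y)
```

In the framework of [1], different choices of beta and gamma correspond to different named filters; for example, beta = gamma = 0 reduces the score to plain mutual information with the label.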
References
[1] Brown, Gavin et al. “Conditional Likelihood Maximisation: A Unifying Framework for Information Theoretic Feature Selection.” JMLR 2012.
Examples
>>> from ITMO_FS.filters.multivariate import generalizedCriteria
>>> from sklearn.datasets import make_classification
>>> from sklearn.preprocessing import KBinsDiscretizer
>>> import numpy as np
>>> dataset = make_classification(n_samples=100, n_features=20, n_informative=4, n_redundant=0, shuffle=False)
>>> # Discretize the continuous features into 10 ordinal bins
>>> est = KBinsDiscretizer(n_bins=10, encode='ordinal')
>>> data, target = np.array(dataset[0]), np.array(dataset[1])
>>> est.fit(data)
>>> data = est.transform(data)
>>> selected_features = [1, 2]
>>> other_features = [i for i in range(0, data.shape[1]) if i not in selected_features]
>>> print(generalizedCriteria(np.array(selected_features), np.array(other_features), data, target, 0.4, 0.3))