ITMO_FS.wrappers.randomized.TPhMGWO

class ITMO_FS.wrappers.randomized.TPhMGWO(estimator, measure, wolf_number=10, seed=1, alpha=0.5, cv=5, iteration_number=30, mp=0.5, binarize='sigmoid')

Grey Wolf optimization with Two-Phase Mutation.

Parameters:
  • estimator (object) – A supervised learning estimator that has a fit(X, y) method and a predict(X) method. The original paper suggests using the k-nearest-neighbors classifier.
  • measure (string or callable) – A standard estimator metric (e.g. ‘f1’ or ‘roc_auc’) or a callable with signature measure(estimator, X, y) that returns a single value.
  • wolf_number (int) – Number of search agents used to find a solution to the feature selection problem.
  • seed (int) – Random seed used to initialize np.random.default_rng().
  • alpha (float) – Weight given to classification accuracy in the fitness function, which is computed as fitness = alpha * score + beta * |selected_features| / |features|, where beta = 1 - alpha.
  • cv (int) – Number of folds in cross-validation.
  • iteration_number (int) – Number of iterations of the algorithm.
  • mp (float) – Mutation probability.
  • binarize (str) – Transformation function to use. Currently only ‘tanh’ and ‘sigmoid’ are supported.
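
The fitness formula given for alpha, and the role of binarize, can be sketched in plain Python. This is a minimal illustration and not the library's implementation. The helper names fitness and binarize_position are hypothetical, and the thresholding scheme (comparing the transfer-function value with a uniform random number) is an assumption based on how binary Grey Wolf variants are commonly described.

```python
import math
import random


def fitness(score, n_selected, n_features, alpha=0.5):
    """Weighted fitness as written in the alpha parameter description."""
    beta = 1.0 - alpha  # the description states alpha = 1 - beta
    return alpha * score + beta * n_selected / n_features


def binarize_position(position, rng, kind="sigmoid"):
    """Map a continuous wolf position to a 0/1 feature mask (assumed scheme)."""
    if kind == "sigmoid":
        transfer = lambda v: 1.0 / (1.0 + math.exp(-v))
    elif kind == "tanh":
        transfer = lambda v: abs(math.tanh(v))
    else:
        raise ValueError("only 'sigmoid' and 'tanh' are handled in this sketch")
    # A feature is kept when its transfer value exceeds a uniform random draw.
    return [1 if transfer(v) > rng.random() else 0 for v in position]


rng = random.Random(1)
mask = binarize_position([2.5, -3.0, 0.1, 4.0], rng)
value = fitness(score=0.9, n_selected=sum(mask), n_features=len(mask))
```

With alpha=0.5, a score of 0.9 and 5 of 20 features selected give 0.5 * 0.9 + 0.5 * 0.25 = 0.575.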

Notes

For more details see this paper.

Examples

>>> import numpy as np
>>> from sklearn.neighbors import KNeighborsClassifier
>>> from ITMO_FS.wrappers.randomized import TPhMGWO
>>> from sklearn.datasets import make_classification
>>> dataset = make_classification(n_samples=100, n_features=20,
... n_informative=5, n_redundant=0, shuffle=False, random_state=42)
>>> x, y = np.array(dataset[0]), np.array(dataset[1])
>>> tphmgwo = TPhMGWO(KNeighborsClassifier(n_neighbors=7),
... measure='accuracy').fit(x, y)
>>> tphmgwo.selected_features_
array([0, 1, 2, 4], dtype=int64)

__init__(estimator, measure, wolf_number=10, seed=1, alpha=0.5, cv=5, iteration_number=30, mp=0.5, binarize='sigmoid')

Initialize self. See help(type(self)) for accurate signature.
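
The measure argument may be a callable with the signature measure(estimator, X, y). A minimal sketch of such a callable follows. MajorityClassifier and accuracy_measure are hypothetical names used only for illustration, and the sketch assumes the estimator passed to the callable has already been fitted, which the documentation above does not state.

```python
class MajorityClassifier:
    """Toy estimator exposing fit(X, y) and predict(X) (illustration only)."""

    def fit(self, X, y):
        labels = list(y)
        # Remember the most frequent training label.
        self.label_ = max(set(labels), key=labels.count)
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def accuracy_measure(estimator, X, y):
    """Callable with the documented signature; returns a single float."""
    predictions = estimator.predict(X)
    correct = sum(1 for p, t in zip(predictions, y) if p == t)
    return correct / len(y)


X = [[0.1], [0.4], [0.35], [0.8]]
y = [0, 1, 1, 1]
model = MajorityClassifier().fit(X, y)
score = accuracy_measure(model, X, y)
```

A callable of this shape could then be passed as measure=accuracy_measure in place of a metric name such as 'accuracy'.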

fit(X, y=None, **fit_params)

Fit the algorithm.

Parameters:
  • X (array-like, shape (n_samples, n_features)) – The training input samples.
  • y (array-like, shape (n_samples,), optional) – The class labels.
  • fit_params (dict, optional) – Additional parameters to pass to the underlying _fit function.
Returns:

Self, i.e. the transformer object.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) – Input samples.
  • y (ndarray of shape (n_samples,), default=None) – Target values.
  • **fit_params (dict) – Additional fit parameters.
Returns:

X_new – Transformed array.

Return type:

ndarray of shape (n_samples, n_features_new)

get_params(deep=True)

Get parameters for this estimator.

Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any

predict(X)

Predict class labels for the input data.

Parameters: X (array-like, shape (n_samples, n_features)) – The input samples.
Returns: The predicted class labels.
Return type: array-like, shape (n_samples,)

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
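
The <component>__<parameter> form can be illustrated with a scikit-learn Pipeline, which follows the same get_params and set_params convention. This sketch uses only scikit-learn classes. The step names 'scale' and 'knn' are arbitrary choices for the example and are not part of ITMO_FS.

```python
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: the pipeline contains two named components.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
])

# Update a parameter of the nested 'knn' component via <component>__<parameter>.
pipe.set_params(knn__n_neighbors=7)

# deep=True also lists parameters of the contained estimators.
deep_params = pipe.get_params(deep=True)
shallow_params = pipe.get_params(deep=False)
n_neighbors = deep_params["knn__n_neighbors"]
```

By the same convention, a parameter of the estimator wrapped by TPhMGWO would presumably be addressed as estimator__<parameter>.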

Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object

transform(X)

Transform given data by slicing it with selected features.

Parameters: X (array-like, shape (n_samples, n_features)) – The input samples.
Returns: The input data restricted to the selected features.
Return type: 2D numpy array
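
Slicing by selected features amounts to column indexing. The following NumPy sketch shows the idea under the assumption that the selected features are stored as integer column indices, as selected_features_ is in the example above. The variable selected is a hypothetical stand-in for that attribute.

```python
import numpy as np

# Three samples with four features each.
X = np.arange(12).reshape(3, 4)

# Hypothetical stand-in for the fitted selected_features_ attribute.
selected = np.array([0, 2])

# Keep every row, but only the selected columns.
X_new = X[:, selected]
```

The result keeps the number of samples and has one column per selected feature.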