ITMO_FS.wrappers.randomized.SimulatedAnnealing

class ITMO_FS.wrappers.randomized.SimulatedAnnealing(classifier, score, seed=1, iteration_number=100, c=1, init_number_of_features=None)

Performs feature selection using simulated annealing.

Parameters:
  • seed (integer) – Random seed passed to np.random.seed().
  • iteration_number (integer) – Number of iterations the algorithm runs.
  • classifier (Classifier instance) –

    Classifier used for training and testing on the provided datasets.

    • Note that the implementation assumes the classifier has fit and predict methods. The default algorithm uses sklearn.neighbors.KNeighborsClassifier.
  • c (integer) – Constant that controls the rate of feature perturbation.
  • init_number_of_features (float) – Number of features in the initial feature subset. By default, 5-10 percent of the total number of features is used.

Notes

For more details see this paper.
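
For orientation, a generic simulated-annealing search over feature subsets can be sketched as follows. This is not the ITMO_FS implementation: the function name anneal_feature_subset, the 1/i cooling schedule, the use of c as the number of features flipped per step, and the toy score are all assumptions made for illustration.

```python
import math
import random


def anneal_feature_subset(score_fn, n_features, iteration_number=100, c=1,
                          init_number_of_features=None, seed=1):
    """Generic simulated-annealing search over feature subsets (illustrative only)."""
    rng = random.Random(seed)
    if init_number_of_features is None:
        # Assumption: start from roughly 5-10% of the features, at least one.
        init_number_of_features = max(1, int(n_features * rng.uniform(0.05, 0.10)))
    current = set(rng.sample(range(n_features), init_number_of_features))
    current_score = score_fn(current)
    best, best_score = set(current), current_score

    for i in range(1, iteration_number + 1):
        temperature = 1.0 / i  # hypothetical cooling schedule
        # Perturb: flip the membership of c randomly chosen features.
        candidate = set(current)
        for f in rng.sample(range(n_features), min(c, n_features)):
            candidate.symmetric_difference_update({f})
        if not candidate:
            continue  # never evaluate an empty subset
        candidate_score = score_fn(candidate)
        delta = candidate_score - current_score
        # Accept improvements always; accept worse subsets with Metropolis probability.
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current, current_score = candidate, candidate_score
            if current_score > best_score:
                best, best_score = set(current), current_score
    return sorted(best), best_score


# Toy score: reward three "informative" features, penalize every other one.
informative = {0, 1, 2}


def toy_score(subset):
    return len(subset & informative) - 0.1 * len(subset - informative)


selected, score = anneal_feature_subset(toy_score, n_features=20, iteration_number=300)
```

In a wrapper setting, score_fn would typically train the classifier on the candidate columns of the training data and evaluate it on the test data, which is why fit takes both splits.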

Examples

>>> from sklearn.datasets import make_classification
>>> from sklearn.metrics import f1_score
>>> from sklearn.model_selection import KFold
>>> from sklearn.neighbors import KNeighborsClassifier
>>> from ITMO_FS.wrappers.randomized import SimulatedAnnealing
>>> x, y = make_classification(1000, 100, n_informative=10, n_redundant=30, n_repeated=10, shuffle=False)
>>> kf = KFold(n_splits=2)
>>> # classifier and score are required arguments; f1_score is an assumed choice of score callable
>>> sa = SimulatedAnnealing(KNeighborsClassifier(), f1_score)
>>> for train_index, test_index in kf.split(x):
...    sa.fit(x[train_index], y[train_index], x[test_index], y[test_index])
...    print(sa.selected_features)
__init__(classifier, score, seed=1, iteration_number=100, c=1, init_number_of_features=None)

Initialize self. See help(type(self)) for accurate signature.

fit(train_x, train_y, test_x, test_y)

Runs the Simulated Annealing algorithm on the specified dataset and fits the classifier.

Parameters:
  • train_x (array-like, shape (n_samples, n_features)) – The input training samples.
  • train_y (array-like, shape (n_samples,)) – The classes for training samples.
  • test_x (array-like, shape (n_samples, n_features)) – The input testing samples.
  • test_y (array-like, shape (n_samples,)) – The classes for testing samples.
Return type: None

predict(test_x)

Predicts labels on the test dataset.

Parameters: test_x (array-like, shape (n_samples, n_features)) – The input testing samples.
Returns: array-like, shape (n_samples, n_selected_features)
Return type: array of feature numbers
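
The shape (n_samples, n_selected_features) is what results from restricting a sample matrix to the selected columns. A minimal NumPy sketch of that restriction is shown below; the index list here is hypothetical and stands in for the selected_features attribute printed in the example above, and whether predict performs this step internally is an assumption.

```python
import numpy as np

# Toy test matrix with 3 samples and 4 features.
X_test = np.arange(12).reshape(3, 4)

# Hypothetical selected feature indices (stand-in for sa.selected_features).
selected_features = [0, 2]

# Keep only the selected columns; rows (samples) are unchanged.
X_reduced = X_test[:, selected_features]
```

The same indexing can be applied to the training matrix so that any downstream model sees a consistent set of columns.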