ITMO_FS.ensembles.model_based.BestSum
-
class ITMO_FS.ensembles.model_based.BestSum(models, cutting_rule, weight_func, metric='f1_micro', cv=3)
Best weighted sum ensemble. The ensemble fits the input models and computes the feature scores as the sum of the models’ feature scores, each weighted by a performance metric (e.g. accuracy) of the corresponding model.
Parameters: - models (collection) – Collection of model objects. Models should have a fit(X, y) method and a field corresponding to feature weights.
- cutting_rule (string or callable) – A cutting rule name defined in GLOB_CR or a callable with signature cutting_rule(features), which should return a list of features ranked by some rule.
- weight_func (callable) – The function to extract weights from the model.
- metric (string or callable) – A standard estimator metric (e.g. ‘f1’ or ‘roc_auc’) or a callable object / function with signature measure(estimator, X, y) which should return only a single value.
- cv (int) – Number of folds in cross-validation.
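The callable forms of weight_func and metric listed above can be sketched as follows. This is only an illustration of the documented signatures: StubModel, squared_coef_weights and accuracy_measure are hypothetical names introduced here and are not part of ITMO_FS.

```python
import numpy as np


class StubModel:
    """Hypothetical stand-in for a fitted linear model (not part of ITMO_FS)."""

    def __init__(self, coef):
        # coef_ has shape (n_classes, n_features), as in linear sklearn models.
        self.coef_ = np.asarray(coef, dtype=float)

    def predict(self, X):
        # Toy rule: predict class 1 when the first feature is positive.
        return (np.asarray(X)[:, 0] > 0).astype(int)


def squared_coef_weights(model):
    """weight_func: extract one non-negative weight per feature."""
    return np.square(model.coef_).sum(axis=0)


def accuracy_measure(estimator, X, y):
    """metric: signature measure(estimator, X, y), returns a single value."""
    return float(np.mean(estimator.predict(X) == np.asarray(y)))


model = StubModel([[1.0, -2.0, 0.0], [0.5, 0.0, 3.0]])
X = np.array([[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [-3.0, 0.0, 0.0]])
y = np.array([1, 0, 0, 0])

feature_weights = squared_coef_weights(model)  # one value per feature
score = accuracy_measure(model, X, y)          # fraction of correct predictions
```

Any callables with these shapes should be usable in place of the lambda and the metric name shown in the example further down, assuming the models expose the attribute that weight_func reads.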
See also
Jeon,H.,S.,Feature,10,3211.
Examples
>>> from ITMO_FS.ensembles import BestSum
>>> from sklearn.svm import SVC
>>> from sklearn.linear_model import LogisticRegression
>>> from sklearn.linear_model import RidgeClassifier
>>> import numpy as np
>>> models = [SVC(kernel='linear'),
...           LogisticRegression(),
...           RidgeClassifier()]
>>> x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
...               [3, 1, 4, 3, 1], [3, 1, 2, 3, 1]])
>>> y = np.array([1, 2, 2, 1, 2])
>>> bs = BestSum(models, ("K best", 2),
...              lambda model: np.square(model.coef_).sum(axis=0), cv=2).fit(x, y)
>>> bs.selected_features_
array([0, 2], dtype=int64)
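The weighted sum described in the class summary can be sketched in plain NumPy. This is a minimal sketch, assuming each model's feature weights are scaled by that model's performance value and then summed; the library may normalise or combine the values differently. The function names best_sum_scores and k_best and the numbers below are illustrative only.

```python
import numpy as np


def best_sum_scores(model_weights, performances):
    """Sum each model's feature weights scaled by that model's performance.

    model_weights: shape (n_models, n_features)
    performances:  shape (n_models,), e.g. a mean cross-validated metric
    """
    model_weights = np.asarray(model_weights, dtype=float)
    performances = np.asarray(performances, dtype=float)
    # Broadcast one performance value across each model's row of weights.
    return (model_weights * performances[:, np.newaxis]).sum(axis=0)


def k_best(scores, k):
    """Indices of the k highest scores, returned in ascending index order."""
    top = np.argsort(scores)[::-1][:k]
    return np.sort(top)


weights = [[0.2, 0.1, 0.7],   # feature weights from a first model
           [0.5, 0.3, 0.2]]   # feature weights from a second model
perf = [0.9, 0.6]             # assumed per-model performance values

scores = best_sum_scores(weights, perf)
selected = k_best(scores, 2)
```

Here the third feature ends up with the largest combined score because the better-performing model weights it heavily.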
-
__init__(models, cutting_rule, weight_func, metric='f1_micro', cv=3) Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y=None, **fit_params) Fit the algorithm.
Parameters: - X (array-like, shape (n_samples, n_features)) – The training input samples.
- y (array-like, shape (n_samples,), optional) – The class labels.
- fit_params (dict, optional) – Additional parameters to pass to underlying _fit function.
Returns: Self, i.e. the transformer object.
-
fit_transform(X, y=None, **fit_params) Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters: - X ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) – Input samples.
- y (ndarray of shape (n_samples,), default=None) – Target values.
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
-
get_params(deep=True) Get parameters for this estimator.
Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
-
set_params(**params) Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object
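The <component>__<parameter> naming convention can be illustrated with a toy class. ToyEstimator is a hypothetical stand-in written for this sketch; it mimics only the key-routing idea and is not how scikit-learn or ITMO_FS actually implement set_params.

```python
class ToyEstimator:
    """Hypothetical stand-in that mimics the <component>__<parameter> convention."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        for key, value in params.items():
            # Split "clf__C" into ("clf", "__", "C"); plain keys have no separator.
            name, sep, sub_key = key.partition("__")
            if sep:
                # Nested key: forward the remainder to the named component.
                self.params[name].set_params(**{sub_key: value})
            else:
                self.params[name] = value
        return self


inner = ToyEstimator(C=1.0)
outer = ToyEstimator(clf=inner, cv=3)

# Update a top-level parameter and a parameter of the nested component at once.
outer.set_params(cv=5, clf__C=0.1)
```

Returning self from set_params keeps calls chainable, which matches the documented return value.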
-
transform(X) Transform given data by slicing it with selected features.
Parameters: X (array-like, shape (n_samples, n_features)) – The training input samples.
Returns: Transformed 2D numpy array.
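Slicing by selected features amounts to column indexing. A minimal sketch in plain NumPy, assuming the selected feature indices are available as an integer array such as the selected_features_ attribute shown in the class example; the values below are illustrative.

```python
import numpy as np

X = np.array([[3, 3, 3, 2, 2],
              [3, 3, 1, 2, 3],
              [1, 3, 5, 1, 1]])

# Assumed to play the role of selected_features_ after fitting.
selected_features = np.array([0, 2])

# Keep every row, but only the selected columns.
X_new = X[:, selected_features]
```

The result has the same number of rows as X and one column per selected feature.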