ITMO_FS.filters.univariate.UnivariateFilter
-
class ITMO_FS.filters.univariate.UnivariateFilter(measure, cutting_rule=('Best by percentage', 1.0))
Basic interface for using univariate measures for feature selection. The available measures are listed in ITMO_FS.filters.univariate.measures. You can also provide your own measure, as long as it follows the argument scheme for measures: it takes two arguments, x and y, and returns scores for all the features in dataset x. The same applies to cutting rules: a custom rule must follow the cutting-rule signature given under Parameters.
Parameters: - measure (string or callable) – A measure name defined in GLOB_MEASURE, or a callable with signature measure(x, y), where x is the sample dataset and y holds the labels of the dataset samples, that returns a list of metric values, one for each feature in the dataset.
- cutting_rule (string or callable) – A cutting rule name defined in GLOB_CR, or a callable with signature cutting_rule(features) that returns a list of features ranked by some rule.
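The callable contracts described above can be sketched without the library. The block below is a minimal illustration, not ITMO_FS code: abs_pearson_measure and top_k_rule are hypothetical names, and the sketch assumes that a cutting rule receives an array of per-feature scores and returns the indices of the features to keep. The source only says the rule takes "features", so check the actual argument format before passing a custom rule to UnivariateFilter.

```python
import numpy as np


def abs_pearson_measure(x, y):
    """Hypothetical custom measure: |Pearson correlation| of each feature with y.

    Follows the documented scheme: takes (x, y), returns one score per feature.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc = x - x.mean(axis=0)          # center every feature column
    yc = y - y.mean()                # center the labels
    denom = np.sqrt((xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns (or constant labels) have zero variance; score them 0.
    safe = np.where(denom == 0, 1.0, denom)
    scores = np.abs(xc.T @ yc) / safe
    return np.where(denom == 0, 0.0, scores)


def top_k_rule(k):
    """Hypothetical cutting rule factory.

    ASSUMPTION: the rule is called with an array of per-feature scores and
    returns the indices of the k best features, best first.
    """
    def rule(scores):
        return np.argsort(scores)[::-1][:k]
    return rule


x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
              [3, 1, 4, 3, 1], [3, 1, 2, 3, 1]])
y = np.array([1, 3, 2, 1, 2])

scores = abs_pearson_measure(x, y)   # one value per column of x
selected = top_k_rule(2)(scores)     # indices of the two highest scores
```

If the assumed argument format holds, the two callables would be passed as UnivariateFilter(abs_pearson_measure, top_k_rule(2)).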
Examples
>>> import numpy as np
>>> from ITMO_FS.filters.univariate import select_k_best
>>> from ITMO_FS.filters.univariate import UnivariateFilter
>>> from ITMO_FS.filters.univariate import f_ratio_measure
>>> x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
...               [3, 1, 4, 3, 1], [3, 1, 2, 3, 1]])
>>> y = np.array([1, 3, 2, 1, 2])
>>> filter = UnivariateFilter(f_ratio_measure,
...                           select_k_best(2)).fit(x, y)
>>> filter.selected_features_
array([4, 2], dtype=int64)
>>> filter.feature_scores_
array([0.6 , 0.2 , 1.  , 0.12, 5.4 ])
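To make the scores in the example easier to interpret, here is a NumPy-only sketch of a between-class to within-class sum-of-squares ratio. It is a reconstruction that happens to reproduce the feature_scores_ shown above, not the library's own f_ratio_measure, whose implementation may differ (for example in how it handles a zero within-class sum). The name f_ratio_sketch is hypothetical.

```python
import numpy as np


def f_ratio_sketch(x, y):
    """Between-class / within-class sum of squares, computed per feature.

    ASSUMPTION: this un-normalized ratio is only inferred from the documented
    output; it is not taken from the ITMO_FS source. Features whose
    within-class sum is zero would produce a division by zero here.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    overall_mean = x.mean(axis=0)
    between = np.zeros(x.shape[1])
    within = np.zeros(x.shape[1])
    for label in np.unique(y):
        group = x[y == label]                 # rows belonging to this class
        group_mean = group.mean(axis=0)
        between += group.shape[0] * (group_mean - overall_mean) ** 2
        within += ((group - group_mean) ** 2).sum(axis=0)
    return between / within


x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
              [3, 1, 4, 3, 1], [3, 1, 2, 3, 1]])
y = np.array([1, 3, 2, 1, 2])

scores = f_ratio_sketch(x, y)
best_two = np.argsort(scores)[::-1][:2]       # two highest-scoring features
```

On this data the sketch yields scores close to [0.6, 0.2, 1.0, 0.12, 5.4], and the two best features are 4 and 2, in line with the documented output.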
-
__init__(measure, cutting_rule=('Best by percentage', 1.0))
Initialize self. See help(type(self)) for accurate signature.
-
fit(X, y=None, **fit_params)
Fit the algorithm.
Parameters: - X (array-like, shape (n_samples, n_features)) – The training input samples.
- y (array-like, shape (n_samples,), optional) – The class labels.
- fit_params (dict, optional) – Additional parameters to pass to underlying _fit function.
Returns: self, i.e. the fitted transformer object.
-
fit_transform(X, y=None, **fit_params)
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
Parameters: - X ({array-like, sparse matrix, dataframe} of shape (n_samples, n_features)) – Input samples.
- y (ndarray of shape (n_samples,), default=None) – Target values.
- **fit_params (dict) – Additional fit parameters.
Returns: X_new – Transformed array.
Return type: ndarray of shape (n_samples, n_features_new)
-
get_params(deep=True)
Get parameters for this estimator.
Parameters: deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
Returns: params – Parameter names mapped to their values.
Return type: mapping of string to any
-
set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as pipelines). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
Parameters: **params (dict) – Estimator parameters.
Returns: self – Estimator instance.
Return type: object
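The <component>__<parameter> form is the scikit-learn estimator convention. The sketch below shows it on a plain scikit-learn Pipeline; it deliberately uses only scikit-learn classes, and the step names "scale" and "clf" are arbitrary choices for the illustration. It assumes that UnivariateFilter inherits get_params and set_params from scikit-learn's base estimator, in which case a filter placed in a pipeline step would be addressed the same way.

```python
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# A nested object: each step is reachable through its name.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression()),
])

# "<component>__<parameter>": update parameter C of the step named "clf".
pipe.set_params(clf__C=0.5)

# get_params(deep=True) exposes nested parameters under the same keys.
params = pipe.get_params(deep=True)
nested_c = params["clf__C"]
```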
-
transform(X)
Transform given data by slicing it with selected features.
Parameters: X (array-like, shape (n_samples, n_features)) – The input samples to transform.
Returns: The transformed 2D numpy array.
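"Slicing with selected features" amounts to column indexing. The following NumPy sketch illustrates the idea using the data and the selected_features_ value from the Examples section; it is not the library's code, and whether transform keeps the ranked order of the indices or sorts them first is an assumption to verify.

```python
import numpy as np

x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
              [3, 1, 4, 3, 1], [3, 1, 2, 3, 1]])

# Value of selected_features_ reported in the Examples section.
selected = np.array([4, 2])

# Keep every row, but only the selected columns (in the given order).
x_new = x[:, selected]
```

The result has shape (n_samples, n_features_new), here (5, 2).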