ITMO_FS.filters.univariate.reliefF_measure

ITMO_FS.filters.univariate.reliefF_measure(x, y, k_neighbors=1)

Calculate the ReliefF measure for each feature. Higher values indicate more important features.

Note: Only for complete x (an x with no missing values). Rather than repeating the algorithm m times for a user-defined m, it is implemented exhaustively (i.e. n times, once for each instance) for relatively small n (up to one thousand).
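The exhaustive scheme described in the note can be sketched as follows. This is an illustrative sketch only, not the ITMO_FS implementation: the function name relieff_sketch, the Manhattan distance, the range-based scaling of feature differences and the prior-weighted handling of misses are all assumptions, so its scores need not match those returned by reliefF_measure.

```python
import numpy as np

def relieff_sketch(x, y, k_neighbors=1):
    """Illustrative exhaustive ReliefF (hypothetical helper, not the library code)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y)
    n_samples, n_features = x.shape

    # Assumption: feature differences are scaled by each feature's range.
    span = x.max(axis=0) - x.min(axis=0)
    span[span == 0] = 1.0  # constant features then contribute zero difference

    classes, counts = np.unique(y, return_counts=True)
    priors = dict(zip(classes, counts / n_samples))

    scores = np.zeros(n_features)
    for i in range(n_samples):  # exhaustive: once for every instance
        diffs = np.abs(x - x[i]) / span  # per-feature differences to instance i
        dists = diffs.sum(axis=1)        # assumption: Manhattan distance

        for c in classes:
            members = np.flatnonzero(y == c)
            members = members[members != i]  # never pick the instance itself
            if members.size == 0:
                continue
            k = min(k_neighbors, members.size)
            nearest = members[np.argsort(dists[members])[:k]]
            mean_diff = diffs[nearest].mean(axis=0)
            if c == y[i]:
                scores -= mean_diff  # nearest hits: differing values lower the score
            else:
                weight = priors[c] / (1.0 - priors[y[i]])
                scores += weight * mean_diff  # nearest misses: differing values raise it
    return scores / n_samples

# Toy data: feature 0 separates the two classes, feature 1 does not.
toy_x = np.array([[0, 1], [0, 2], [1, 1], [1, 2]])
toy_y = np.array([0, 0, 1, 1])
toy_scores = relieff_sketch(toy_x, toy_y)
print(toy_scores)
```

On the toy data, the separating feature receives the higher score, which matches the intuition that a relevant feature differs across classes and agrees within a class.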

Parameters:
  • x (array-like, shape (n_samples, n_features)) – The input samples.
  • y (array-like, shape (n_samples,)) – The classes for the samples.
  • k_neighbors (int, optional) – The number of neighbors to consider when assigning feature importance scores. More neighbors result in more accurate scores but take longer to compute. Selecting k nearest hits and misses, rather than a single hit and miss, is the main difference from Relief and makes the algorithm more robust to noise.
Returns:

feature scores

Return type:

array-like, shape (n_features,)

See also

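R.J. Urbanowicz et al. Relief-based feature selection: Introduction and review. Journal of Biomedical Informatics 85 (2018) 189–203.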

Examples

>>> from ITMO_FS.filters.univariate import reliefF_measure
>>> import numpy as np
>>> x = np.array([[3, 3, 3, 2, 2], [3, 3, 1, 2, 3], [1, 3, 5, 1, 1],
... [3, 1, 4, 3, 1], [3, 1, 2, 3, 1], [1, 2, 1, 4, 2], [4, 3, 2, 3, 1]])
>>> y = np.array([1, 2, 2, 1, 2, 1, 2])
>>> reliefF_measure(x, y)
array([-0.14285714, -0.57142857,  0.10714286, -0.14285714,  0.07142857])
>>> reliefF_measure(x, y, k_neighbors=2)
array([-0.07142857, -0.17857143, -0.07142857, -0.0952381 , -0.17857143])