recpack.algorithms.stopping_criterion.StoppingCriterion
- class recpack.algorithms.stopping_criterion.StoppingCriterion(loss_function: Callable, minimize: bool = False, stop_early: bool = False, max_iter_no_change: int = 5, min_improvement: float = 0.0, **kwargs)
StoppingCriterion provides a wrapper around any loss function used in the validation stage of an iterative algorithm.
A loss function can be maximized or minimized. If stop_early is True, an EarlyStoppingException is raised when there have been at least max_iter_no_change iterations with no improvement greater than min_improvement.

Example:
import numpy as np
from scipy.sparse import csr_matrix
from recpack.algorithms.stopping_criterion import StoppingCriterion

X_true = csr_matrix(np.array([[1, 0, 1], [1, 1, 0], [1, 1, 0]]))
X_pred = csr_matrix(np.array([[.9, .5, .2], [.3, .2, .05], [.1, .1, .9]]))

# Create a simple loss function that computes
# the sum of the absolute error at each position in the matrix
def my_loss(X_true, X_pred):
    error = np.abs(X_true - X_pred)
    return np.sum(error)

# Construct StoppingCriterion
sc = StoppingCriterion(my_loss, minimize=True)

# Update the stopping criterion
better = sc.update(X_true, X_pred)

# Since this was the first update, it is better than what came before
assert better

# Updating again with the same input gives no improvement
better = sc.update(X_true, X_pred)
assert not better

# Construct a better prediction matrix
# (this would usually be done by a learning algorithm)
X_pred = csr_matrix(np.array([[.9, .3, .8], [.4, .5, .05], [.4, .4, .5]]))
better = sc.update(X_true, X_pred)
assert better

# The best value for the loss function can also be retrieved
print(sc.best_value)
- Parameters
loss_function (Callable) – Metric function used in validation. The function should take two positional arguments, X_true and X_pred, and return a single float value, which will be optimised.
minimize (bool, optional) – True if smaller values of loss_function are better, defaults to False.
stop_early (bool, optional) – Use early stopping to halt learning when overfitting, defaults to False
max_iter_no_change (int, optional) – Number of iterations with no improvement greater than min_improvement before stopping early, defaults to 5.
min_improvement (float, optional) – Improvements smaller than min_improvement are not counted as actual improvements, defaults to 0.01.
kwargs (dict, optional) – The keyword arguments to be passed to the loss function
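The early-stopping bookkeeping implied by max_iter_no_change and min_improvement can be sketched as follows. This is an illustrative stand-alone sketch of the rule for a minimized loss, not the recpack implementation; the function name run_with_early_stopping is hypothetical.

```python
def run_with_early_stopping(losses, max_iter_no_change=5, min_improvement=0.0):
    """Return the iteration index at which early stopping would trigger,
    or None if the sequence of loss values never triggers it."""
    best = float("inf")   # minimizing: smaller values are better
    n_iter_no_change = 0
    for i, value in enumerate(losses):
        if best - value > min_improvement:
            # Improvement large enough to count: reset the counter
            best = value
            n_iter_no_change = 0
        else:
            # No (sufficient) improvement this iteration
            n_iter_no_change += 1
            if n_iter_no_change >= max_iter_no_change:
                return i  # stopping criterion met here
    return None
```

With max_iter_no_change=2, the loss sequence [1.0, 0.9, 0.9, 0.9] improves twice and then stalls, so stopping triggers on the fourth value (index 3).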
Methods
create(criterion_name, **kwargs) – Construct a StoppingCriterion instance, based on the name of the loss function.
update(X_true, X_pred) – Update StoppingCriterion value based on expected and predicted interactions.
Attributes
FUNCTIONS – Available loss function options.
- FUNCTIONS = {'bpr': {'batch_size': 1000, 'loss_function': <function bpr_loss_wrapper>, 'minimize': True}, 'ndcg': {'k': 50, 'loss_function': <function ndcg_k>, 'minimize': False}, 'precision': {'loss_function': <function precision_k>, 'minimize': False}, 'recall': {'k': 50, 'loss_function': <function recall_k>, 'minimize': False}, 'warp': {'loss_function': <function warp_loss_wrapper>, 'minimize': True}}
Available loss function options.
These values can be passed to the create method, and will create the corresponding stopping criterion.

Available loss functions:
bpr: Bayesian Personalised Ranking loss, will be minimized.
warp: Weighted Approximate-Rank Pairwise loss, will be minimized.
recall: Recall@k metric, will be maximized.
precision: Precision@k metric, will be maximized.
ndcg: Normalized Discounted Cumulative Gain, will be maximized.
- classmethod create(criterion_name: str, **kwargs) recpack.algorithms.stopping_criterion.StoppingCriterion
Construct a StoppingCriterion instance, based on the name of the loss function.
BPR and WARP will minimize a loss function; Recall and Normalized Discounted Cumulative Gain will maximize a ranking metric, with a default k of 50.
Keyword arguments of the criteria can be set by passing them as kwargs to this create function.
Example:
import numpy as np
from scipy.sparse import csr_matrix
from recpack.algorithms.stopping_criterion import StoppingCriterion

X_true = csr_matrix(np.array([[1, 0, 1], [1, 1, 0], [1, 1, 0]]))
X_pred = csr_matrix(np.array([[.9, .5, .2], [.3, .2, .05], [.1, .1, .9]]))

# Construct StoppingCriterion, setting k to 2,
# such that when the ndcg function is called, k=2 is passed to it
sc = StoppingCriterion.create('ndcg', k=2)

# update tells us whether the new value was better or not
better = sc.update(X_true, X_pred)
- Parameters
criterion_name (str) – Name of the criterion to use, one of [“bpr”, “warp”, “recall”, “ndcg”]
kwargs – Keyword arguments to pass to the criterion when calling it, useful to pass hyperparameters to the loss functions.
- Raises
ValueError – If the requested criterion is not one of the allowed options.
- Returns
The constructed stopping criterion
- Return type
StoppingCriterion
- update(X_true: scipy.sparse._csr.csr_matrix, X_pred: scipy.sparse._csr.csr_matrix) bool
Update StoppingCriterion value based on expected and predicted interactions.
The two matrices and the kwargs set during init of StoppingCriterion are passed to the loss function; the computed loss or metric value is compared to the previous best value. If the new value is better (bigger if minimize == False, smaller otherwise), it is stored as the new best value and True is returned.

Logs an info statement with the computed loss value, and whether or not it was better.
- Raises
EarlyStoppingException – When early stopping condition is met, if early stopping is enabled.
- Returns
True if value is better than the previous best value, False if not.
- Return type
bool
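update is typically called once per validation round of a training loop, with the loop breaking when early stopping triggers. The control flow below sketches that pattern; the EarlyStoppingException class and fit function here are simplified stand-ins (not the recpack implementations), so the sketch runs without recpack installed.

```python
class EarlyStoppingException(Exception):
    """Stand-in for recpack's EarlyStoppingException, raised by update()
    when the early-stopping condition is met."""

def fit(losses, max_iter_no_change=2):
    """Drive a mock training loop over precomputed per-epoch loss values.
    Returns (number of epochs actually run, best loss seen)."""
    best = float("inf")
    n_no_change = 0
    epochs_run = 0
    try:
        for value in losses:          # one loss value per epoch
            epochs_run += 1
            if value < best:          # update(...) would return True
                best = value
                n_no_change = 0
            else:                     # no improvement this epoch
                n_no_change += 1
                if n_no_change >= max_iter_no_change:
                    raise EarlyStoppingException
    except EarlyStoppingException:
        pass                          # stop training, keep the best value so far
    return epochs_run, best
```

For example, the loss sequence [1.0, 0.9, 0.9, 0.9] with max_iter_no_change=2 improves twice, then stalls for two epochs, so the loop stops after four epochs with a best loss of 0.9.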