recpack.algorithms.stopping_criterion.StoppingCriterion
- class recpack.algorithms.stopping_criterion.StoppingCriterion(loss_function: Callable, minimize: bool = False, stop_early: bool = False, max_iter_no_change: int = 5, min_improvement: float = 0.0, **kwargs)
StoppingCriterion provides a wrapper around any loss function used in the validation stage of an iterative algorithm.
A loss function can be maximized or minimized. If stop_early is True, an EarlyStoppingException is raised when there have been at least max_iter_no_change iterations with no improvement greater than min_improvement.

Example:
import numpy as np
from scipy.sparse import csr_matrix
from recpack.algorithms.stopping_criterion import StoppingCriterion

X_true = csr_matrix(np.array([[1, 0, 1], [1, 1, 0], [1, 1, 0]]))
X_pred = csr_matrix(np.array([[.9, .5, .2], [.3, .2, .05], [.1, .1, .9]]))

# Create a simple loss function that computes
# the sum of the absolute error at each position in the matrix
def my_loss(X_true, X_pred):
    error = np.abs(X_true - X_pred)
    return np.sum(error)

# Construct StoppingCriterion
sc = StoppingCriterion(my_loss, minimize=True)

# Update the stopping criterion
better = sc.update(X_true, X_pred)

# Since this was the first update, it is better than what came before
assert better

# Updating again with the same input gives no improvement
better = sc.update(X_true, X_pred)
assert not better

# Construct a better prediction matrix
# (this would usually be done by a learning algorithm)
X_pred = csr_matrix(np.array([[.9, .3, .8], [.4, .5, .05], [.4, .4, .5]]))
better = sc.update(X_true, X_pred)
assert better

# The best value for the loss function can also be retrieved
print(sc.best_value)
- Parameters
loss_function (Callable) – Metric function used in validation. The function should take two positional arguments, X_true and X_pred, and return a single float value, which will be optimised.
minimize (bool, optional) – True if smaller values of loss_function are better, defaults to False.
stop_early (bool, optional) – Use early stopping to halt learning when overfitting, defaults to False
max_iter_no_change (int, optional) – Number of iterations with no improvement greater than min_improvement before stopping early, defaults to 5.
min_improvement (float, optional) – Improvements smaller than min_improvement are not counted as actual improvements, defaults to 0.01.
kwargs (dict, optional) – The keyword arguments to be passed to the loss function
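The early-stopping bookkeeping implied by max_iter_no_change and min_improvement can be sketched as follows. This is an illustrative stand-alone sketch of the rule for a minimized loss, not the recpack implementation; the function name run_with_early_stopping is hypothetical.

```python
def run_with_early_stopping(losses, max_iter_no_change=5, min_improvement=0.0):
    """Return the iteration index at which early stopping would trigger,
    or None if the sequence of loss values never triggers it."""
    best = float("inf")   # minimizing: smaller values are better
    n_iter_no_change = 0
    for i, value in enumerate(losses):
        if best - value > min_improvement:
            # Improvement large enough to count: reset the counter
            best = value
            n_iter_no_change = 0
        else:
            # No (sufficient) improvement this iteration
            n_iter_no_change += 1
            if n_iter_no_change >= max_iter_no_change:
                return i  # stopping criterion met here
    return None
```

With max_iter_no_change=2, the loss sequence [1.0, 0.9, 0.9, 0.9] improves twice and then stalls, so stopping triggers on the fourth value (index 3).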
Methods
create(criterion_name, **kwargs) – Construct a StoppingCriterion instance, based on the name of the loss function.
update(X_true, X_pred) – Update StoppingCriterion value based on expected and predicted interactions.
Attributes
FUNCTIONS – Available loss function options.
- FUNCTIONS = {'bpr': {'batch_size': 1000, 'loss_function': <function bpr_loss_wrapper>, 'minimize': True}, 'ndcg': {'k': 50, 'loss_function': <function ndcg_k>, 'minimize': False}, 'precision': {'loss_function': <function precision_k>, 'minimize': False}, 'recall': {'k': 50, 'loss_function': <function recall_k>, 'minimize': False}, 'warp': {'loss_function': <function warp_loss_wrapper>, 'minimize': True}}
Available loss function options.
These values can be passed to the create method, and will create the corresponding stopping criterion.

Available loss functions:
bpr: Bayesian Personalised Ranking loss, will be minimized.
warp: Weighted Approximate-Rank Pairwise loss, will be minimized.
recall: Recall@k metric, will be maximized.
precision: Precision@k metric, will be maximized.
ndcg: Normalized Discounted Cumulative Gain, will be maximized.
- classmethod create(criterion_name: str, **kwargs) recpack.algorithms.stopping_criterion.StoppingCriterion
Construct a StoppingCriterion instance, based on the name of the loss function.
BPR and WARP will minimize a loss function; Recall and Normalized Discounted Cumulative Gain will maximize a ranking metric, with a default k of 50.
Keyword arguments of the criteria can be set by passing them as kwargs to this create function.
Example:
import numpy as np
from scipy.sparse import csr_matrix
from recpack.algorithms.stopping_criterion import StoppingCriterion

X_true = csr_matrix(np.array([[1, 0, 1], [1, 1, 0], [1, 1, 0]]))
X_pred = csr_matrix(np.array([[.9, .5, .2], [.3, .2, .05], [.1, .1, .9]]))

# Construct StoppingCriterion, setting k to 2,
# such that when the ndcg function is called, k=2 is passed to it
sc = StoppingCriterion.create('ndcg', k=2)

# update tells us whether the new value was better or not
better = sc.update(X_true, X_pred)
- Parameters
criterion_name (str) – Name of the criterion to use, one of [“bpr”, “warp”, “recall”, “ndcg”]
kwargs – Keyword arguments to pass to the criterion when calling it, useful to pass hyperparameters to the loss functions.
- Raises
ValueError – If the requested criterion is not one of the allowed options.
- Returns
The constructed stopping criterion
- Return type
StoppingCriterion
- update(X_true: scipy.sparse._csr.csr_matrix, X_pred: scipy.sparse._csr.csr_matrix) bool
Update StoppingCriterion value based on expected and predicted interactions.
The two matrices and the kwargs set during init of StoppingCriterion are passed to the loss function; the computed loss or metric value is compared to the previous best value. If the new value is better (bigger if minimize == False, smaller otherwise), it is stored as the new best value and True is returned.

Logs an info statement with the computed loss value, and whether or not it was better.
- Raises
EarlyStoppingException – When early stopping condition is met, if early stopping is enabled.
- Returns
True if value is better than the previous best value, False if not.
- Return type
bool
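update is typically called once per validation round of a training loop, with the loop breaking when early stopping triggers. The control flow below sketches that pattern; the EarlyStoppingException class and fit function here are simplified stand-ins (not the recpack implementations), so the sketch runs without recpack installed.

```python
class EarlyStoppingException(Exception):
    """Stand-in for recpack's EarlyStoppingException, raised by update()
    when the early-stopping condition is met."""

def fit(losses, max_iter_no_change=2):
    """Drive a mock training loop over precomputed per-epoch loss values.
    Returns (number of epochs actually run, best loss seen)."""
    best = float("inf")
    n_no_change = 0
    epochs_run = 0
    try:
        for value in losses:          # one loss value per epoch
            epochs_run += 1
            if value < best:          # update(...) would return True
                best = value
                n_no_change = 0
            else:                     # no improvement this epoch
                n_no_change += 1
                if n_no_change >= max_iter_no_change:
                    raise EarlyStoppingException
    except EarlyStoppingException:
        pass                          # stop training, keep the best value so far
    return epochs_run, best
```

For example, the loss sequence [1.0, 0.9, 0.9, 0.9] with max_iter_no_change=2 improves twice, then stalls for two epochs, so the loop stops after four epochs with a best loss of 0.9.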