recpack.scenarios
The scenarios module contains many of the most commonly encountered evaluation scenarios in recommendation.
A scenario consists of a training and a test dataset, and sometimes also a validation dataset. Both the validation and test datasets are made up of two components: a fold-in set of interactions that is used to predict another, held-out set of interactions.
Each scenario describes a complex situation, e.g. “Train on all user interactions before time T, predict interactions after T+10 using interactions from T+5 until T+10”.
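To make the quoted example concrete, here is a minimal standard-library sketch of that time-based split. It does not use RecPack; the `Interaction` tuple, the `timed_split` function and its offset parameters are hypothetical names chosen for illustration, and the boundaries are assumed to be half-open (an interaction at exactly T+10 counts as held-out).

```python
from typing import List, NamedTuple, Tuple


class Interaction(NamedTuple):
    user: str
    item: str
    timestamp: int


def timed_split(
    interactions: List[Interaction],
    t: int,
    fold_in_offset: int = 5,
    hold_out_offset: int = 10,
) -> Tuple[List[Interaction], List[Interaction], List[Interaction]]:
    """Return (train, fold_in, held_out) for the example scenario.

    train:    timestamp < t
    fold_in:  t + fold_in_offset <= timestamp < t + hold_out_offset
    held_out: timestamp >= t + hold_out_offset
    Interactions in [t, t + fold_in_offset) are not used at all.
    """
    train = [x for x in interactions if x.timestamp < t]
    fold_in = [
        x
        for x in interactions
        if t + fold_in_offset <= x.timestamp < t + hold_out_offset
    ]
    held_out = [x for x in interactions if x.timestamp >= t + hold_out_offset]
    return train, fold_in, held_out


log = [
    Interaction("u1", "a", 3),
    Interaction("u1", "b", 12),  # falls in the unused gap [T, T+5)
    Interaction("u1", "c", 16),
    Interaction("u1", "d", 22),
    Interaction("u2", "a", 8),
    Interaction("u2", "e", 19),
    Interaction("u2", "f", 25),
]

train, fold_in, held_out = timed_split(log, t=10)
print(train)
print(fold_in)
print(held_out)
```

With T = 10, the training set holds the interactions at timestamps 3 and 8, the fold-in set those at 16 and 19, and the held-out set those at 22 and 25.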
Base class for defining an evaluation scenario.
Predict users' future interactions, given information about historical interactions.
Predict a user's next interaction.
Predict users’ last interaction, given information about historical interactions.
Predict (randomly) held-out interactions for all users, with remaining data used for training.
Predict (randomly) held-out interactions of previously unseen users.
Predict future interactions for previously unseen users.
Predict the next interaction(s) for previously unseen users.
A scenario is stateful. Its parameters are passed at initialization. Only after calling Scenario.split with a recpack.matrix.InteractionMatrix can the splits be retrieved through Scenario.full_training_data, Scenario.validation_training_data, Scenario.validation_data and Scenario.test_data.
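The lifecycle described above (configure at initialization, call split, then read the results) can be sketched with a toy class. This is not RecPack's implementation: `ToyScenario`, its `frac_train` and `seed` parameters, and the way it partitions the data are assumptions made for illustration. Only the attribute names `full_training_data` and `test_data` and the `split` method are taken from the text; the validation attributes would follow the same pattern.

```python
import random
from typing import List, Optional, Tuple

Pair = Tuple[str, str]  # (user, item)


class ToyScenario:
    """Toy stateful scenario: parameters at init, data only after split()."""

    def __init__(self, frac_train: float = 0.5, seed: int = 0) -> None:
        # Only parameters are stored here; no data is available yet.
        self.frac_train = frac_train
        self.seed = seed
        self._full_training_data: Optional[List[Pair]] = None
        self._test_data: Optional[Tuple[List[Pair], List[Pair]]] = None

    def split(self, interactions: List[Pair]) -> None:
        # Illustrative partition: shuffle, keep frac_train for training,
        # then halve the remainder into fold-in and held-out sets.
        shuffled = list(interactions)
        random.Random(self.seed).shuffle(shuffled)
        cut = int(len(shuffled) * self.frac_train)
        rest = shuffled[cut:]
        half = len(rest) // 2
        self._full_training_data = shuffled[:cut]
        self._test_data = (rest[:half], rest[half:])

    def _check_split(self) -> None:
        if self._full_training_data is None:
            raise RuntimeError("Call split() before accessing the splits.")

    @property
    def full_training_data(self) -> List[Pair]:
        self._check_split()
        return self._full_training_data

    @property
    def test_data(self) -> Tuple[List[Pair], List[Pair]]:
        # Returned as (fold_in, held_out).
        self._check_split()
        return self._test_data


pairs = [(f"u{i % 4}", f"item{i}") for i in range(8)]
scenario = ToyScenario(frac_train=0.5, seed=42)

try:
    scenario.test_data  # not available yet: split() has not been called
except RuntimeError as err:
    print(err)

scenario.split(pairs)
test_fold_in, test_held_out = scenario.test_data
print(len(scenario.full_training_data), len(test_fold_in), len(test_held_out))
```

Raising an error before split is one possible design choice for the toy class; it makes the "only after calling split" requirement explicit instead of silently returning empty data.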
Splitters
Splitters are the building blocks for the scenarios. A splitter performs a simple split into two InteractionMatrices according to one simple criterion, e.g. “Fold in is all interactions before T, hold out all interactions after T”.
Should you want to implement a new scenario that is not yet supported, these splitters facilitate easy implementation.
Split data by the user identifiers of the interactions.
Split data randomly, such that
Split data so that the first return value contains interactions in
Randomly splits the users into two sets so that interactions for a user will always occur only in one split.
Split users based on the time of their most recent interactions.
Splits the n most recent interactions of a user into the second return value, and earlier interactions into the first.
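Two of the single-criterion splits described in this section can be sketched in plain Python. These functions are illustrative stand-ins, not RecPack's splitter classes: the names `timestamp_split` and `most_recent_split`, the tuple layout, and the choice to place an interaction at exactly T in the second return value are all assumptions.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Interaction = Tuple[str, str, int]  # (user, item, timestamp)


def timestamp_split(
    interactions: List[Interaction], t: int
) -> Tuple[List[Interaction], List[Interaction]]:
    """First value: interactions before t. Second value: interactions at or after t."""
    before = [x for x in interactions if x[2] < t]
    after = [x for x in interactions if x[2] >= t]
    return before, after


def most_recent_split(
    interactions: List[Interaction], n: int
) -> Tuple[List[Interaction], List[Interaction]]:
    """Second value: each user's n most recent interactions. First value: the rest."""
    by_user: Dict[str, List[Interaction]] = defaultdict(list)
    for x in interactions:
        by_user[x[0]].append(x)

    earlier: List[Interaction] = []
    recent: List[Interaction] = []
    for user_interactions in by_user.values():
        ordered = sorted(user_interactions, key=lambda x: x[2])
        cut = max(len(ordered) - n, 0)  # users with <= n interactions go entirely to `recent`
        earlier.extend(ordered[:cut])
        recent.extend(ordered[cut:])
    return earlier, recent


log = [
    ("u1", "a", 1),
    ("u1", "b", 5),
    ("u1", "c", 9),
    ("u2", "a", 2),
    ("u2", "d", 7),
]

before, after = timestamp_split(log, t=6)
earlier, recent = most_recent_split(log, n=1)
print(before, after)
print(earlier, recent)
```

A custom scenario could then be assembled by chaining such functions, for example by first separating training data with one split and then dividing the remainder into fold-in and held-out sets with another.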