recpack.scenarios

The scenarios module contains many of the most commonly encountered evaluation scenarios in recommendation.

A scenario consists of a training and a test dataset, and sometimes also a validation dataset. Both the validation and the test dataset are made up of two components: a fold-in set of interactions, which is used as input to predict a second, held-out set of interactions.

Each scenario describes a complex situation, e.g. “Train on all user interactions before time T, predict interactions after T+10 using interactions from T+5 until T+10”.
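
As a rough illustration of the quoted example, the sketch below applies the three time windows to a plain Python list of interactions. It does not use recpack: the `Interaction` tuple and the `timed_split` function are hypothetical names, and the half-open boundaries ("before T" meaning `timestamp < T`) are an assumption of this sketch.

```python
from typing import List, NamedTuple, Tuple


class Interaction(NamedTuple):
    """Hypothetical record type, not part of recpack."""
    user: int
    item: int
    timestamp: int


def timed_split(
    interactions: List[Interaction], t: int
) -> Tuple[List[Interaction], List[Interaction], List[Interaction]]:
    """Return (train, fold_in, held_out) for the example scenario.

    Assumption: all windows are half-open, so "before T" is ts < T and
    "from T+5 until T+10" is T+5 <= ts < T+10.
    """
    train = [x for x in interactions if x.timestamp < t]
    fold_in = [x for x in interactions if t + 5 <= x.timestamp < t + 10]
    held_out = [x for x in interactions if x.timestamp >= t + 10]
    return train, fold_in, held_out


log = [
    Interaction(user=1, item=10, timestamp=3),
    Interaction(user=1, item=11, timestamp=17),
    Interaction(user=1, item=12, timestamp=22),
    Interaction(user=2, item=10, timestamp=12),
    Interaction(user=2, item=13, timestamp=16),
    Interaction(user=2, item=14, timestamp=25),
]
train, fold_in, held_out = timed_split(log, t=10)
```

Note that the interaction at timestamp 12 falls in the gap between T and T+5, so under these assumptions it ends up in none of the three sets.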

`Scenario`([validation, seed])	Base class for defining an evaluation scenario.
`Timed`(t[, t_validation, delta_out, ...])	Predict users' future interactions, given information about historical interactions.
`LastItemPrediction`([validation, seed, ...])	Predict a user's next interaction.
`TimedLastItemPrediction`(t[, t_validation, ...])	Predict users’ last interaction, given information about historical interactions.
`WeakGeneralization`([frac_data_in, ...])	Predict (randomly) held-out interactions for all users, with remaining data used for training.
`StrongGeneralization`([frac_users_train, ...])	Predict (randomly) held-out interactions of previously unseen users.
`StrongGeneralizationTimed`(frac_users_in, t)	Predict future interactions for previously unseen users.
`StrongGeneralizationTimedMostRecent`(t[, ...])	Predict the next interaction(s) for previously unseen users.

A scenario is stateful. Its parameters are passed at initialization. Splits become available only after calling Scenario.split with a recpack.matrix.InteractionMatrix; they can then be retrieved from Scenario.full_training_data, Scenario.validation_training_data, Scenario.validation_data and Scenario.test_data.
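
The configure-then-split pattern can be sketched with a toy class. `ToyScenario` below is a hypothetical stand-in written in plain Python, not the recpack implementation. It borrows the attribute names `full_training_data` and `test_data` from the text above, and it assumes that the training data doubles as the fold-in set and that accessing a split too early raises an error.

```python
import random
from typing import List, Optional, Tuple

Pair = Tuple[int, int]  # (user, item)


class ToyScenario:
    """Hypothetical stand-in for a scenario: configure first, split later."""

    def __init__(self, frac_data_in: float = 0.8, seed: int = 42) -> None:
        # Parameters are fixed at initialization; no data has been seen yet.
        self.frac_data_in = frac_data_in
        self.seed = seed
        self._train: Optional[List[Pair]] = None
        self._held_out: Optional[List[Pair]] = None

    def split(self, interactions: List[Pair]) -> None:
        # Shuffle a copy deterministically, then cut at the requested fraction.
        shuffled = list(interactions)
        random.Random(self.seed).shuffle(shuffled)
        cut = int(len(shuffled) * self.frac_data_in)
        self._train, self._held_out = shuffled[:cut], shuffled[cut:]

    @property
    def full_training_data(self) -> List[Pair]:
        if self._train is None:
            raise RuntimeError("Call split() before accessing the splits.")
        return self._train

    @property
    def test_data(self) -> Tuple[List[Pair], List[Pair]]:
        # Returned as (fold-in, held-out). Assumption of this sketch: the
        # training interactions are reused as the fold-in set.
        if self._train is None or self._held_out is None:
            raise RuntimeError("Call split() before accessing the splits.")
        return self._train, self._held_out


scenario = ToyScenario(frac_data_in=0.75)
data = [(u, i) for u in range(4) for i in range(2)]  # 8 interactions
scenario.split(data)
fold_in, held_out = scenario.test_data
```

Keeping the parameters on the object and the data out of the constructor means the same configured scenario can be described, logged or reused before any dataset is loaded.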

Splitters

Splitters are the building blocks for the scenarios. A splitter performs a simple split into two InteractionMatrices according to a single criterion, e.g. “Fold-in is all interactions before T, held-out is all interactions after T”.

Should you want to implement a scenario that is not yet supported, you can build it by combining these splitters.
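
To show how two single-criterion splits can be chained into a scenario, the sketch below builds a strong-generalization style split from a user split followed by a random interaction split. Everything here is plain Python on `(user, item)` tuples: the function names and the `frac_interactions_in` parameter are chosen for illustration, and splitting the test users' interactions globally rather than per user is an assumption of this sketch, not a statement about recpack's behaviour.

```python
import random
from typing import List, Set, Tuple

Pair = Tuple[int, int]  # (user, item)


def split_by_users(
    data: List[Pair], users_in: Set[int], users_out: Set[int]
) -> Tuple[List[Pair], List[Pair]]:
    """Single criterion: which user an interaction belongs to."""
    first = [x for x in data if x[0] in users_in]
    second = [x for x in data if x[0] in users_out]
    return first, second


def split_by_fraction(
    data: List[Pair], in_frac: float, seed: int
) -> Tuple[List[Pair], List[Pair]]:
    """Single criterion: a random fraction of the interactions."""
    shuffled = list(data)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * in_frac)
    return shuffled[:cut], shuffled[cut:]


def strong_generalization(
    data: List[Pair],
    frac_users_train: float,
    frac_interactions_in: float,
    seed: int = 0,
) -> Tuple[List[Pair], List[Pair], List[Pair]]:
    """Compose the two splits: users first, then interactions of test users."""
    users = sorted({u for u, _ in data})
    random.Random(seed).shuffle(users)
    cut = int(len(users) * frac_users_train)
    train_users, test_users = set(users[:cut]), set(users[cut:])
    train, test = split_by_users(data, train_users, test_users)
    fold_in, held_out = split_by_fraction(test, frac_interactions_in, seed)
    return train, fold_in, held_out


data = [(u, i) for u in range(5) for i in range(4)]  # 5 users x 4 items
train, fold_in, held_out = strong_generalization(
    data, frac_users_train=0.6, frac_interactions_in=0.5
)
```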

`UserSplitter`(users_in, users_out)	Split data by the user identifiers of the interactions.
`FractionInteractionSplitter`(in_frac[, seed])	Split data randomly, such that a fraction `in_frac` of interactions is assigned to the first return value and the remainder to the second.
`TimestampSplitter`(t[, delta_out, delta_in])	Split data so that the first return value contains interactions in `[t-delta_in, t[`, and the second those in `[t, t+delta_out[`.
`StrongGeneralizationSplitter`([in_frac, ...])	Randomly splits the users into two sets so that interactions for a user will always occur only in one split.
`UserInteractionTimeSplitter`(t)	Split users based on the time of their most recent interactions.
`MostRecentSplitter`(n)	Splits the n most recent interactions of a user into the second return value, and earlier interactions into the first.
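
The window and most-recent criteria in the table can be made concrete with a small plain-Python sketch. The functions below are hypothetical and operate on `(user, item, timestamp)` tuples rather than an InteractionMatrix. Two assumptions are made: a delta of `None` leaves that side of the window unbounded, and a user with `n` or fewer interactions has all of them placed in the second return value.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Tuple

Event = Tuple[int, int, int]  # (user, item, timestamp)


def timestamp_window_split(
    data: List[Event],
    t: int,
    delta_out: Optional[int] = None,
    delta_in: Optional[int] = None,
) -> Tuple[List[Event], List[Event]]:
    """First: t - delta_in <= ts < t. Second: t <= ts < t + delta_out."""
    # Assumption: None means no bound on that side of the window.
    lower = float("-inf") if delta_in is None else t - delta_in
    upper = float("inf") if delta_out is None else t + delta_out
    first = [x for x in data if lower <= x[2] < t]
    second = [x for x in data if t <= x[2] < upper]
    return first, second


def most_recent_split(
    data: List[Event], n: int
) -> Tuple[List[Event], List[Event]]:
    """Per user, put the n latest events in the second list, the rest in the first."""
    by_user: Dict[int, List[Event]] = defaultdict(list)
    for event in data:
        by_user[event[0]].append(event)
    first: List[Event] = []
    second: List[Event] = []
    for events in by_user.values():
        events.sort(key=lambda e: e[2])
        # Explicit cut index, since events[-0:] would return the whole list.
        cut = max(len(events) - n, 0)
        first.extend(events[:cut])
        second.extend(events[cut:])
    return first, second


events = [
    (1, 10, 1), (1, 11, 5), (1, 12, 9),
    (2, 10, 4), (2, 13, 5), (2, 14, 8),
]
window_in, window_out = timestamp_window_split(events, t=5, delta_out=4, delta_in=2)
earlier, latest = most_recent_split(events, n=1)
```

With `t=5`, `delta_in=2` and `delta_out=4`, the event at timestamp 9 falls outside the second window because the upper bound is exclusive, and the event at timestamp 1 falls outside the first window.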