recpack.pipelines.PipelineBuilder
- class recpack.pipelines.PipelineBuilder(folder_name: Optional[str] = None, base_path: Optional[str] = None)
Builder to facilitate construction of pipelines.
The builder contains functions to set specific values for the pipeline. Save and Load make it possible to easily recreate pipelines.
To disable history filtering in the pipeline, set the remove_history attribute to False:
pipeline_builder.remove_history = False
- Parameters
folder_name (str, optional) – The name of the folder where pipeline information will be stored. If no name is specified, the timestamp of creation is used.
base_path (str, optional) – The base_path to store pipeline in, defaults to the current working directory.
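The builder workflow described above can be sketched as follows. This is a minimal sketch rather than a definitive recipe. It assumes recpack is installed, that `scenario` is a Scenario whose data has already been split, and that "ItemKNN" (with a parameter K) and "NDCGK" resolve to entries in recpack's registries. The helper name `build_example_pipeline` is hypothetical and not part of recpack.

```python
def build_example_pipeline(scenario, disable_history_filter=False):
    """Assemble a pipeline from an already-split recpack scenario (sketch)."""
    # Imported lazily so that defining this helper does not itself require recpack.
    from recpack.pipelines import PipelineBuilder

    # With no folder_name, the creation timestamp is used as the folder name.
    builder = PipelineBuilder()

    # Take train, validation and test data from the scenario.
    builder.set_data_from_scenario(scenario)

    # Assumption: "ItemKNN" and "NDCGK" are names known to the registries.
    builder.add_algorithm("ItemKNN", params={"K": 100})
    builder.add_metric("NDCGK", K=[10, 20])  # one metric per K value

    if disable_history_filter:
        # Keep items the user has already interacted with in the recommendations.
        builder.remove_history = False

    # Raises an error if required fields were not set.
    return builder.build()
```

Only fixed parameters are passed here, so no optimisation metric is set. When parameters are to be optimised, set_optimisation_metric would also need to be called before build.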
Methods
add_algorithm(algorithm[, grid, params, ...]) – Add an algorithm to use in the pipeline.
add_metric(metric[, K]) – Register a metric to evaluate.
add_post_filter(filter) – Add a filter which will be applied on the recommendation scores before prediction.
build() – Construct a pipeline object, given the set values.
set_data_from_scenario(scenario) – Set the train, validation and test data by extracting them from the scenario.
set_full_training_data(train_data) – Set the full_training dataset.
set_optimisation_metric(metric, K[, minimise]) – Set the metric for optimisation of parameters in algorithms.
set_test_data(test_data) – Set the test datasets.
set_validation_data(validation_data) – Set the validation datasets.
set_validation_training_data(train_data) – Set the validation training dataset.
Attributes
remove_history – True to enable removal of a user's previous interactions, False to disable.
- add_algorithm(algorithm: Union[str, type], grid: Optional[Dict[str, List]] = None, params: Optional[Dict[str, Any]] = None, optimisation_info: Optional[recpack.pipelines.hyperparameter_optimisation.OptimisationInfo] = None)
Add an algorithm to use in the pipeline.
If the algorithm is not implemented by default in recpack, you should register it in the ALGORITHM_REGISTRY.
- Parameters
algorithm (Union[str, type]) – Algorithm class name or type of the algorithm to add.
grid (Dict[str, List], optional) – [DEPRECATED] Parameters to optimise, the dict will be turned into a grid such that each combination of values is used. Defaults to None
params (Dict[str, Any], optional) – The fixed parameters for running the algorithm, represented as a key-value dictionary. Defaults to None
optimisation_info (OptimisationInfo, optional) – Information the optimiser uses to define the parameter space. Defaults to None
- Raises
ValueError – If the passed algorithm can’t be resolved to a key in the ALGORITHM_REGISTRY.
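The grid parameter is described as a dict that is turned into a grid in which each combination of values is used. The sketch below illustrates that expansion in plain Python. It is not recpack's implementation, the helper name `expand_grid` is hypothetical, and the parameter names "K" and "normalize" are illustrative assumptions.

```python
from itertools import product
from typing import Any, Dict, List


def expand_grid(grid: Dict[str, List[Any]]) -> List[Dict[str, Any]]:
    """Return one parameter dict per combination of the listed values."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    # product() yields every combination, one value drawn from each list.
    return [dict(zip(names, values)) for values in product(*value_lists)]


# Two values for each of two parameters gives four combinations.
combinations = expand_grid({"K": [50, 100], "normalize": [True, False]})
for combo in combinations:
    print(combo)
```

Because the number of combinations is the product of the list lengths, a grid grows quickly as parameters are added.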
- add_metric(metric: Union[str, type], K: Optional[Union[List, int]] = None)
Register a metric to evaluate.
- Parameters
metric (Union[str, type]) – Metric name or type.
K (Optional[Union[List, int]], optional) – The K value(s) used to construct metrics. If it is a list, for each value a metric is added.
- Raises
ValueError – If metric can’t be resolved to a key in the METRIC_REGISTRY.
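When K is a list, one metric is added per value. The following plain-Python sketch shows that expansion under stated assumptions: the helper `expand_metric` is hypothetical, the (name, K) tuples merely stand in for whatever recpack stores internally, and treating a missing K as a single metric without a cutoff is an assumption.

```python
from typing import List, Optional, Tuple, Union


def expand_metric(
    name: str, K: Optional[Union[List[int], int]] = None
) -> List[Tuple[str, Optional[int]]]:
    """Return one (metric name, K) entry per requested K value."""
    if K is None:
        # Assumption: without K, a single metric without a cutoff is registered.
        return [(name, None)]
    k_values = K if isinstance(K, list) else [K]
    return [(name, k) for k in k_values]


entries = expand_metric("NDCGK", [10, 20, 50])
print(entries)  # [('NDCGK', 10), ('NDCGK', 20), ('NDCGK', 50)]
```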
- add_post_filter(filter: recpack.postprocessing.filters.PostFilter) → None
Add a filter which will be applied on the recommendation scores before prediction.
- Parameters
filter (PostFilter) – Filter to apply, cannot be of type RemoveHistory
- build() → recpack.pipelines.pipeline.Pipeline
Construct a pipeline object, given the set values.
If required fields are not set, raises an error.
- Returns
The constructed pipeline.
- Return type
Pipeline
- property remove_history
True to enable removal of a user’s previous interactions, False to disable. Defaults to True.
- set_data_from_scenario(scenario: recpack.scenarios.scenario_base.Scenario)
Set the train, validation and test data by extracting them from the scenario.
- set_full_training_data(train_data: recpack.matrix.interaction_matrix.InteractionMatrix)
Set the full_training dataset. This dataset is used for the final training before evaluation on the test dataset.
- Parameters
train_data (InteractionMatrix) – The interaction matrix to use for training.
- set_optimisation_metric(metric: Union[str, type], K: int, minimise=False)
Set the metric for optimisation of parameters in algorithms.
If the metric is not implemented by default in recpack, you should register it in the METRIC_REGISTRY.
- Parameters
metric (Union[str, type]) – metric name or metric type
K (int) – The K value for the metric
minimise (bool, optional) – If True, a lower value of the metric is better. Defaults to False
- Raises
ValueError – If metric can’t be resolved to a key in the METRIC_REGISTRY.
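The effect of the minimise flag can be illustrated with a small selection routine. This is a sketch of the idea only, not recpack's optimiser: the helper `select_best` and the result values are made up for illustration.

```python
from typing import Any, Dict, List, Tuple


def select_best(
    results: List[Tuple[Dict[str, Any], float]], minimise: bool = False
) -> Dict[str, Any]:
    """Pick the parameter set with the best metric value."""
    # With minimise=True a lower value is better, otherwise a higher one.
    chooser = min if minimise else max
    best_params, _ = chooser(results, key=lambda result: result[1])
    return best_params


# Illustrative (parameters, metric value) pairs, not real measurements.
results = [({"K": 50}, 0.21), ({"K": 100}, 0.27), ({"K": 200}, 0.24)]
best_when_maximising = select_best(results)
best_when_minimising = select_best(results, minimise=True)
print(best_when_maximising, best_when_minimising)
```

Ranking metrics where a higher score is better fit the default; an error-style metric would call for minimise=True.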
- set_test_data(test_data: Tuple[recpack.matrix.interaction_matrix.InteractionMatrix, recpack.matrix.interaction_matrix.InteractionMatrix])
Set the test datasets.
Test data should be a tuple of InteractionMatrices.
- Parameters
test_data (Tuple[InteractionMatrix, InteractionMatrix]) – The tuple of test data, as (test_in, test_out) tuple.
- Raises
ValueError – If tuple does not contain two InteractionMatrices.
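The check documented for set_test_data, and likewise for set_validation_data, can be pictured as below. This is a hedged sketch: `InteractionMatrixStandIn` is a placeholder class so the example runs without recpack, and `check_data_tuple` is a hypothetical helper, not the library's own validation code.

```python
class InteractionMatrixStandIn:
    """Placeholder for recpack's InteractionMatrix, used only in this sketch."""


def check_data_tuple(data):
    """Return data unchanged if it is an (in, out) pair of matrices."""
    is_pair = isinstance(data, tuple) and len(data) == 2
    if not is_pair or not all(
        isinstance(part, InteractionMatrixStandIn) for part in data
    ):
        raise ValueError("Expected a tuple of two InteractionMatrices.")
    return data


# A valid (test_in, test_out) pair passes through unchanged.
test_in, test_out = InteractionMatrixStandIn(), InteractionMatrixStandIn()
checked = check_data_tuple((test_in, test_out))
```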
- set_validation_data(validation_data: Tuple[recpack.matrix.interaction_matrix.InteractionMatrix, recpack.matrix.interaction_matrix.InteractionMatrix])
Set the validation datasets.
Validation data should be a tuple of InteractionMatrices.
- Parameters
validation_data (Tuple[InteractionMatrix, InteractionMatrix]) – The tuple of validation data, as (validation_in, validation_out) tuple.
- Raises
ValueError – If tuple does not contain two InteractionMatrices.
- set_validation_training_data(train_data: recpack.matrix.interaction_matrix.InteractionMatrix)
Set the validation training dataset. This dataset is used for training models during parameter optimisation, or for incrementally trained models.
- Parameters
train_data (InteractionMatrix) – The interaction matrix to use for training.