recpack.algorithms.ItemPNN

class recpack.algorithms.ItemPNN(K=200, similarity: str = 'cosine', pop_discount: Optional[float] = None, normalize_X: bool = False, normalize_sim: bool = False, pdf: str = 'empirical', seed: Optional[int] = None)

Item Probabilistic Nearest Neighbours model.

First described in Panagiotis Adamopoulos and Alexander Tuzhilin. 2014. ‘On over-specialization and concentration bias of recommendations: probabilistic neighborhood selection in collaborative filtering systems’. In Proceedings of the 8th ACM Conference on Recommender Systems (RecSys ’14). Association for Computing Machinery, New York, NY, USA, 153–160. DOI: https://doi.org/10.1145/2645710.2645752

For each item, K neighbours are selected either uniformly or based on the empirical distribution of the items (or a softmax thereof). The similarity parameter decides how the similarity between two items is computed. Supported options are "cosine" and "conditional_probability":

  • Cosine similarity between items i and j is computed as count(i and j) / sqrt(count(i) * count(j)).

  • Conditional probability of item i with j is computed as count(i and j) / count(i). Note that this is a non-symmetric similarity measure.

If normalize_sim is True, the scores are normalized per predictive item, so that each row in the similarity matrix sums to 1.
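The two similarity measures and the row normalization can be illustrated on toy data. This is a minimal pure-Python sketch, not the library's implementation: the helper names (count, cosine, conditional_probability, similarity_row, normalize_row) and the toy users are made up for illustration, and the cosine uses the standard definition for binary vectors, with the square root of the product of the counts in the denominator.

```python
from math import sqrt

# Toy data (hypothetical): each set holds the items one user interacted with.
users = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b", "c"}]
items = ["a", "b", "c"]


def count(*wanted):
    """Number of users who interacted with every item in `wanted`."""
    return sum(1 for seen in users if all(item in seen for item in wanted))


def cosine(i, j):
    # Symmetric: swapping i and j gives the same value.
    return count(i, j) / sqrt(count(i) * count(j))


def conditional_probability(i, j):
    # Not symmetric: the denominator depends only on i.
    return count(i, j) / count(i)


def similarity_row(i, measure):
    """Similarities from predictive item i to every other item."""
    return {j: measure(i, j) for j in items if j != i}


def normalize_row(row):
    """Rescale one row so its scores sum to 1 (the effect of normalize_sim=True)."""
    total = sum(row.values())
    return {j: score / total for j, score in row.items()} if total else row


row_c = similarity_row("c", conditional_probability)
print(row_c)  # {'a': 0.5, 'b': 1.0}
print(normalize_row(row_c))
```

Item "c" is rare in this data, so its unnormalized row carries comparatively large scores; after normalization the row sums to 1.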

Parameters
  • K (int, optional) – How many neighbours to use per item. Make sure to pick a value below the number of columns of the matrix to fit on. Defaults to 200.

  • similarity (str, optional) – Which similarity measure to use, can be one of [“cosine”, “conditional_probability”], defaults to “cosine”

  • pop_discount (float, optional) – Power applied to the comparing item in the denominator, to discount contributions of very popular items. Should be between 0 and 1. If None, apply no discounting. Defaults to None.

  • normalize_X (bool, optional) – Normalize rows in the interaction matrix so that the contribution of users who have viewed more items is smaller, defaults to False

  • normalize_sim (bool, optional) – Normalize scores per row in the similarity matrix to counteract artificially large similarity scores when the predictive item is rare, defaults to False.

  • pdf (str, optional) – Which probability distribution to use, can be one of [“empirical”, “uniform”, “softmax_empirical”], defaults to “empirical”

  • seed (int, optional) – Seed to the randomizers, useful for reproducible results, defaults to None
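The roles of K, pdf and seed can be sketched as weighted sampling without replacement. The function below is hypothetical and only illustrates the idea. It assumes that "empirical" weights each candidate by its similarity score, that "softmax_empirical" weights by the exponential of that score, and that only candidates with a positive similarity are eligible; the library may differ on any of these points.

```python
import math
import random


def sample_neighbours(similarities, k, pdf="empirical", seed=None):
    """Draw up to k distinct neighbours for one item (illustrative only).

    similarities maps each candidate item to a non-negative similarity score.
    """
    candidates = [j for j, score in similarities.items() if score > 0]
    scores = [similarities[j] for j in candidates]

    if pdf == "uniform":
        weights = [1.0] * len(candidates)
    elif pdf == "empirical":
        weights = scores  # assumption: probability proportional to similarity
    elif pdf == "softmax_empirical":
        weights = [math.exp(score) for score in scores]
    else:
        raise ValueError(f"Unsupported probability distribution: {pdf}")

    rng = random.Random(seed)  # a fixed seed makes the draw reproducible
    chosen = []
    # Sample without replacement: remove each pick before drawing the next.
    for _ in range(min(k, len(candidates))):
        pick = rng.choices(range(len(candidates)), weights=weights)[0]
        chosen.append(candidates.pop(pick))
        weights.pop(pick)
    return chosen


sims = {"b": 0.6, "c": 0.3, "d": 0.1}
print(sample_neighbours(sims, k=2, pdf="empirical", seed=42))
```

Calling the function twice with the same seed returns the same neighbours, which is what the seed parameter is for.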

Raises

ValueError – If an unsupported similarity measure or probability distribution is passed.

Methods

fit(X)

Fit the model to the input interaction matrix.

get_metadata_routing()

Get metadata routing of this object.

get_params([deep])

Get parameters for this estimator.

predict(X)

Predicts scores, given the interactions in X

set_params(**params)

Set the parameters of the estimator.

Attributes

SUPPORTED_SAMPLING_FUNCTIONS

The supported sampling functions (accepted values for pdf)

SUPPORTED_SIMILARITIES

The supported similarity options

identifier

Name of the object.

name

Name of the object's class.

SUPPORTED_SAMPLING_FUNCTIONS = ['empirical', 'uniform', 'softmax_empirical']

The supported sampling functions (accepted values for pdf)

fit(X: Union[recpack.matrix.interaction_matrix.InteractionMatrix, scipy.sparse._csr.csr_matrix]) recpack.algorithms.base.Algorithm

Fit the model to the input interaction matrix.

After fitting the model will be ready to use for prediction.

This function handles some generic bookkeeping for each of the child classes:

  • The fit function is timed, and the elapsed time is printed

  • Input data is converted to the expected type using a call to _transform_fit_input()

  • The model is trained using the _fit() method

  • _check_fit_complete() is called to check that fitting was successful

Parameters

X (Matrix) – The interactions to fit the model on.

Returns

self, fitted algorithm

Return type

Algorithm
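The bookkeeping that fit() performs around the child class's _fit() follows a template-method pattern. The sketch below shows that pattern with a hypothetical stand-in class; SketchAlgorithm, ItemCounts, _transform_input and the model_ attribute are illustrative names, not the library's code.

```python
import time


class SketchAlgorithm:
    """Hypothetical stand-in for the base class, for illustration only."""

    def fit(self, X):
        start = time.time()
        X = self._transform_input(X)  # convert input to the expected type
        self._fit(X)                  # training logic supplied by the child class
        self._check_fit_complete()    # verify that training produced a model
        print(f"Fitting {type(self).__name__} took {time.time() - start:.3f}s")
        return self                   # returning self allows chaining

    def _transform_input(self, X):
        return [list(row) for row in X]

    def _fit(self, X):
        raise NotImplementedError("child classes implement _fit")

    def _check_fit_complete(self):
        if not hasattr(self, "model_"):
            raise RuntimeError("fit did not produce a model")


class ItemCounts(SketchAlgorithm):
    """Toy child class: the 'model' is the interaction count per item."""

    def _fit(self, X):
        self.model_ = [sum(column) for column in zip(*X)]


algo = ItemCounts().fit([(1, 0), (1, 1)])
print(algo.model_)  # [2, 1]
```

Because the wrapper owns the timing and the checks, a child class only needs to implement _fit().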

get_metadata_routing()

Get metadata routing of this object.

Please check the User Guide on how the routing mechanism works.

Returns

routing – A MetadataRequest encapsulating routing information.

Return type

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns

params – Parameter names mapped to their values.

Return type

dict

property identifier

Name of the object.

Name is made by combining the class name with the parameters passed at construction time.

Constructed by recreating the initialisation call. Example: Algorithm(param_1=value)

property name

Name of the object’s class.
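How name and identifier relate can be sketched with a small hypothetical class that stores its constructor parameters. The Named class and the comma separator between several parameters are assumptions; only the Algorithm(param_1=value) shape comes from the description above.

```python
class Named:
    """Hypothetical helper that rebuilds its own initialisation call."""

    def __init__(self, **params):
        self.params = params

    @property
    def name(self):
        # Name of the object's class.
        return type(self).__name__

    @property
    def identifier(self):
        # Class name combined with the parameters passed at construction time.
        args = ",".join(f"{key}={value}" for key, value in self.params.items())
        return f"{self.name}({args})"


class Algorithm(Named):
    pass


print(Algorithm(param_1="value").identifier)  # Algorithm(param_1=value)
```

Since the identifier encodes the parameters, two instances configured differently get different identifiers, while name stays the same.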

predict(X: Union[recpack.matrix.interaction_matrix.InteractionMatrix, scipy.sparse._csr.csr_matrix]) scipy.sparse._csr.csr_matrix

Predicts scores, given the interactions in X

Recommends items for each user with at least one interaction in X (each nonzero row).

This function is a wrapper around the _predict() method, and performs checks on in- and output data to guarantee proper computation.

  • Checks that the model has been fitted correctly

  • Checks the output using the _check_prediction() function

Parameters

X (Matrix) – The interactions to predict from.

Returns

The recommendation scores in a sparse matrix format.

Return type

csr_matrix

set_params(**params)

Set the parameters of the estimator.

Parameters

params (dict) – Estimator parameters