recpack.algorithms.ItemKNN
- class recpack.algorithms.ItemKNN(K=200, similarity: str = 'cosine', pop_discount: Optional[float] = None, normalize_X: bool = False, normalize_sim: bool = False)
Item K Nearest Neighbours model.
First described in ‘Item-based top-n recommendation algorithms.’ Deshpande, Mukund, and George Karypis, ACM Transactions on Information Systems (TOIS) 22.1 (2004): 143-177
For each item, the K most similar items are computed during fit. The similarity parameter determines how the similarity between two items is computed. Supported options are "cosine" and "conditional_probability".
Cosine similarity between item i and j is computed as
\[sim(i,j) = \frac{X_i X_j}{||X_i||_2 ||X_j||_2}\]
The conditional probability based similarity of item i with j is computed as
\[sim(i,j) = \frac{\sum\limits_{u \in U} \mathbb{I}_{u,i} X_{u,j}}{Freq(i) \times Freq(j)^{\alpha}}\]
where \(\mathbb{I}_{u,i}\) is 1 if user u has visited item i and 0 otherwise, and \(\alpha\) is the pop_discount parameter. Note that this is a non-symmetric similarity measure. If X is a binary matrix and \(\alpha\) is set to 0, this simplifies to pure conditional probability:
\[sim(i,j) = \frac{Freq(i \land j)}{Freq(i)}\]
If normalize_sim is True, the scores are normalized per predictive item, so that each row of the similarity matrix sums to 1.
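The formulas above can be worked through on a small dense matrix. The following is a minimal NumPy sketch and not the recpack implementation: the function names are hypothetical, the toy matrix is made up for illustration, and details such as neighbour pruning or sparse storage are left out.

```python
import numpy as np

# Toy binary interaction matrix: 4 users (rows) x 3 items (columns).
X = np.array([
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
    [0, 1, 0],
], dtype=float)


def cosine_similarity(X):
    """sim(i, j) = X_i . X_j / (||X_i||_2 ||X_j||_2), computed per item column."""
    norms = np.linalg.norm(X, axis=0)
    norms[norms == 0] = 1.0  # guard against items without interactions
    return (X.T @ X) / np.outer(norms, norms)


def conditional_probability_similarity(X, pop_discount=0.0):
    """sim(i, j) = sum_u I[u, i] * X[u, j] / (Freq(i) * Freq(j) ** alpha)."""
    indicator = (X > 0).astype(float)
    freq = indicator.sum(axis=0)  # assumption: Freq(i) = number of users of item i
    freq[freq == 0] = 1.0
    co_counts = indicator.T @ X
    return co_counts / np.outer(freq, freq ** pop_discount)


def normalize_rows(sim):
    """Scale each row so it sums to 1 (rows that are all zero stay zero)."""
    row_sums = sim.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0
    return sim / row_sums


cos = cosine_similarity(X)
cond = conditional_probability_similarity(X, pop_discount=0.0)
```

With pop_discount set to 0, `cond[0, 2]` and `cond[2, 0]` differ, which illustrates the asymmetry of the conditional probability measure, whereas `cos` is symmetric.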
- Parameters
K (int, optional) – How many neighbours to use per item. Make sure to pick a value below the number of columns of the matrix to fit on. Defaults to 200
similarity (str, optional) – Which similarity measure to use, can be one of [“cosine”, “conditional_probability”], defaults to “cosine”
pop_discount (float, optional) – Power applied to the comparing item in the denominator, to discount contributions of very popular items. Should be between 0 and 1. If None, apply no discounting. Defaults to None.
normalize_X (bool, optional) – Normalize rows in the interaction matrix so that the contribution of users who have viewed more items is smaller, defaults to False
normalize_sim (bool, optional) – Normalize scores per row in the similarity matrix to counteract artificially large similarity scores when the predictive item is rare, defaults to False.
- Raises
ValueError – If an unsupported similarity measure is passed.
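The K parameter limits each item to its K most similar neighbours. As a rough illustration, the sketch below prunes a dense similarity matrix row by row; `keep_top_k_per_row` is a hypothetical helper written for this example and is not part of recpack, which operates on sparse matrices.

```python
import numpy as np


def keep_top_k_per_row(sim, k):
    """Return a copy of `sim` with all but the k largest entries per row set to 0."""
    n_items = sim.shape[1]
    if k >= n_items:
        return sim.copy()
    pruned = np.zeros_like(sim)
    # After argpartition, the last k positions hold the indices of the k largest values.
    top_idx = np.argpartition(sim, n_items - k, axis=1)[:, -k:]
    rows = np.arange(sim.shape[0])[:, None]
    pruned[rows, top_idx] = sim[rows, top_idx]
    return pruned


sim = np.array([
    [0.0, 0.9, 0.1, 0.5],
    [0.9, 0.0, 0.3, 0.2],
    [0.1, 0.3, 0.0, 0.7],
    [0.5, 0.2, 0.7, 0.0],
])
pruned = keep_top_k_per_row(sim, k=2)
```

Choosing K at or above the number of items leaves nothing to prune, which is one reason to pick a value below the number of columns.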
Methods
fit(X) – Fit the model to the input interaction matrix.
get_metadata_routing() – Get metadata routing of this object.
get_params([deep]) – Get parameters for this estimator.
predict(X) – Predicts scores, given the interactions in X
set_params(**params) – Set the parameters of the estimator.
Attributes
SUPPORTED_SIMILARITIES – The supported similarity options
identifier – Name of the object.
name – Name of the object's class.
- SUPPORTED_SIMILARITIES = ['cosine', 'conditional_probability']
The supported similarity options
- fit(X: Union[recpack.matrix.interaction_matrix.InteractionMatrix, scipy.sparse._csr.csr_matrix]) → recpack.algorithms.base.Algorithm
Fit the model to the input interaction matrix.
After fitting the model will be ready to use for prediction.
This function handles some generic bookkeeping for each of the child classes:
- The fit function is timed, and the duration is printed.
- Input data is converted to the expected type using a call to _transform_predict_input().
- The model is trained using the _fit() method.
- _check_fit_complete() is called to check that fitting was successful.
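The bookkeeping steps above follow a template-method pattern: a public fit wraps the model-specific _fit. The sketch below illustrates that pattern only. `SketchAlgorithm` is a hypothetical stand-in, and the bodies of the helper methods are assumptions made for the example rather than the recpack code.

```python
import time


class SketchAlgorithm:
    """Hypothetical stand-in for the base class; not the recpack implementation."""

    def _transform_predict_input(self, X):
        # Assumption: convert the input to the type the model expects.
        return [list(row) for row in X]

    def _fit(self, X):
        # Toy "training": count the interactions per item (column).
        self.item_counts_ = [sum(col) for col in zip(*X)]

    def _check_fit_complete(self):
        # Assumption: a successful fit leaves a fitted attribute behind.
        if not hasattr(self, "item_counts_"):
            raise RuntimeError("fit did not set any fitted attributes")

    def fit(self, X):
        start = time.time()
        X = self._transform_predict_input(X)
        self._fit(X)
        self._check_fit_complete()
        duration = time.time() - start
        print(f"Fitting {type(self).__name__} took {duration:.3f}s")
        return self  # returning self allows chaining, e.g. model.fit(X).predict(X)


model = SketchAlgorithm().fit([(1, 0), (1, 1)])
```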
- Parameters
X (Matrix) – The interactions to fit the model on.
- Returns
self, fitted algorithm
- Return type
recpack.algorithms.base.Algorithm
- get_metadata_routing()
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns
routing – A MetadataRequest encapsulating routing information.
- Return type
MetadataRequest
- get_params(deep=True)
Get parameters for this estimator.
- Parameters
deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns
params – Parameter names mapped to their values.
- Return type
dict
- property identifier
Name of the object.
Name is made by combining the class name with the parameters passed at construction time.
Constructed by recreating the initialisation call. Example:
Algorithm(param_1=value)
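One way to build such an identifier is to join the constructor parameters into a string that mimics the initialisation call. The class below is a hypothetical illustration; the exact separator and value formatting used by recpack may differ.

```python
class SketchAlgorithm:
    """Hypothetical class showing how name and identifier could be derived."""

    def __init__(self, K=200, similarity="cosine"):
        self.K = K
        self.similarity = similarity

    def get_params(self):
        # Parameters passed at construction time, mapped by name.
        return {"K": self.K, "similarity": self.similarity}

    @property
    def name(self):
        return type(self).__name__

    @property
    def identifier(self):
        params = ",".join(f"{key}={value}" for key, value in self.get_params().items())
        return f"{self.name}({params})"


algo = SketchAlgorithm(K=50)
identifier = algo.identifier
```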
- property name
Name of the object’s class.
- predict(X: Union[recpack.matrix.interaction_matrix.InteractionMatrix, scipy.sparse._csr.csr_matrix]) → scipy.sparse._csr.csr_matrix
Predicts scores, given the interactions in X
Recommends items for each nonzero user in the X matrix.
This function is a wrapper around the _predict() method, and performs checks on in- and output data to guarantee proper computation:
- Checks that the model is fitted correctly.
- Checks the output using the _check_prediction() function.
- Parameters
X (Matrix) – interactions to predict from.
- Returns
The recommendation scores in a sparse matrix format.
- Return type
csr_matrix
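For an item-based neighbourhood model, scoring is commonly a product of the interaction matrix with the item-item similarity matrix. The sketch below assumes that this is what happens inside _predict(), which the text above does not confirm, and uses dense NumPy arrays with made-up similarity values instead of the csr_matrix the method returns.

```python
import numpy as np

# Assumed item-item similarities learned during fit (3 items).
similarity = np.array([
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.3],
    [0.1, 0.3, 0.0],
])

# Interaction histories to predict from: 3 users x 3 items.
X = np.array([
    [1, 0, 0],  # user 0 interacted with item 0
    [0, 0, 0],  # user 1 has no history, so all scores stay 0
    [0, 1, 1],  # user 2 interacted with items 1 and 2
])

# Each user's score for item j sums the similarities of the items in their history to j.
scores = X @ similarity
```

Users without interactions receive an all-zero row, which matches the statement that items are recommended only for nonzero users.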
- set_params(**params)
Set the parameters of the estimator.
- Parameters
params (dict) – Estimator parameters