recpack.metrics

The metrics module in recpack contains a large number of metrics commonly used to evaluate recommendation algorithms.

All metrics assume that we have access to a set y_true of true user interactions that we are trying to predict, and a set of recommendation scores y_pred. We can then evaluate how well the algorithm predicted the interactions in y_true.

Most metrics are “Top-K Metrics”: they consider only the Top-K best scoring item predictions, as the number of recommendations that can be shown in a realistic setting is limited.
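As an illustration of what "Top-K" selection means, here is a minimal NumPy sketch. The helper name top_k_items and the dense score array are assumptions made for this example; they are not part of recpack's API.

```python
import numpy as np

def top_k_items(scores, k):
    """Indices of the k highest-scoring items per user, best first (hypothetical helper)."""
    # Sorting the negated scores yields descending order; keep the first k columns.
    return np.argsort(-scores, axis=1, kind="stable")[:, :k]

# Hypothetical y_pred: rows are users, columns are items.
y_pred = np.array([
    [0.1, 0.9, 0.3, 0.7, 0.0],
    [0.5, 0.2, 0.8, 0.1, 0.6],
])
top2 = top_k_items(y_pred, k=2)
print(top2.tolist())  # [[1, 3], [2, 4]]
```

Every Top-K metric then looks only at these K item indices per user and ignores the remaining scores.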

Global Metrics

A global metric reports only a single value, computed over all users at once.

CoverageK(K)

Fraction of all items that are ranked among the Top-K recommendations for any user.
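The definition above can be sketched in NumPy as follows. This is an illustration under stated assumptions (dense scores, every user receives exactly K recommendations, hypothetical function name coverage_at_k), not recpack's implementation.

```python
import numpy as np

def coverage_at_k(scores, k):
    """Fraction of all items that appear in at least one user's Top-K (hypothetical helper)."""
    top_k = np.argsort(-scores, axis=1, kind="stable")[:, :k]
    n_items = scores.shape[1]
    # np.unique collects the distinct item indices recommended to any user.
    return np.unique(top_k).size / n_items

# Hypothetical scores: 3 users, 4 items.
y_pred = np.array([
    [0.9, 0.8, 0.1, 0.0],
    [0.7, 0.9, 0.2, 0.1],
    [0.6, 0.1, 0.5, 0.0],
])
coverage = coverage_at_k(y_pred, k=2)
print(coverage)  # 0.75: items 0, 1 and 2 are recommended, item 3 never is
```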

PercentileRanking()

Expected Percentile Ranking.

Listwise Metrics

A listwise metric reports one metric value for every user. To obtain a global metric value, these per-user scores are averaged.

DCGK(K)

Computes the sum of discounted gains of all items in a Top-K recommendation list.
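A minimal sketch of this sum for a single user, assuming binary gains (1 for a true interaction, 0 otherwise) and the common log2(rank + 1) discount. Both assumptions and the name dcg_at_k are illustrative; this page does not state the exact formula recpack uses.

```python
import numpy as np

def dcg_at_k(y_true_row, y_pred_row, k):
    """Discounted cumulative gain of one user's Top-K list (hypothetical helper)."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    gains = y_true_row[top_k]
    # Rank 1 is divided by log2(2) = 1, rank 2 by log2(3), and so on.
    discounts = np.log2(np.arange(2, len(top_k) + 2))
    return float(np.sum(gains / discounts))

y_true = np.array([0, 1, 0, 1, 0])
y_pred = np.array([0.1, 0.9, 0.3, 0.7, 0.0])
# Top-3 is items 1, 3, 2 with gains 1, 1, 0, so DCG = 1 + 1/log2(3).
dcg = dcg_at_k(y_true, y_pred, k=3)
```

Hits placed near the top of the list contribute more than hits placed lower, which is what separates this value from a plain hit count.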

NDCGK(K)

Computes the normalized sum of discounted gains of all items in a Top-K recommendation list.
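One common way to normalize is to divide by the value of an ideal list in which all of the user's true interactions (up to K) are ranked first. The sketch below assumes that normalization, binary gains, and a log2(rank + 1) discount; the function names are hypothetical and the details may differ from recpack's implementation.

```python
import numpy as np

def dcg(gains):
    """Sum of gains discounted by log2(rank + 1)."""
    discounts = np.log2(np.arange(2, len(gains) + 2))
    return float(np.sum(gains / discounts))

def ndcg_at_k(y_true_row, y_pred_row, k):
    """DCG of the Top-K list divided by the DCG of an ideal list (hypothetical helper)."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    n_ideal_hits = min(k, int(y_true_row.sum()))
    if n_ideal_hits == 0:
        return 0.0  # assumption: users without true interactions score 0
    ideal = dcg(np.ones(n_ideal_hits))
    return dcg(y_true_row[top_k]) / ideal

y_true = np.array([0, 1, 0, 1, 0])
y_pred = np.array([0.9, 0.1, 0.3, 0.7, 0.0])
# Top-3 is items 0, 3, 2: a single hit at rank 2, out of two possible hits.
ndcg = ndcg_at_k(y_true, y_pred, k=3)
```

With this normalization the value lies between 0 and 1, and a list that ranks all true interactions first scores exactly 1.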

RecallK(K)

Computes the fraction of true interactions that made it into the Top-K recommendations.
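For a single user, this fraction could be computed as in the sketch below. It assumes a binary y_true row and dense arrays; recall_at_k is a hypothetical name, not recpack's API.

```python
import numpy as np

def recall_at_k(y_true_row, y_pred_row, k):
    """Hits in the Top-K divided by the user's number of true interactions."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    n_true = y_true_row.sum()
    if n_true == 0:
        return 0.0  # assumption: undefined case reported as 0
    return float(y_true_row[top_k].sum() / n_true)

y_true = np.array([1, 1, 0, 1, 0])
y_pred = np.array([0.1, 0.9, 0.3, 0.7, 0.0])
# Top-2 is items 1 and 3, both hits, out of three true interactions.
recall = recall_at_k(y_true, y_pred, k=2)
```

Note that with K = 2 and three true interactions, the best achievable value is 2/3 rather than 1.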

CalibratedRecallK(K)

Computes the number of Top-K recommendations that were hits, divided by the minimum of K and the number of true interactions of the user.
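The following sketch implements this ratio for a single user, assuming a binary y_true row; calibrated_recall_at_k is a hypothetical name used only for illustration.

```python
import numpy as np

def calibrated_recall_at_k(y_true_row, y_pred_row, k):
    """Hits in the Top-K divided by min(K, number of true interactions)."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    n_true = int(y_true_row.sum())
    if n_true == 0:
        return 0.0  # assumption: undefined case reported as 0
    return float(y_true_row[top_k].sum() / min(k, n_true))

y_true = np.array([1, 1, 0, 1, 0])
y_pred = np.array([0.1, 0.9, 0.8, 0.7, 0.0])
# Top-2 is items 1 and 2: one hit, and at most min(2, 3) = 2 hits were possible.
calibrated = calibrated_recall_at_k(y_true, y_pred, k=2)
print(calibrated)  # 0.5
```

Dividing by min(K, number of true interactions) means a perfect Top-K list always scores 1, even when the user has more than K true interactions.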

PrecisionK(K)

Computes the fraction of Top-K recommendations that correspond to true interactions.
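A per-user sketch of this fraction, assuming a binary y_true row and that exactly K items are recommended; precision_at_k is a hypothetical name.

```python
import numpy as np

def precision_at_k(y_true_row, y_pred_row, k):
    """Hits in the Top-K divided by K."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    return float(y_true_row[top_k].sum() / k)

y_true = np.array([1, 1, 0, 1, 0])
y_pred = np.array([0.1, 0.9, 0.3, 0.7, 0.0])
# Top-3 is items 1, 3, 2: two of the three recommendations are hits.
precision = precision_at_k(y_true, y_pred, k=3)
```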

ReciprocalRankK(K)

Computes the inverse of the rank of the first hit in the recommendation list.
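As a sketch for a single user: find the position of the first hit within the Top-K and invert it. Returning 0 when the Top-K contains no hit is an assumption of this example, as is the function name.

```python
import numpy as np

def reciprocal_rank_at_k(y_true_row, y_pred_row, k):
    """1 / rank of the first hit in the Top-K, or 0 if there is none (assumption)."""
    top_k = np.argsort(-y_pred_row, kind="stable")[:k]
    hit_positions = np.flatnonzero(y_true_row[top_k])
    if hit_positions.size == 0:
        return 0.0
    # Positions are 0-based, ranks are 1-based.
    return 1.0 / (hit_positions[0] + 1)

y_true = np.array([0, 0, 1, 0, 0])
y_pred = np.array([0.1, 0.9, 0.3, 0.7, 0.0])
# Top-3 is items 1, 3, 2, so the first (and only) hit sits at rank 3.
rr_at_3 = reciprocal_rank_at_k(y_true, y_pred, k=3)
rr_at_2 = reciprocal_rank_at_k(y_true, y_pred, k=2)
```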

Elementwise Metrics

An elementwise metric reports a score for every user-item pair in the Top-K. To obtain a global metric value, these scores are summed per user, then averaged.

HitK(K)

Computes the number of hits in a list of Top-K recommendations.
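The sketch below shows how per-pair hit values could be computed and then aggregated by summing per user and averaging over users. It assumes dense binary y_true and dense scores; hits_at_k is a hypothetical helper, not recpack's API.

```python
import numpy as np

def hits_at_k(y_true, y_pred, k):
    """1 for each Top-K recommendation that is a true interaction, else 0."""
    top_k = np.argsort(-y_pred, axis=1, kind="stable")[:, :k]
    # Look up y_true at the recommended item indices, user by user.
    return np.take_along_axis(y_true, top_k, axis=1)

y_true = np.array([
    [0, 1, 0, 1, 0],
    [1, 0, 0, 0, 0],
])
y_pred = np.array([
    [0.1, 0.9, 0.3, 0.7, 0.0],
    [0.5, 0.2, 0.8, 0.1, 0.6],
])
hits = hits_at_k(y_true, y_pred, k=2)   # one value per (user, Top-K position)
per_user = hits.sum(axis=1)             # summed per user
global_value = per_user.mean()          # then averaged over users
print(hits.tolist(), global_value)      # [[1, 1], [0, 0]] 1.0
```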

DiscountedGainK(K)

Computes the discounted gain of every item in the Top-K recommendations of a user.
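A sketch of per-item discounted gains, assuming binary gains and a log2(rank + 1) discount (the exact discount is not stated on this page, so treat it as an assumption). The helper name is hypothetical.

```python
import numpy as np

def discounted_gain_at_k(y_true, y_pred, k):
    """Gain of each Top-K recommendation, discounted by log2(rank + 1)."""
    top_k = np.argsort(-y_pred, axis=1, kind="stable")[:, :k]
    hits = np.take_along_axis(y_true, top_k, axis=1)
    # One discount per rank, broadcast across users.
    discounts = np.log2(np.arange(2, top_k.shape[1] + 2))
    return hits / discounts

y_true = np.array([[1, 0, 1, 0]])
y_pred = np.array([[0.9, 0.8, 0.7, 0.1]])
# Top-3 is items 0, 1, 2 with hits at ranks 1 and 3: gains 1/log2(2) and 1/log2(4).
gains = discounted_gain_at_k(y_true, y_pred, k=3)
print(gains.tolist())  # [[1.0, 0.0, 0.5]]
```

Under these assumptions, summing the values per user gives that user's discounted cumulative gain.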