recpack.algorithms.samplers.PositiveNegativeSampler

class recpack.algorithms.samplers.PositiveNegativeSampler(num_negatives=1, batch_size=100, replace=True, exact=False, distribution='uniform')

Samples linked positive and negative interactions for users.

Provides a sample() method that samples positives and negatives. Positives are sampled uniformly from all positive interactions. Negatives are drawn from either a uniform or a unigram distribution.

With the uniform distribution, each item has the same probability of being selected as a negative. With the unigram distribution, items are sampled according to their weighted popularity, where f(w_i) denotes the popularity of item w_i:

\[P(w_i) = \frac{ {f(w_i)}^{3/4} }{\sum_{j=0}^{n}\left( {f(w_j)}^{3/4} \right) }\]
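The weighting above can be checked by hand on a small example. The following is a minimal sketch in plain Python, not the library's implementation; the function name `unigram_probabilities` and the interaction counts are hypothetical and chosen only so the 3/4 powers come out as round numbers.

```python
def unigram_probabilities(item_counts, power=0.75):
    """Return f(w_i)^power / sum_j f(w_j)^power for every item."""
    weighted = [count ** power for count in item_counts]
    total = sum(weighted)
    if total == 0:
        raise ValueError("at least one item needs a non-zero count")
    return [w / total for w in weighted]


# Hypothetical interaction counts for four items.
counts = [16, 1, 81, 0]
probs = unigram_probabilities(counts)

# Raw popularity shares, for comparison with the weighted ones.
raw_shares = [count / sum(counts) for count in counts]

for item, (raw, weighted) in enumerate(zip(raw_shares, probs)):
    print(f"item {item}: raw share {raw:.3f}, unigram probability {weighted:.3f}")
```

Raising counts to the power 3/4 flattens the distribution: in this example the most popular item's share drops relative to its raw share, the least popular non-zero item's share rises, and an item without interactions is never drawn.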

Parameters

num_negatives (int, optional) – Number of negative samples for each positive, defaults to 1
batch_size (int, optional) – The number of samples returned per batch, defaults to 100
replace (bool, optional) – Whether to sample positives with replacement. Defaults to True
exact (bool, optional) – If False (default), negatives are checked against the corresponding positive sample only, allowing for (rare) collisions. To avoid collisions entirely, use exact=True, at the cost of decreased performance.
distribution (string, optional) – The distribution used to sample negative items, defaults to ‘uniform’. Options are ‘uniform’ and ‘unigram’.
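To make the difference between the two settings of exact concrete, here is a minimal rejection-sampling sketch in plain Python. It is an illustration under stated assumptions, not the library's code: the helper `sample_negative`, the user history, and the item count are all hypothetical, and a "collision" is taken to mean a sampled negative that the user has in fact interacted with.

```python
import random


def sample_negative(positive_item, user_history, num_items, exact, rng):
    """Draw one negative item uniformly, redrawing on a forbidden item.

    exact=False: only the paired positive item is forbidden.
    exact=True: every item in the user's history is forbidden.
    """
    forbidden = {positive_item}
    if exact:
        forbidden |= set(user_history)
    if len(forbidden) >= num_items:
        raise ValueError("no valid negative item exists")
    while True:
        candidate = rng.randrange(num_items)
        if candidate not in forbidden:
            return candidate


rng = random.Random(42)
history = {0, 1, 2, 3}  # hypothetical items this user interacted with
positive = 2            # the positive item of the sampled pair
num_items = 6

loose = [sample_negative(positive, history, num_items, False, rng) for _ in range(200)]
strict = [sample_negative(positive, history, num_items, True, rng) for _ in range(200)]

print("exact=False collisions with history:", sum(item in history for item in loose))
print("exact=True collisions with history:", sum(item in history for item in strict))
```

In this toy setting the history covers most of the catalogue, so collisions are frequent with the loose check. With a realistic, sparse interaction matrix they would be far less common, which fits the "(rare)" qualifier in the parameter description.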

Methods

sample(X[, sample_size, positives])

Sample num_negatives negatives for each sampled user-item-pair (positive).

Attributes

ALLOWED_DISTRIBUTIONS

sample(X: scipy.sparse._csr.csr_matrix, sample_size=None, positives=None) → Iterator[Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor]]

Sample num_negatives negatives for each sampled user-item-pair (positive).

When sampling without replacement, sample_size cannot exceed the number of positives in X.
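The constraint on sample_size follows directly from how sampling without replacement works. The sketch below shows it in plain Python, together with splitting the result into batches. It is a simplified stand-in that samples positives only; the generator `sample_positive_batches` and the example pairs are hypothetical, and the library's own batching details may differ.

```python
import random


def sample_positive_batches(positives, sample_size=None, batch_size=100,
                            replace=True, rng=None):
    """Yield batches of (user, item) pairs sampled from `positives`."""
    rng = rng or random.Random()
    if sample_size is None:
        # Default: one sample per positive entry.
        sample_size = len(positives)
    if not replace and sample_size > len(positives):
        raise ValueError(
            "sample_size exceeds the number of positives; "
            "cannot sample without replacement"
        )
    if replace:
        chosen = [rng.choice(positives) for _ in range(sample_size)]
    else:
        chosen = rng.sample(positives, sample_size)
    for start in range(0, sample_size, batch_size):
        yield chosen[start:start + batch_size]


# Hypothetical (user, item) positive pairs.
positives = [(0, 1), (0, 3), (1, 2), (2, 0), (2, 4)]

batches = list(sample_positive_batches(
    positives, batch_size=2, replace=False, rng=random.Random(0)))
for batch in batches:
    print(batch)
```

Without replacement, every positive appears at most once, so asking for more samples than there are positives cannot be satisfied. With replacement, any sample_size is possible because pairs may repeat.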

Parameters

X (csr_matrix) – Matrix with interactions to sample from.
sample_size (int, optional) – The number of samples to create. If None, the number of positive entries in X is used. Defaults to None.
positives (np.array, optional) – Restrict positive samples to those in this np.array of shape (num_samples, 2).

Raises

ValueError – If sampling without replacement and sample_size exceeds the number of positives in X.

Yield

Iterator of (user_batch, positive_samples_batch, negative_samples_batch)

Return type

Iterator[Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor]]