recpack.algorithms.samplers.PositiveNegativeSampler
- class recpack.algorithms.samplers.PositiveNegativeSampler(num_negatives=1, batch_size=100, replace=True, exact=False, distribution='uniform')
Samples linked positive and negative interactions for users.
Provides a sample() method that samples positives and negatives. Positives are sampled uniformly from all positive interactions. Negatives are sampled from either a uniform distribution or a unigram distribution. With the uniform distribution, every item has the same probability of being selected as a negative. With the unigram distribution, items are sampled according to their weighted popularity.
\[P(w_i) = \frac{f(w_i)^{3/4}}{\sum_{j=0}^{n} f(w_j)^{3/4}}\]
- Parameters
num_negatives (int, optional) – Number of negative samples for each positive, defaults to 1
batch_size (int, optional) – The number of samples returned per batch, defaults to 100
replace (bool, optional) – Sample positives with or without replacement. Defaults to True
exact (bool, optional) – If False (default), negatives are checked against the corresponding positive sample only, allowing for (rare) collisions with the user's other positive interactions. To avoid collisions entirely, use exact=True, at the cost of decreased performance.
distribution (string, optional) – The distribution used to sample negative items, defaults to uniform. Options are ‘uniform’ and ‘unigram’
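The unigram formula above can be illustrated with a small standard-library sketch. It is not recpack's implementation: the helper name `unigram_probabilities`, the `power` argument, and the example counts are hypothetical, and it assumes that f(w_i) is the number of interactions recorded for item i.

```python
from typing import List, Sequence


def unigram_probabilities(item_counts: Sequence[int], power: float = 0.75) -> List[float]:
    """Return P(w_i) = f(w_i)^power / sum_j f(w_j)^power for every item.

    Assumption: item_counts[i] is the interaction count f(w_i) of item i.
    """
    if any(count < 0 for count in item_counts):
        raise ValueError("interaction counts must be non-negative")
    # Dampen popularity: raising counts to 3/4 flattens the distribution.
    weights = [count ** power for count in item_counts]
    total = sum(weights)
    if total == 0:
        raise ValueError("at least one item needs a positive interaction count")
    return [weight / total for weight in weights]


# Hypothetical interaction counts for four items.
counts = [16, 1, 81, 0]
probs = unigram_probabilities(counts)
```

Compared with sampling proportionally to the raw counts, the 3/4 exponent shifts some probability mass from the most popular items to less popular ones, while items without interactions keep probability zero.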
Methods
sample(X[, sample_size, positives]) – Sample num_negatives negatives for each sampled user-item pair (positive).
Attributes
ALLOWED_DISTRIBUTIONS
- sample(X: scipy.sparse._csr.csr_matrix, sample_size=None, positives=None) → Iterator[Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor]]
Sample num_negatives negatives for each sampled user-item-pair (positive).
When sampling without replacement, sample_size cannot exceed the number of positives in X.
- Parameters
X (csr_matrix) – Matrix with interactions to sample from.
sample_size (int, optional) – The number of samples to create. If None, the number of positive entries in X is used. Defaults to None.
positives (np.array, optional) – Restrict positive samples to the user-item pairs in this np.array of shape (num_samples, 2).
- Raises
ValueError – [description]
- Yield
Iterator of (user_batch, positive_samples_batch, negative_samples_batch)
- Return type
Iterator[Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor]]
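The batching behaviour described above can be sketched in plain Python. This is a simplified illustration under stated assumptions, not recpack's code: `sample_batches`, its `seed` argument, and the list-based inputs are hypothetical, it works on (user, item) pairs instead of a csr_matrix, it yields lists instead of torch tensors, and it only covers the uniform, non-exact case where each negative is checked against its own positive. Raising ValueError for an oversized sample_size is likewise an assumption of the sketch.

```python
import random
from typing import Iterator, List, Optional, Sequence, Tuple

Batch = Tuple[List[int], List[int], List[List[int]]]


def sample_batches(
    positives: Sequence[Tuple[int, int]],
    num_items: int,
    num_negatives: int = 1,
    batch_size: int = 100,
    replace: bool = True,
    sample_size: Optional[int] = None,
    seed: Optional[int] = None,
) -> Iterator[Batch]:
    """Yield (users, positive_items, negative_items) batches."""
    rng = random.Random(seed)
    if sample_size is None:
        sample_size = len(positives)
    if not replace and sample_size > len(positives):
        raise ValueError("sample_size exceeds the number of positives")
    if num_items < 2:
        raise ValueError("need at least two items to draw a negative")

    # Positives are drawn uniformly, with or without replacement.
    if replace:
        chosen = rng.choices(positives, k=sample_size)
    else:
        chosen = rng.sample(positives, k=sample_size)

    for start in range(0, sample_size, batch_size):
        batch = chosen[start:start + batch_size]
        users = [user for user, _ in batch]
        pos_items = [item for _, item in batch]
        negatives = []
        for pos in pos_items:
            row: List[int] = []
            while len(row) < num_negatives:
                candidate = rng.randrange(num_items)
                # Non-exact check: only reject the paired positive item.
                if candidate != pos:
                    row.append(candidate)
            negatives.append(row)
        yield users, pos_items, negatives


# Hypothetical (user, item) interactions over 4 items.
pairs = [(0, 1), (0, 2), (1, 0), (2, 3), (2, 1)]
batches = list(
    sample_batches(pairs, num_items=4, num_negatives=2,
                   batch_size=2, replace=False, seed=0)
)
```

Because only the paired positive is excluded, a drawn negative can still be another item the same user interacted with; that is the kind of collision the exact option is described as avoiding.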