recpack.algorithms.samplers.SequenceMiniBatchPositivesTargetsNegativesSampler
- class recpack.algorithms.samplers.SequenceMiniBatchPositivesTargetsNegativesSampler(num_negatives: int, pad_token: int, batch_size: int = 100)
Samples num_negatives negatives for every positive in a sequence.
This approach makes it possible to learn from the same positive interaction multiple times. Because the order of the sequence matters here, collisions are only eliminated at the exact same position in the sequence. As a result, an item that occurs earlier or later in the sequence may still be sampled as a negative at any other position.
Handles sequences of unequal length by padding them with pad_token.
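The position-wise collision rule described above can be illustrated with a minimal pure-Python sketch. It is not recpack's implementation: `sample_negatives`, `num_items`, and the uniform rejection sampling are assumptions made for illustration only. A candidate is rejected only if it equals the target at the same position, so items seen elsewhere in the sequence remain valid negatives.

```python
import random


def sample_negatives(targets, num_items, num_negatives, rng):
    """Hypothetical helper, not part of recpack.

    For every position in `targets`, draw `num_negatives` item ids
    uniformly from range(num_items), rejecting only draws that equal
    the target at that same position.
    """
    negatives = []
    for target in targets:
        row = []
        while len(row) < num_negatives:
            candidate = rng.randrange(num_items)
            # Only a collision with the target at this position is rejected;
            # items from other positions in the sequence are allowed.
            if candidate != target:
                row.append(candidate)
        negatives.append(row)
    return negatives


rng = random.Random(0)  # seeded so the sketch is reproducible
targets = [3, 1, 4, 2]  # assumed target item ids, one per position
negatives = sample_negatives(targets, num_items=5, num_negatives=3, rng=rng)
```

Each row of `negatives` holds the negatives for one position, which corresponds to the last axis of the 3D negatives tensor for a single user.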
- Parameters
num_negatives (int) – Number of negative samples for each positive
pad_token (int) – Token used to indicate that this location in the sequence contains a padding element.
batch_size (int, optional) – The number of sequences returned per batch, defaults to 100
Methods
sample(X) – Sample positives, targets and negatives from the input matrix.
- sample(X: recpack.matrix.interaction_matrix.InteractionMatrix) → Iterator[Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor, torch.LongTensor]]
Sample positives, targets and negatives from the input matrix.
Yields tuples of:
uids: 1D tensor with the user ids in this batch. Shape = (batch_size,)
positives: 2D tensor with one row per user, holding that user's history item_ids in order. Rows are sorted by history length, longest first. Histories shorter than the width of the tensor are filled up with padding tokens. Shape = (batch_size, max_hist_len(batch))
targets: 2D tensor with targets to predict for each user. These are the positives rolled one position to the left, so that the target of the first positive is the second positive in the sequence. Each sequence ends with a padding token as target, since the item following the end of the sequence is unknown. Shape = (batch_size, max_hist_len(batch))
negatives: 3D tensor with negative examples for each positive. For each positive, self.num_negatives negatives are sampled; these negatives are checked only against the target item. Shape = (batch_size, max_hist_len(batch), self.num_negatives)
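To make the layout of positives and targets concrete, here is a small sketch using plain Python lists instead of tensors. The helper `pad_and_roll` and the example histories are hypothetical, and user ids and negatives are left out. How recpack builds these tensors internally is not stated here, so this only reproduces the described result.

```python
def pad_and_roll(histories, pad_token):
    """Hypothetical helper, not part of recpack.

    Sorts histories longest first, pads them to a common width, and
    builds targets by shifting each padded row one position to the left.
    """
    ordered = sorted(histories, key=len, reverse=True)
    width = len(ordered[0])  # max_hist_len of this batch
    positives = [h + [pad_token] * (width - len(h)) for h in ordered]
    # Drop the first item and append a padding token: the last position
    # has no known next item, so its target is the padding token.
    targets = [row[1:] + [pad_token] for row in positives]
    return positives, targets


histories = [[7, 2], [4, 9, 1, 3], [5, 8, 6]]  # assumed item id sequences
positives, targets = pad_and_roll(histories, pad_token=10)
# positives == [[4, 9, 1, 3], [5, 8, 6, 10], [7, 2, 10, 10]]
# targets   == [[9, 1, 3, 10], [8, 6, 10, 10], [2, 10, 10, 10]]
```

Both lists have shape (batch_size, max_hist_len(batch)), here (3, 4), matching the shapes listed above.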
- Parameters
X (InteractionMatrix) – Interaction matrix to generate samples from.
- Yield
tuples of (uids, positives, targets, negatives)
- Return type
Iterator[ Tuple[torch.LongTensor, torch.LongTensor, torch.LongTensor, torch.LongTensor] ]