recpack.datasets.RecsysChallenge2015
- class recpack.datasets.RecsysChallenge2015(path: str = 'data', filename: Optional[str] = None, use_default_filters=True)
- Handles data from the Recsys Challenge 2015 (yoochoose dataset).
- All information and downloads can be found at https://www.kaggle.com/chadgostopp/recsys-challenge-2015. Downloading the data requires a Kaggle account, so it cannot be fetched automatically. Download the data manually and provide the path to the yoochoose-clicks.dat file.
- Default processing makes sure that:
  - Each remaining item has been interacted with by at least 5 users.
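The default rule above (every remaining item has at least 5 distinct users) can be illustrated with a minimal, library-free sketch. The function name `filter_min_users_per_item` and the toy data are hypothetical; this is not recpack's own filter implementation, only a plain-Python model of the rule.

```python
from collections import defaultdict


def filter_min_users_per_item(interactions, min_users=5):
    """Keep interactions whose item was seen by at least `min_users` distinct users.

    `interactions` is a list of (user, item) pairs. Hypothetical helper,
    not part of recpack.
    """
    users_per_item = defaultdict(set)
    for user, item in interactions:
        users_per_item[item].add(user)
    return [
        (user, item)
        for user, item in interactions
        if len(users_per_item[item]) >= min_users
    ]


# Toy data: item "a" is seen in 5 distinct sessions, item "b" in only 2.
interactions = [(session, "a") for session in range(5)] + [(0, "b"), (1, "b")]
kept = filter_min_users_per_item(interactions)
```

With the toy data, only the interactions with item "a" survive, because item "b" falls below the threshold.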
 - Parameters
- path (str, optional) – The path to the data directory. Defaults to 'data'. 
- filename (str, optional) – Name of the file. If no name is provided, the dataset default is used, if known. 
- use_default_filters (bool, optional) – Whether to initialise the default set of filters. Defaults to True. 
 
 - Methods
   - add_filter(_filter[, index]) - Add a filter to be applied when loading the data.
   - fetch_dataset([force]) - Check whether the dataset is present; if not, download it.
   - load() - Loads data into an InteractionMatrix object.
 - Attributes
   - DEFAULT_FILENAME - Default filename that will be used if it is not specified by the user.
   - ITEM_IX - Name of the column in the DataFrame with item identifiers.
   - TIMESTAMP_IX - Name of the column in the DataFrame that contains time of interaction in seconds since epoch.
   - USER_IX - Name of the column in the DataFrame that contains user identifiers.
   - file_path - The fully qualified path to the file from which the dataset will be loaded.
 - DEFAULT_FILENAME = 'yoochoose-clicks.dat'
- Default filename that will be used if it is not specified by the user. 
 - ITEM_IX = 'item_id'
- Name of the column in the DataFrame with item identifiers 
 - TIMESTAMP_IX = 'seconds_since_epoch'
- Name of the column in the DataFrame that contains time of interaction in seconds since epoch. 
 - USER_IX = 'session'
- Name of the column in the DataFrame that contains user identifiers. 
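To show how the three column names relate to the raw data, here is a small sketch that turns one line of a clicks file into a record keyed by those names. It assumes the file is comma-separated with the fields session id, ISO 8601 timestamp, item id, and category, in that order. The helper `parse_click` and the sample line are made up for illustration, and recpack itself may parse the file differently.

```python
from datetime import datetime, timezone

# Column names as documented for this class.
USER_IX = "session"
TIMESTAMP_IX = "seconds_since_epoch"
ITEM_IX = "item_id"


def parse_click(line):
    """Parse one 'session,timestamp,item,category' line (assumed layout)."""
    session, timestamp, item, _category = line.strip().split(",")
    # Timestamps are assumed to look like 2014-04-07T10:51:09.277Z (UTC).
    moment = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ")
    moment = moment.replace(tzinfo=timezone.utc)
    return {
        USER_IX: int(session),
        TIMESTAMP_IX: int(moment.timestamp()),  # whole seconds since epoch
        ITEM_IX: int(item),
    }


# Made-up sample line, one minute after the epoch.
record = parse_click("1,1970-01-01T00:01:00.000Z,214536502,0")
```

Note that sessions play the role of users here, which is why `USER_IX` is named 'session'.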
 - add_filter(_filter: recpack.preprocessing.filters.Filter, index=None)
- Add a filter to be applied when loading the data.
- If the index is specified, the filter is inserted at the specified index. Otherwise it is appended.
- Parameters
- _filter (Filter) – Filter to apply to the loaded DataFrame when it is processed into an interaction matrix. 
- index (int) – The index at which to insert the filter. If None, the filter is appended. Defaults to None. 
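The insert-or-append behaviour of add_filter, together with in-order application of the filters, can be modelled with a short sketch. `FilterPipeline`, `drop_odd`, and `first_three` are hypothetical stand-ins: here filters are plain callables on a list, whereas recpack uses Filter objects operating on a DataFrame.

```python
class FilterPipeline:
    """Hypothetical stand-in for the ordered filter list kept by a dataset."""

    def __init__(self):
        self.filters = []

    def add_filter(self, _filter, index=None):
        # None appends; an explicit index inserts at that position.
        if index is None:
            self.filters.append(_filter)
        else:
            self.filters.insert(index, _filter)

    def apply(self, rows):
        # Filters run in list order, each on the output of the previous one.
        for _filter in self.filters:
            rows = _filter(rows)
        return rows


def drop_odd(rows):
    return [row for row in rows if row % 2 == 0]


def first_three(rows):
    return rows[:3]


pipeline = FilterPipeline()
pipeline.add_filter(first_three)        # appended
pipeline.add_filter(drop_odd, index=0)  # inserted in front, so it runs first
result = pipeline.apply(list(range(10)))
```

Because `drop_odd` is inserted at index 0, it runs before `first_three`, giving `[0, 2, 4]`. With the opposite order the result would be `[0, 2]`, which is why the index matters.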
 
 
 - fetch_dataset(force=False)
- Check whether the dataset is present; if not, download it.
- Parameters
- force (bool, optional) – If True, the dataset will be downloaded even if the file already exists. Defaults to False. 
 
 - property file_path
- The fully qualified path to the file from which the dataset will be loaded. 
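Since this dataset has to be downloaded manually, it can help to check that the file is in place before loading. The sketch below assumes the file path is simply the data directory joined with the filename, falling back to the default filename. The helpers `resolve_file_path` and `require_dataset` are hypothetical and not part of recpack.

```python
import os

DEFAULT_FILENAME = "yoochoose-clicks.dat"


def resolve_file_path(path="data", filename=None):
    """Join directory and filename, using the default name when none is given."""
    return os.path.join(path, filename or DEFAULT_FILENAME)


def require_dataset(path="data", filename=None):
    """Return the file path, or raise a clear error if the file is missing."""
    file_path = resolve_file_path(path, filename)
    if not os.path.exists(file_path):
        raise FileNotFoundError(
            f"{file_path} not found. Download the dataset manually and "
            "place the clicks file in this location."
        )
    return file_path
```

Calling `require_dataset("data")` before constructing the dataset gives an explicit message when the manual download step was skipped.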
 - load() → recpack.matrix.interaction_matrix.InteractionMatrix
- Loads data into an InteractionMatrix object.
- Data is loaded into a DataFrame using the _load_dataframe function. The resulting DataFrame is parsed into an InteractionMatrix object. During parsing, the filters are applied in order.
- Returns
- The resulting InteractionMatrix 
- Return type