RecSys 2023 -- Spiky SVD

Code for our paper "Of Spiky SVDs and Music Recommendation" by D. Afchar, R. Hennequin and V. Guigue, to be presented at RecSys 2023 (September 18-22, 2023).

Dependencies

Our experiments run in Python and rely on a small, standard machine learning toolkit. The pip package versions we used to run our experiments are as follows:

numpy==1.21.5
scikit-learn==1.0.2
scipy==1.7.3
pandas==1.3.5

Experiments

Data is prepared using ppmi.sql;
SVDs are computed using scikit-learn's randomized_svd(); our exact script is provided in compute_rsvd.py;
Figure 1's embedding scatterplots are obtained using visualization.py, which may also be used to get a fine-grained view of the spiking effect;
Table 1 may be reproduced using spk_computation.py;
Stability plots are obtained using stability.py.
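
For orientation, the first two steps can be sketched in a few lines of Python. This is a minimal illustration on random toy data, not the content of ppmi.sql or compute_rsvd.py: the helper name `ppmi_from_binary`, the co-occurrence definition (self co-occurrences included), the matrix sizes and `n_components=16` are all assumptions made for the example.

```python
import numpy as np
import scipy.sparse as sp
from sklearn.utils.extmath import randomized_svd


def ppmi_from_binary(interactions):
    """Item-item PPMI matrix from a binary item-by-context sparse matrix.

    Hypothetical helper: the repository prepares its data in SQL (ppmi.sql),
    and its exact definition may differ from this sketch.
    """
    interactions = sp.csr_matrix(interactions, dtype=np.float64)
    # Co-occurrence counts between items (the diagonal is kept here).
    cooc = (interactions @ interactions.T).tocoo()
    total = cooc.data.sum()
    marginals = np.asarray(cooc.sum(axis=1)).ravel()
    # PMI(i, j) = log(p(i, j) / (p(i) p(j))), evaluated on non-zero entries only.
    pmi = np.log(cooc.data * total / (marginals[cooc.row] * marginals[cooc.col]))
    # Positive PMI: negative values are clipped to zero and dropped.
    ppmi = sp.coo_matrix(
        (np.maximum(pmi, 0.0), (cooc.row, cooc.col)), shape=cooc.shape
    ).tocsr()
    ppmi.eliminate_zeros()
    return ppmi


# Toy binary matrix: 200 items, 300 contexts (e.g. playlists or users).
toy = sp.random(200, 300, density=0.05, random_state=0, format="csr")
toy.data[:] = 1.0

ppmi = ppmi_from_binary(toy)
# Truncated SVD; the rank and seed are illustrative choices.
U, S, Vt = randomized_svd(ppmi, n_components=16, random_state=0)
print(U.shape, S.shape, Vt.shape)
```

Fixing `random_state` keeps the decomposition reproducible across runs, which matters when comparing embeddings as in the stability experiments.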

Data

Please refer to the links provided in the paper to get the raw datasets.

To ease reproducibility, we added the precomputed SVDs of the AoTM-2011, Cornell-Yes, LastFM-360K and Movielens-25M datasets. We cannot upload the precomputed SVDs of the two remaining datasets to GitHub due to the upload size limit (each .npy file is above 1 GB). We will push these decompositions to Zenodo soon.

Where the process is not straightforward, scripts for selecting data from each dataset and converting it to a binary matrix are provided in the data_extraction folder.
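
As a rough idea of what such a conversion involves, the sketch below turns a table of (item, context) interactions into a binary sparse matrix. It is not taken from data_extraction: the function name `to_binary_matrix`, the column names and the toy data are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import scipy.sparse as sp


def to_binary_matrix(df, item_col, context_col):
    """Build a binary item-by-context CSR matrix from an interaction table.

    Hypothetical helper; column names are supplied by the caller.
    """
    # Map each distinct item / context to a contiguous integer index.
    item_codes, item_index = pd.factorize(df[item_col])
    ctx_codes, ctx_index = pd.factorize(df[context_col])
    matrix = sp.coo_matrix(
        (np.ones(len(df), dtype=np.float32), (item_codes, ctx_codes)),
        shape=(len(item_index), len(ctx_index)),
    ).tocsr()
    # The COO -> CSR conversion sums duplicate pairs; reset every entry to 1.
    matrix.data[:] = 1.0
    return matrix, item_index, ctx_index


# Toy interaction log: track "a" appears twice in playlist "p1".
log = pd.DataFrame({
    "track": ["a", "b", "a", "c", "a"],
    "playlist": ["p1", "p1", "p2", "p2", "p1"],
})
binary, tracks, playlists = to_binary_matrix(log, "track", "playlist")
print(binary.toarray())
```

Returning the two index objects alongside the matrix makes it possible to map rows and columns back to the original identifiers.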

We report some stats on each dataset:

Dataset	#items	type
AoTM-2011	27,241	track-playlist
Cornell	44,442	track-playlist
LastFM	94,272	artist-user
Movielens	42,490	movie-user
Spotify	1,261,730	track-playlist
Deezer	1,438,603	track-playlist

The number of considered items may be lower than in the original datasets because items with too few interactions are filtered out (as described in ppmi.sql).
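
The effect of this kind of filtering can be illustrated in Python on a binary item-by-context matrix. The actual rule and threshold live in ppmi.sql; the function name `filter_rare_items` and the `min_interactions` parameter below are hypothetical and only serve the example.

```python
import numpy as np
import scipy.sparse as sp


def filter_rare_items(binary, min_interactions):
    """Keep only the rows (items) with at least `min_interactions` non-zeros.

    Hypothetical helper; the threshold used in the paper is defined in ppmi.sql.
    """
    binary = sp.csr_matrix(binary)
    # Number of contexts each item appears in.
    counts = np.asarray(binary.sum(axis=1)).ravel()
    kept = np.flatnonzero(counts >= min_interactions)
    return binary[kept], kept


# Toy matrix: 4 items, 4 contexts; items 1 and 3 are too rare.
toy = sp.csr_matrix(np.array([
    [1, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]))
filtered, kept = filter_rare_items(toy, min_interactions=2)
print(kept, filtered.shape)
```

The `kept` array records which original rows survive, so item counts before and after filtering can be compared directly.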

Contact

[email protected]
