GitHub - a-turcu/RS-LightGCN: Project for the Recommender Systems course at the University of Amsterdam 2023.

Sampling Strategies for Enhanced Recommendation Performance: Advancements in LightGCN

This repository aims to reproduce the results of the following paper:

Xiangnan He, Kuan Deng, Xiang Wang, Yan Li, Yongdong Zhang, Meng Wang (2020). LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation. SIGIR 2020. Paper in arXiv.

Authors: Alexandru Turcu, Bogdan Palfi, and Ryan Amaudruz

In addition, we investigate whether sampling strategies can improve the model's prediction performance.

Abstract

Graph Convolutional Networks (GCNs) have seen increased popularity in the context of collaborative filtering based Recommender Systems. LightGCN is a heavily simplified variant of GCN, specifically designed for recommendation tasks. While at the time of publishing, LightGCN was able to achieve state-of-the-art performance, there was still room for improvement in the sampling mechanism of the loss function. This paper replicates the results of the original LightGCN paper and tests the model on a new dataset to further demonstrate its robustness. Furthermore, additional sampling methods are developed and tested on all datasets, some of which show increased performance in terms of recall or catalog coverage.
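To make "the sampling mechanism of the loss function" concrete, here is a minimal sketch of the baseline setup described in the LightGCN paper: a Bayesian Personalized Ranking (BPR) loss in which each observed (user, item) pair is contrasted with a uniformly sampled unobserved item. It is not the code from this repository. The function names `sample_uniform_negatives` and `bpr_loss` and the toy data are hypothetical, the L2 regularization term is omitted, and the alternative sampling strategies studied in the report are not shown.

```python
import numpy as np


def sample_uniform_negatives(user_pos_items, n_items, rng):
    """Return an array of (user, positive item, negative item) triples.

    For every observed interaction, one negative item is drawn uniformly
    at random from the items the user has NOT interacted with.
    Assumes every user has at least one unseen item.
    """
    triples = []
    for user, pos_items in user_pos_items.items():
        pos_set = set(pos_items)
        for pos in pos_items:
            neg = int(rng.integers(n_items))
            while neg in pos_set:  # rejection sampling
                neg = int(rng.integers(n_items))
            triples.append((user, pos, neg))
    return np.array(triples)


def bpr_loss(user_emb, item_emb, triples):
    """Mean BPR loss: -log(sigmoid(score_pos - score_neg))."""
    u, i, j = triples[:, 0], triples[:, 1], triples[:, 2]
    pos_scores = np.sum(user_emb[u] * item_emb[i], axis=1)
    neg_scores = np.sum(user_emb[u] * item_emb[j], axis=1)
    # logaddexp(0, x) is a numerically stable softplus(x),
    # and softplus(neg - pos) equals -log(sigmoid(pos - neg)).
    return float(np.mean(np.logaddexp(0.0, neg_scores - pos_scores)))


# Toy example: 2 users, 5 items (hypothetical data).
rng = np.random.default_rng(2020)
user_pos_items = {0: [0, 1], 1: [2]}
triples = sample_uniform_negatives(user_pos_items, n_items=5, rng=rng)
user_emb = rng.normal(size=(2, 8))
item_emb = rng.normal(size=(5, 8))
loss = bpr_loss(user_emb, item_emb, triples)
```

Replacing `sample_uniform_negatives` with a different sampler changes which negatives the model is trained against while leaving the loss itself untouched, which is the kind of change a sampling-strategy study varies.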

Environment Requirements

conda env create -f environment.yml

Dataset

The original repository provides three datasets: Gowalla, Yelp2018, and Amazon-book. We added a new dataset: LastFM.

See dataloader.py for details.

An example to run a 3-layer LightGCN

Run LightGCN on the Gowalla dataset:

cd code && python main.py --decay=1e-4 --lr=0.001 --layer=3 --seed=2020 --dataset="gowalla" --topks="[20]" --recdim=64
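For readers unfamiliar with what a "3-layer LightGCN" computes, the following is a minimal NumPy sketch of the propagation rule from the LightGCN paper: embeddings are repeatedly multiplied by the symmetrically normalized user-item adjacency matrix, and the final embedding is the mean over all layers (including layer 0). It is an illustration under assumptions, not the implementation in model.py. The function name `lightgcn_propagate` and the toy interaction matrix are hypothetical, and it is assumed here that `--layer=3` and `--recdim=64` in the command above correspond to the number of propagation layers and the embedding size. A dense matrix is used only because the example is tiny; real datasets need a sparse adjacency.

```python
import numpy as np


def lightgcn_propagate(interactions, user_emb, item_emb, n_layers=3):
    """Light graph convolution over a user-item bipartite graph.

    interactions: (n_users, n_items) binary matrix of observed interactions.
    Returns the layer-averaged user and item embeddings.
    """
    n_users, n_items = interactions.shape
    n_nodes = n_users + n_items

    # Bipartite adjacency: users link only to items and vice versa.
    adj = np.zeros((n_nodes, n_nodes))
    adj[:n_users, n_users:] = interactions
    adj[n_users:, :n_users] = interactions.T

    # Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes stay zero.
    deg = adj.sum(axis=1)
    d_inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    d_inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    norm_adj = d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]

    # No feature transformation and no nonlinearity: just neighbor averaging.
    layer = np.vstack([user_emb, item_emb])
    layers = [layer]
    for _ in range(n_layers):
        layer = norm_adj @ layer
        layers.append(layer)

    # Uniform layer combination: mean over layers 0..n_layers.
    final = np.mean(layers, axis=0)
    return final[:n_users], final[n_users:]


# Toy example: 2 users, 3 items, 64-dimensional embeddings (hypothetical data).
rng = np.random.default_rng(2020)
interactions = np.array([[1.0, 0.0, 1.0],
                         [0.0, 1.0, 1.0]])
user_emb = rng.normal(size=(2, 64))
item_emb = rng.normal(size=(3, 64))
final_users, final_items = lightgcn_propagate(interactions, user_emb, item_emb, n_layers=3)
scores = final_users @ final_items.T  # predicted preference of each user for each item
```

Ranking each row of `scores` and keeping the top 20 entries would give the kind of top-k list that `--topks="[20]"` appears to request for evaluation.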

Summary of the changes made to the original repo

File	Changes
dataloader.py	Refactored the code to improve performance and increase flexibility
Initial data exploration.ipynb	New file analyzing the datasets
Item analysis.ipynb	New file analyzing model performance by item popularity
main.py	Refactored the code for new functionality
model.py	Mostly unchanged
parse.py	Mostly unchanged
plotting.py	New file containing plotting functionality
procedure.py	Significant refactoring to manage sampling strategies
utils.py	Significant refactoring to manage sampling strategies
world.py	Minor changes
data	New dataset LastFM added
results	Contains the test set performance
checkpoints	Contains the best checkpoint from each model run

Detailed comments and docstrings were added to the entire repository.
