Error when converting the movielens-1m dataset to atomic files #110

Closed
cizhouyu opened this issue Nov 10, 2022 · 2 comments

Comments

@cizhouyu

Hello! Thank you for providing a way to convert datasets. I ran into a problem:
Running the command python run.py --dataset ml-1m --input_path ml-1m --output_path output_data/ml-1m --convert_inter --convert_item --convert_user fails with:
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xe9 in position 3114: invalid continuation byte

Steps to reproduce: I followed the instructions at https://github.com/RUCAIBox/RecSysDatasets/blob/master/conversion_tools/usage/MovieLens.md and the error occurs at step 3, as shown in the screenshot below.
[Screenshot: error0]
What should I do to fix this?

@cizhouyu
Author

I found that only --convert_item fails; --convert_inter and --convert_user both work without problems. How can I convert the item data correctly?
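
For readers hitting the same message: byte 0xe9 is "é" in Latin-1 (ISO-8859-1), but in UTF-8 it starts a multi-byte sequence, so a UTF-8 decoder rejects it when the next byte is ordinary text. The sketch below reproduces that behaviour on a made-up byte string. The helper name `try_decode` and the sample bytes are hypothetical, and the assumption that the item file is Latin-1 encoded is not confirmed in this thread.

```python
def try_decode(raw: bytes, encodings=("utf-8", "latin-1")):
    """Return (encoding, text) for the first encoding that decodes raw."""
    for enc in encodings:
        try:
            return enc, raw.decode(enc)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the encodings could decode the input")


# Hypothetical sample bytes, not taken from the dataset:
# the word "Café" stored as Latin-1, followed by plain ASCII.
sample = b"Caf\xe9 Society"

# UTF-8 fails on 0xe9 because the following byte is not a continuation byte.
try:
    sample.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as exc:
    utf8_error = exc.reason

encoding, text = try_decode(sample)
print(utf8_error)
print(encoding, text)
```

If a file decodes under Latin-1 but not under UTF-8, the reading code most likely needs an explicit encoding rather than the default.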

@cizhouyu
Author

The problem is solved using the method that user guedes-joaofelipe described in #94.
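
The method from #94 is not quoted in this thread. As a minimal sketch of one common way to handle this kind of error, assuming the item file is Latin-1 encoded and uses "::" as the field separator, the file can be opened with an explicit encoding. The function `read_items` and the sample row are hypothetical and are not part of the conversion tool.

```python
import os
import tempfile


def read_items(path, encoding="latin-1", sep="::"):
    """Read a '::'-separated item file into a list of field lists."""
    with open(path, encoding=encoding) as f:
        return [line.rstrip("\n").split(sep) for line in f if line.strip()]


# Hypothetical stand-in for the item file so the sketch runs on its own;
# the row below is invented and written as Latin-1 bytes.
tmp = tempfile.NamedTemporaryFile(delete=False, suffix=".dat")
tmp.write("1::Example Caf\u00e9 (2000)::Drama\n".encode("latin-1"))
tmp.close()

rows = read_items(tmp.name)
os.unlink(tmp.name)
print(rows)
```

Passing the encoding explicitly keeps the behaviour independent of the platform default, which is why the same script can succeed on one machine and fail on another.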
