Collection of scripts that convert data from the MovieLens 100k Dataset to JSON files.

tumut/ml100k-to-json

MovieLens 100k to JSON

This repository contains a collection of scripts that help convert some of the data from the MovieLens 100k Dataset into JSON files that are easier to handle.

How to use

Each script accepts a --help (or -h) option that describes how to use it. The scripts are meant to be run in the following order:

  • process_ml100k.py - Generates the initial JSON file.
  • correct_title.py - Separates the release year from the movies' titles, putting it in a separate movie_year field.
  • get_imdb.py - Enriches each movie entry with some IMDb data; requires an internet connection and may take a while.
  • prune.py - Removes movies for which the IMDb data couldn't be retrieved.
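
To make the second and fourth steps more concrete, here is a minimal Python sketch of what splitting the year out of a title and pruning incomplete entries could look like. It is not the code of correct_title.py or prune.py. It assumes that titles end in a parenthesised four-digit year, as in "Toy Story (1995)", and that each movie is a dictionary. The movie_title and imdb keys and all function names are hypothetical; only movie_year comes from the description above.

```python
import re

# Assumption: titles carry a trailing four-digit year in parentheses,
# e.g. "Toy Story (1995)". The lazy title group keeps earlier
# parentheses (alternative titles) inside the title.
_YEAR_SUFFIX = re.compile(r"^(?P<title>.*?)\s*\((?P<year>\d{4})\)\s*$")


def split_title(raw_title):
    """Return (title, year); year is None when no trailing "(YYYY)" is found."""
    match = _YEAR_SUFFIX.match(raw_title)
    if match is None:
        return raw_title.strip(), None
    return match.group("title"), int(match.group("year"))


def correct_titles(movies):
    """Return copies of the movie dicts with the year moved to movie_year.

    "movie_title" is a hypothetical key name used for illustration.
    """
    corrected = []
    for movie in movies:
        title, year = split_title(movie["movie_title"])
        corrected.append({**movie, "movie_title": title, "movie_year": year})
    return corrected


def prune(movies, required_key="imdb"):
    """Keep only movies whose required_key holds a non-empty value.

    "imdb" is a hypothetical key standing in for the retrieved IMDb data.
    """
    return [movie for movie in movies if movie.get(required_key)]
```

For example, split_title("Toy Story (1995)") returns ("Toy Story", 1995), while a title without a year, such as "unknown", comes back unchanged with None as its year.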

Dependencies

License

These scripts are released under the terms of the MIT license.
