IMDb dataset:
A filtered dataset listing the most frequent keywords associated with movies in IMDb. We kept keywords that appear on at least 100 different movies in the original data, downloaded from here.
Reference:
G. Tibély et al.: Extracting tag hierarchies. PLoS ONE 8(12): e84133 (2013).
Files:
File name | Description | Format | Size |
---|---|---|---|
List_of_movies.zip | List of movies, where each row corresponds to a movie | Compressed plain text file. 1st column: movie title id; remaining columns: keyword ids. | 9.9 MB |
Movie_keyword_names.zip | Tag id names | Compressed plain text file. 1st column: tag id; 2nd column: name. | 1.2 MB |
imdb_dataset.zip | All files in the IMDb dataset as a zip archive | | 6.3 MB |
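
The column layout described in the table can be read with a few lines of Python. This is only a minimal sketch under stated assumptions: columns are taken to be whitespace-separated, and header lines are taken to start with `#`. Neither detail is stated on this page, so check the header of each file before relying on it. The function names and the `comment_prefix` parameter are hypothetical and introduced here for illustration.

```python
from collections import Counter
from typing import Dict, Iterable, List


def parse_movie_rows(lines: Iterable[str], comment_prefix: str = "#") -> Dict[str, List[str]]:
    """Map each movie title id (1st column) to its keyword ids (remaining columns)."""
    movies: Dict[str, List[str]] = {}
    for line in lines:
        line = line.strip()
        # Skip blank lines and header lines (ASSUMPTION: headers start with comment_prefix).
        if not line or line.startswith(comment_prefix):
            continue
        fields = line.split()  # ASSUMPTION: whitespace-separated columns
        movies[fields[0]] = fields[1:]
    return movies


def parse_keyword_names(lines: Iterable[str], comment_prefix: str = "#") -> Dict[str, str]:
    """Map each tag id (1st column) to its name (2nd column)."""
    names: Dict[str, str] = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue
        # Split only once so that a name containing spaces stays in one piece.
        parts = line.split(maxsplit=1)
        if len(parts) == 2:
            names[parts[0]] = parts[1]
    return names


def keyword_movie_counts(movies: Dict[str, List[str]]) -> Counter:
    """Count, for each keyword id, the number of different movies it appears on."""
    counts: Counter = Counter()
    for keyword_ids in movies.values():
        counts.update(set(keyword_ids))  # set(): count each movie at most once
    return counts
```

After unzipping, one would pass an open file object to these functions, for example `parse_movie_rows(open(path, encoding="utf-8"))`, where `path` points to the extracted text file. The per-keyword counts returned by `keyword_movie_counts` are the quantity the 100-movie threshold above refers to.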
Note:
Each file header contains instructions for processing the data with the Hierarchy Extracting Algorithms.
IMDb (www.imdb.com).
For further information contact: licensing@imdb.com