IMDb dataset:
A filtered data set listing the most frequent keywords associated with movies in the IMDb. We kept keywords appearing on at least 100 different movies in the original data, downloaded from here.
G. Tibély et al.: Extracting tag hierarchies. PLoS ONE 8(12): e84133 (2013).
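The filtering rule described above (keep a keyword only if it appears on at least 100 different movies) can be sketched as follows. This is a minimal illustration, not the authors' actual preprocessing code: the function name `filter_frequent_keywords`, the dictionary layout, and the choice to keep movies whose keyword list becomes empty are all assumptions made for the example.

```python
from collections import defaultdict


def filter_frequent_keywords(movie_keywords, min_movies=100):
    """Drop keywords that occur on fewer than `min_movies` distinct movies.

    `movie_keywords` maps a movie id to its list of keyword ids
    (hypothetical in-memory layout, chosen for this sketch).
    """
    # Collect the set of distinct movies on which each keyword appears.
    movies_per_keyword = defaultdict(set)
    for movie_id, keywords in movie_keywords.items():
        for keyword in keywords:
            movies_per_keyword[keyword].add(movie_id)

    # A keyword is "frequent" if it occurs on enough different movies.
    frequent = {
        keyword
        for keyword, movies in movies_per_keyword.items()
        if len(movies) >= min_movies
    }

    # Keep every movie, but only its frequent keywords (assumption: movies
    # left without keywords are not removed).
    return {
        movie_id: [kw for kw in keywords if kw in frequent]
        for movie_id, keywords in movie_keywords.items()
    }
```

With the threshold lowered to 2 for a toy input, `filter_frequent_keywords({"m1": ["a", "b"], "m2": ["a"], "m3": ["b", "c", "a"]}, min_movies=2)` removes only keyword `"c"`, which occurs on a single movie.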
Files in the dataset (file name and description, format, size):

List of movies, where each row corresponds to a movie
  Format: plain text file
    1st column: movie title id
    remaining columns: keyword ids
  Size: 9.9Mb

Tag id names
  Format: compressed plain text file
    1st column: tag id
    2nd column: name
  Size: 1.2Mb

All files in the IMDb dataset as a zip archive
  Size: 6.3Mb
Each file header contains instructions for processing the data with the Hierarchy Extracting Algorithms.
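The column layout described above can be read with a few lines of Python. The sketch below rests on assumptions that this page does not state: that columns are separated by whitespace, and that header lines start with `#`. The instructions in each file's header should take precedence if they differ. The function names `read_movie_keywords` and `read_tag_names` are hypothetical.

```python
def read_movie_keywords(lines, comment_prefix="#"):
    """Parse the movie list: 1st column is the movie title id,
    the remaining columns are keyword ids.

    `lines` is any iterable of text lines (for example an open file).
    Assumes whitespace-separated columns and `#`-prefixed header lines.
    """
    movies = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue  # skip blank lines and the assumed header lines
        movie_id, *keyword_ids = line.split()
        movies[movie_id] = keyword_ids
    return movies


def read_tag_names(lines, comment_prefix="#"):
    """Parse the tag id names: 1st column is the tag id, 2nd is the name.

    The name is taken as the rest of the line, so names containing
    spaces are preserved (an assumption about the file layout).
    """
    names = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith(comment_prefix):
            continue
        parts = line.split(maxsplit=1)
        names[parts[0]] = parts[1] if len(parts) > 1 else ""
    return names


# Usage on made-up sample lines (not taken from the real files).
sample_movies = ["# header line", "10 1 2 3", "11 2"]
sample_tags = ["# header line", "1 murder", "2 based on novel"]
movies = read_movie_keywords(sample_movies)
tags = read_tag_names(sample_tags)
print([tags[k] for k in movies["11"]])
```

Ids are kept as strings so that no assumption is made about whether they are purely numeric. Both functions accept any iterable of lines, so the compressed tag file can be passed in once it has been decompressed.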

