Flickr dataset:
We provide a weighted co-occurrence network between Flickr tags, together with the corresponding tag frequencies. The network is built from tags (free words) that co-appear on photos returned by a rather long list of Flickr queries. When preparing the network, we kept only English nouns and counted a co-occurrence only if it was present on photos from at least 10 different users.
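The user-count filter described above can be illustrated with a minimal sketch. It is not the authors' actual pipeline: the input layout (an iterable of `(user_id, tags)` pairs, one per photo), the function name `build_cooccurrence`, and the choice to weight each pair by the number of photos are all assumptions made for illustration, and the English-noun filter is omitted.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(photos, min_users=10):
    """Count tag co-occurrences, keeping pairs backed by at least min_users users.

    `photos` is an iterable of (user_id, tags) pairs; this layout is hypothetical.
    """
    pair_counts = defaultdict(int)  # (tag_a, tag_b) -> number of photos with both tags
    pair_users = defaultdict(set)   # (tag_a, tag_b) -> distinct users behind those photos
    for user_id, tags in photos:
        # Sort the de-duplicated tags so each unordered pair has one canonical key.
        for pair in combinations(sorted(set(tags)), 2):
            pair_counts[pair] += 1
            pair_users[pair].add(user_id)
    # Drop pairs that were seen on photos from too few distinct users.
    return {
        pair: count
        for pair, count in pair_counts.items()
        if len(pair_users[pair]) >= min_users
    }


# Toy example with a lowered threshold so the filter is visible.
photos = [
    ("u1", ["beach", "sea"]),
    ("u2", ["sea", "beach", "sun"]),
    ("u1", ["beach", "sun"]),
]
network = build_cooccurrence(photos, min_users=2)
```

In the toy example the pair (sea, sun) appears on a single photo from one user, so it is dropped, while the other two pairs are kept.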
Reference
G. Tibély et al.: Extracting tag hierarchies. PLoS ONE 8(12): e84133 (2013).
Files:
File name: Flickr_co-occurrence_net.zip
Description: Co-occurrence network of tags on Flickr photos
Format: compressed plain text file; 1st and 2nd columns: co-appearing tag ids; 3rd column: number of co-occurrences
Size: 5.5 MB

File name: Flickr_tag_frequencies.zip
Description: Frequency of Flickr tags
Format: compressed plain text file; 1st column: tag id; 2nd column: number of photos
Size: 0.1 MB

File name: Flickr_tag_names.zip
Description: Tag names
Format: compressed plain text file; 1st column: tag id; 2nd column: name
Size: 1.3 MB

File name: Flickr_dataset.zip
Description: All files in the Flickr dataset as a zip archive
Size: 6.9 MB
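The column layouts listed above can be read with a short loader such as the sketch below. Several details are assumptions rather than documented facts: that columns are separated by whitespace, that tag ids and counts are integers, that each archive holds a single text file, and that header lines can be recognised because they do not parse as a data row. The helper names `read_table` and `read_zip_member` are hypothetical.

```python
import io
import zipfile


def read_table(lines, converters):
    """Parse whitespace-separated rows, skipping lines that are not data rows.

    `converters` holds one callable per expected column, e.g. (int, int, int).
    """
    rows = []
    for line in lines:
        fields = line.split()
        if len(fields) != len(converters):
            continue  # wrong column count: assumed to be header text or a blank line
        try:
            rows.append(tuple(conv(f) for conv, f in zip(converters, fields)))
        except ValueError:
            continue  # a field failed to convert: assumed to be header text
    return rows


def read_zip_member(zip_path, converters):
    """Read the first member of a zip archive as a table (assumes one file per archive)."""
    with zipfile.ZipFile(zip_path) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", errors="replace")
            return read_table(text, converters)
```

Under these assumptions, `read_zip_member("Flickr_co-occurrence_net.zip", (int, int, int))` would return `(tag id, tag id, co-occurrence count)` triples, `(int, int)` would suit the frequency file, and `(int, str)` the tag names file. Because the file headers carry processing instructions, it is worth inspecting them by hand before relying on the automatic skipping.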
Note:
The header of each file contains instructions for processing the data with the Hierarchy Extracting Algorithms.
Note 2:
We also provide a smaller version of the dataset, on which results can be obtained faster.

Contact
hiertags@hal.elte.hu