Welcome to HIERTAGS, a site dedicated to tag hierarchy extraction.

Tags appear frequently on online platforms such as blogs, photo-sharing portals, and news feeds. In most cases they are free words chosen independently by different people, so the resulting organization of the tags is "flat" (all tags are equal), as opposed to a traditional hierarchical categorization. Nevertheless, the way users think about the tagged objects has some built-in hierarchy; e.g., "poodle" is usually considered a special case of "dog". An interesting challenge is to extract this sort of hierarchy between the tags from statistical properties such as tag frequencies and co-occurrences.

On this site we provide algorithms that extract a tag hierarchy from the weighted co-occurrence network between the tags (where the weight of a link corresponds to the number of objects shared by the two tags). We also offer tagged datasets for testing tag hierarchy extraction algorithms in general.
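
To make the input of such algorithms concrete, here is a minimal Python sketch of how a weighted co-occurrence network could be built from tagged objects. It is an illustration only, not one of the algorithms provided on this site: the function name cooccurrence_weights, the dictionary input format, and the toy data are all assumptions made for this example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(tagged_objects):
    """Build tag frequencies and co-occurrence link weights.

    tagged_objects: dict mapping an object id to an iterable of tags
    (a hypothetical input format chosen for this sketch).
    Returns (tag_counts, edge_weights), where edge_weights maps a
    sorted tag pair (a, b) to the number of objects carrying both tags.
    """
    tag_counts = Counter()
    edge_weights = Counter()
    for tags in tagged_objects.values():
        # Ignore repeated tags on one object and fix the pair ordering.
        unique = sorted(set(tags))
        tag_counts.update(unique)
        edge_weights.update(combinations(unique, 2))
    return tag_counts, edge_weights


# Toy data, invented for illustration only.
example = {
    "photo1": ["dog", "poodle", "park"],
    "photo2": ["dog", "poodle"],
    "photo3": ["dog", "park"],
    "photo4": ["cat"],
}

tag_counts, edge_weights = cooccurrence_weights(example)
print(tag_counts["dog"], tag_counts["poodle"])  # tag frequencies
print(edge_weights[("dog", "poodle")])          # weight of the dog-poodle link
```

In this toy data every object tagged "poodle" is also tagged "dog", while "dog" appears on additional objects as well. Asymmetries of this kind in frequencies and co-occurrences are the statistical signal a hierarchy extraction method can draw on.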