Download our tag-hierarchy extraction algorithms:

Both algorithms extract a directed acyclic graph (a hierarchy) between tags from co-occurrence statistics. A detailed description of the algorithms can be found in our paper "Extracting tag hierarchies". They take as input either the weighted co-occurrence network between the tags (accompanied by the frequency distribution of the tags) or simply the list of co-occurring tags on the individual objects (in which case the first step is the preparation of the co-occurrence network). The output is the list of directed links between the tags, representing the hierarchy. Both algorithms are implemented in Perl.
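To illustrate the preparation step mentioned above, here is a minimal Python sketch of how a weighted co-occurrence network and a tag frequency distribution could be built from per-object tag lists. It is not taken from the Perl implementations: the function name `build_cooccurrence`, the sample data, and the choice of link weight (the number of objects on which both tags appear) are assumptions made for illustration, and the paper may define the weights differently.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence(tagged_objects):
    """Return (tag_freq, cooc) from an iterable of tag lists, one list per object.

    Assumption: a tag's frequency is the number of objects carrying it, and a
    link's weight is the number of objects on which both tags appear.
    """
    tag_freq = Counter()
    cooc = Counter()
    for tags in tagged_objects:
        unique = sorted(set(tags))  # count each tag at most once per object
        tag_freq.update(unique)
        # every unordered pair of distinct tags on one object co-occurs once;
        # sorting makes the pair key order deterministic
        cooc.update(combinations(unique, 2))
    return tag_freq, cooc


# Hypothetical sample data: three objects with their tags.
objects = [
    ["animal", "dog", "pet"],
    ["animal", "cat", "pet"],
    ["animal", "dog"],
]
tag_freq, cooc = build_cooccurrence(objects)
```

Each key of `cooc` is an alphabetically ordered pair of tags, so the network is stored as an undirected weighted edge list.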
The latest version of algorithm B has extra features:
  • the algorithm can reconstruct more complex directed acyclic graphs (DAGs) in which the indegree can be larger than 1, that is, a node in the hierarchy may have multiple parents
  • an option forces the reconstructed DAG to have a single connected component
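The properties listed above can be checked on an output link list with a short script. The following Python sketch is only an illustration and is not part of the distributed code: it assumes each link is a (parent, child) pair, which may not match the actual output format, and the helper names `indegrees`, `count_components`, and `is_acyclic` as well as the sample links are hypothetical.

```python
def indegrees(links):
    """Map each tag to its number of parents (incoming links)."""
    indeg = {}
    for parent, child in links:
        indeg.setdefault(parent, 0)
        indeg[child] = indeg.get(child, 0) + 1
    return indeg


def count_components(links):
    """Count connected components, ignoring link direction (union-find)."""
    root = {}

    def find(x):
        root.setdefault(x, x)
        while root[x] != x:
            root[x] = root[root[x]]  # path halving
            x = root[x]
        return x

    for a, b in links:
        root[find(a)] = find(b)
    return len({find(x) for x in list(root)})


def is_acyclic(links):
    """Kahn's algorithm: True if the directed links contain no cycle."""
    indeg = indegrees(links)
    children = {}
    for parent, child in links:
        children.setdefault(parent, []).append(child)
    queue = [node for node, d in indeg.items() if d == 0]
    seen = 0
    while queue:
        node = queue.pop()
        seen += 1
        for child in children.get(node, []):
            indeg[child] -= 1
            if indeg[child] == 0:
                queue.append(child)
    return seen == len(indeg)


# Hypothetical output: "dog" has two parents, and the links form two components.
links = [("animal", "dog"), ("pet", "dog"), ("animal", "cat"), ("vehicle", "car")]
multi_parent = [tag for tag, d in indegrees(links).items() if d > 1]
```

With the single-component option enabled, `count_components` would be expected to return 1 for the reconstructed DAG.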


Archived versions are here


Contact
hiertags@hal.elte.hu