Disjoint Set Data Structures

The disjoint set data structure, or DSDS, is a collection of different trees. The DSDS is an efficient way to search up whether two nodes are in the same group or not or to keep track of the connected components of an undirected graph. It is efficient because it remembers all operations done on the structure, and makes those exact operations more efficient in the future, giving it an edge over plain brute-force search. Implementations of the DSDS for specific problems commonly require different specifications and changes, but the unimproved DSDS has three methods:

  1. MakeSet(),
  2. Find(),
  3. and Unify().

MakeSet() initializes a tree with only one node and adds it to the DSDS (forest). This method is run before any other method calls. The node of an unimproved DSDS has the following attributes:

  1. node.parent (the ancestor of the node)
  2. (only necessary for optimizations and commonly redundant) node.rank

Initially, node.parent is set to itself (since every node is its own ancestor), and node.rank is set to 0. The redundant node.rank will not be discussed in depth here, see the Wikipedia article on the DSDS for more information.

Find(x) locates the deepest ancestor of a node x (since x.parent.parent might not be equal to x.parent), and sets ALL sub-ancestors of x to the true ancestor of x along the way, speeding up future operations. An unimproved DSDS implementation of find(x) in C++ can be

int find(int v)
    if(father[v]==v) return v; 
    return father[v];

Unify(x, y) merges the trees containing node X and the tree containing node Y by locating the ancestors of X and Y and then binding them together with another node, making the two sub-trees (now one tree) taller in the process. A common (no-rank) implementation in C++ can be:

void unify(int x, int y) 
	int fx,fy; fx=find(x); fy=find(y);
        if (fx==fy) return;


Coming soon!

Uses (Appearances)

NOI 2002: Legend of Galactic Heroes