Hierarchical data access in MySQL using nested sets

As you may know, MySQL is a relational database system. It consists of flat tables, which can be joined together in queries. Relations between these tables can only be specified in a way that is one-to-one/one-to-many.

This suits most situations, but when you start getting to hierarchical data, such as multiple level categories (as used on this site), these types of relations start to become non-optimal.

Take for instance the adjacency list method. Each entry in the table is given a column with the id of its parent node:

id name parent_id
1 parent 0
2 child 1 1
3 sub-child 1 2

In order to find all descendants of a node, there would need to be a query once for the parent, then once per node:

function getNodes($db, $parentId) {
    $sql = 'SELECT * FROM nodes WHERE parent_id = ?';
    $nodes = array()
    foreach ($db->fetchAll($sql, array($parentId), Zend_Db::FETCH_OBJ) as $node) {
        $node->subnodes = getNodes($db, $node->id);
        $nodes[] = $node;
    }
    return $nodes;
}

An alternative method is to use nested sets. This takes advantage of MySQL’s index range selection by using two columns to specify a range in which children exist:

id name left right
1 parent 1 6
2 child 1 2 5
3 subchild 1 3 4

In this case all descendants of a node can be found between the parent node’s left and right columns.

Combining this with the adjacent list method, you can optimise the conversion of a MySQL result set into a multi-level array of objects:

function getNodes($db, $parentId) {
    $root = (object) array(
        'id' => $parentId,
    );
 
    $stack = array($root);
 
    $sql = 'SELECT  nodes.*
FROM nodes, nodes AS parent
WHERE nodes.left > parent.left AND nodes.left < parent.right
AND parent.id = ?
ORDER BY nodes.left';
 
    foreach ($db->fetchAll($sql, array($parentId), Zend_Db::FETCH_OBJ) as $node) {
 
        while ($node->parent_id != $stack[count($stack)-1]->id) {
            array_pop($stack);
        }
 
        $stack[count($stack)-1]->subnodes[] = $node;
        array_push($stack, $node);
    }
    return $root->subnodes;
}

Of course this method is at the expense of insert/modification of nodes in the table, which would need more code to do, but usually these happen much less often than selects in production environments.