Hierarchical data access in MySQL using nested sets

April 01, 2008 Development

As you may know, MySQL is a relational database system. It consists of flat tables, which can be joined together in queries. Relations between these tables can only be specified in a way that is one-to-one/one-to-many.

This suits most situations, but when you start getting to hierarchical data, such as multiple level categories (as used on this site), these types of relations start to become non-optimal.

Take for instance the adjacency list method. Each entry in the table is given a column with the id of its parent node:

id	name	parent_id
1	parent	0
2	child 1	1
3	sub-child 1	2

In order to find all descendants of a node, there would need to be a query once for the parent, then once per node:

function getNodes($db, $parentId) {
    $sql = 'SELECT * FROM nodes WHERE parent_id = ?';
    $nodes = array()
    foreach ($db->fetchAll($sql, array($parentId), Zend_Db::FETCH_OBJ) as $node) {
        $node->subnodes = getNodes($db, $node->id);
        $nodes[] = $node;
    }
    return $nodes;
}

An alternative method is to use nested sets. This takes advantage of MySQL’s index range selection by using two columns to specify a range in which children exist:

id	name	left	right
1	parent	1	6
2	child 1	2	5
3	subchild 1	3	4

In this case all descendants of a node can be found between the parent node’s left and right columns.

Combining this with the adjacent list method, you can optimise the conversion of a MySQL result set into a multi-level array of objects:

function getNodes($db, $parentId) {
    $root = (object) array(
        'id' => $parentId,
    );
 
    $stack = array($root);
 
    $sql = 'SELECT  nodes.*
FROM nodes, nodes AS parent
WHERE nodes.left > parent.left AND nodes.left < parent.right
AND parent.id = ?
ORDER BY nodes.left';
 
    foreach ($db->fetchAll($sql, array($parentId), Zend_Db::FETCH_OBJ) as $node) {
 
        while ($node->parent_id != $stack[count($stack)-1]->id) {
            array_pop($stack);
        }
 
        $stack[count($stack)-1]->subnodes[] = $node;
        array_push($stack, $node);
    }
    return $root->subnodes;
}

Of course this method is at the expense of insert/modification of nodes in the table, which would need more code to do, but usually these happen much less often than selects in production environments.

Managing Hierarchical Data in MySQL

MySQL
PHP