Tuesday, July 12, 2016

[Algorithm] Lowest Common Ancestor of a Binary Tree

Lowest Common Ancestor of a Binary Tree

Problem Statement:

Given a binary tree, find the lowest common ancestor (LCA) of two given nodes in the tree.

According to the definition of LCA on Wikipedia: “The lowest common ancestor is defined between two nodes v and w as the lowest node in T that has both v and w as descendants (where we allow a node to be a descendant of itself).”

        _______3______
       /              \
    ___5__          ___1__
   /      \        /      \
   6      _2       0       8
         /  \
         7   4
For example, the lowest common ancestor (LCA) of nodes 5 and 1 is 3. Another example is LCA of nodes 5 and 4 is 5, since a node can be a descendant of itself according to the LCA definition.


Solution 1: DFS based solution

Time complexity: O(n) for one query, O(mn) for m queries

The basic idea of this approach is to search for the two nodes in a DFS fashion. The two different nodes may have different path from root to itself and at some point these two paths will overlap. One native approach will be recording the two paths and compare them to find the last common node.

A better way is to check that on the fly while we are DFS traversing the tree. Instead of keeping the record of every node in the path leading to each one of the two, we actually just need to know whether anyone of the two nodes is in the path or not. 

Using the tree example in the problem statement, suppose we are to find LCA of node 6 and 4. The path leading to 4 is 3->5->2->4 and the path leading to 6 is 3->5->6. When we are traversing in DFS fashion and at node 5, we can easily find out if 6 and 4 exists in its left and right sub-tree. If they exist in left and right sub-trees respectively like at node 5, node 5 is the LCA. If for example we are at node 2, we know that 4 exists in its right sub-tree but 6 doesn't in neither of left or right, we just return back the found node to its parent node. (Not sure if it's clear enough, I will try to come up with more illustration later)

Code Implementation:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
/**
 * Definition for a binary tree node.
 * struct TreeNode {
 *     int val;
 *     TreeNode *left;
 *     TreeNode *right;
 *     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
 * };
 */
class Solution {
public:
    TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) {
        return dfsHelper(root, p, q);
    }
    
    TreeNode* dfsHelper(TreeNode* root, TreeNode* p, TreeNode* q) {
        if (root == NULL || root == p || root == q) return root;
        
        TreeNode* left = dfsHelper(root->left, p, q);
        TreeNode* right = dfsHelper(root->right, p, q);
        
        if (left == NULL && right == NULL) {
            return NULL;
        } else if (left && right) {
            return root;
        }
        
        return left ? left : right;
    }
};


Solution 2: Levelization based solution

Time complexity: O(n + logn) for one query, O(n + mlogn) for m queries if tree is immutable, O(m(n + logn)) for m queries if tree is mutable.

Solution 1 is actually good enough in terms of runtime and simplicity. Actually there's another approach which was actually my first solution for this problem, however it's kind of lengthy than the solution 1, but I think it is actually also a very good one.

This approach is centered around tree levelization, namely we are going to levelize each node with a marking of level number starting from 0 at the root. One can easily do this by a BFS traversal. Besides keeping a level number we also keep the parent node for each of the node we visited (This may not be necessary, but for efficiency we better keep it).

Then the algorithm is very simple, the following describes how we find LCA:

1) First we find out the levels of each node and their parent node, keep them in a hash table.

2) Given the two nodes, we lookup their level in the hash table.

3a) If they are of same level, trace back the two node simultaneously until they merge at one parent

3b) If they are not at same level, trace back the node with larger node level until the level is the same and then do 3a) step

This algorithm runs slower than solution 1 in the testcases of LeetCode OJ because it performs worse if the tree is mutable.

If the tree is immutable each lookup is only in O(logn) order. And another upside of this approach is that the concept of levelization can be used in directed acyclic graph (DAG) to find the LCA. (This algorithm is indeed used in some of the industrial software as well)

Code implementation:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
/**
 * Definition for a binary tree node.
 * struct TreeNode {
 *     int val;
 *     TreeNode *left;
 *     TreeNode *right;
 *     TreeNode(int x) : val(x), left(NULL), right(NULL) {}
 * };
 */
class Solution {
public:
    TreeNode* lowestCommonAncestor(TreeNode* root, TreeNode* p, TreeNode* q) {
        if (root == NULL) return NULL;
        p_level = -1;
        q_level = -1;
        parent_info.clear();
        levelizeTree(root, p, q);
        
        if (q_level != -1 && p_level != -1) {
            if (q_level > p_level) {
                q = goLevelUp(q, (q_level - p_level));
            } else if (q_level < p_level) {
                p = goLevelUp(p, (p_level - q_level));
            }
            
            while (q != p) {
                q = parent_info[q];
                p = parent_info[p];
            }
            
            return q;
        } else {
            return NULL;
        }
    }
    
    inline void levelizeTree(TreeNode* root, TreeNode* p, TreeNode* q) {
        vector<TreeNode*> currlvl;
        int level = 0;
        currlvl.push_back(root);
        parent_info[root] = NULL;
        
        while(!currlvl.empty()) {
            vector<TreeNode*> nextlvl;
            for (int i=0; i<currlvl.size(); i++) {
                TreeNode* node = currlvl[i];
                
                if (node->left) {
                    parent_info[node->left] = node;
                    nextlvl.push_back(node->left);
                }
                
                if (node->right) {
                    parent_info[node->right] = node;
                    nextlvl.push_back(node->right);
                }
                
                if (p == node) p_level = level;
                if (q == node) q_level = level;
            }
            
            currlvl = nextlvl;
            level++;
        }
    }
    
    inline TreeNode* goLevelUp(TreeNode* node, int level) {
        while (level > 0 && node != NULL) {
            node = parent_info[node];
            level--;
        }
        return node;
    }
    
    int p_level;
    int q_level;
    std::unordered_map<TreeNode*, TreeNode*> parent_info;
};




No comments:

Post a Comment