Search


Software Design Using C++



AVL Trees



The Concept


These are self-adjusting, height-balanced binary search trees and are named after the inventors: Adelson-Velskii and Landis. A balanced binary search tree has Theta(lg n) height and hence Theta(lg n) worst case lookup and insertion times. However, ordinary binary search trees have a bad worst case. When sorted data is inserted, the binary search tree is very unbalanced, essentially more of a linear list, with Theta(n) height and thus Theta(n) worst case insertion and lookup times. AVL trees overcome this problem.

Definitions


The height of a binary tree is the maximum path length from the root to a leaf. A single-node binary tree has height 0, and an empty binary tree has height -1. As another example, the following binary tree has height 3.

                     7

                  /     \

                3         12

              /         /    \

             2        10      20

                     /  \

                    9    11

An AVL tree is a binary search tree in which every node is height balanced, that is, the difference in the heights of its two subtrees is at most 1. The balance factor of a node is the height of its right subtree minus the height of its left subtree. An equivalent definition, then, for an AVL tree is that it is a binary search tree in which each node has a balance factor of -1, 0, or +1. Note that a balance factor of -1 means that the subtree is left-heavy, and a balance factor of +1 means that the subtree is right-heavy. For example, in the following AVL tree, note that the root node with balance factor +1 has a right subtree of height 1 more than the height of the left subtree. (The balance factors are shown at the top of each node.)

                                    +1
                                    30

                                /        \

                             -1            0
                             22            62

                           /             /     \

                         0             +1       -1
                         5             44       95

                                         \     /
                                          0   0
                                          51  77

The idea is that an AVL tree is close to being completely balanced. Hence it should have Theta(lg n) height (it does - always) and so have Theta(lg n) worst case insertion and lookup times. An AVL tree does not have a bad worst case, like a binary search tree which can become very unbalanced and give Theta(n) worst case lookup and insertion times. The following binary search tree is not an AVL tree. Notice the balance factor of -2 at node 70.

                                    -1
                                    100

                                /         \

                             -2             -1
                             70             150
                                      
                          /    \          /     \

                       +1       0       +1        0
                       30      80      130       180

                     /    \               \     
                    0      -1              0   
                   10      40              140
                          /
                         0
                        36 

Inserting a New Item


Initially, a new item is inserted just as in a binary search tree. Note that the item always goes into a new leaf. The tree is then readjusted as needed in order to maintain it as an AVL tree. There are three main cases to consider when inserting a new node.

Case 1:


A node with balance factor 0 changes to +1 or -1 when a new node is inserted below it. No change is needed at this node. Consider the following example. Note that after an insertion one only needs to check the balances along the path from the new leaf to the root.

                                     0
                                    40

                                /        \

                             +1            0
                             20            50

                               \         /     \

                                0       0        0
                                30     45       70

After inserting 60 we get:

                                    +1
                                    40

                                /        \

                             +1            +1
                             20            50

                               \         /     \

                                0       0       -1
                                30     45       70
                                               /
                                              0
                                              60

Case 2:


A node with balance factor -1 changes to 0 when a new node is inserted in its right subtree. (Similarly for +1 changing to 0 when inserting in the left subtree.) No change is needed at this node. Consider the following example.

                                    -1
                                    40

                                /        \

                             +1            0
                             20            50

                           /   \         /     \

                          0     0       0        0
                         10     30     45       70
                               /  \
                              0    0
                             22    32

After inserting 60 we get:

                                     0 <-- the -1 changed to a 0 (case 2)
                                    40

                                /        \

                             +1            +1 <-- an example of case 1
                             20            50

                           /   \         /     \

                          0     0       0       -1 <-- an example of case 1
                         10     30     45       70
                               /  \            /
                              0    0          0
                             22    32         60

Case 3:


A node with balance factor -1 changes to -2 when a new node is inserted in its left subtree. (Similarly for +1 changing to +2 when inserting in the right subtree.) Change is needed at this node. The tree is restored to an AVL tree by using a rotation.

Subcase A:

This consists of the following situation, where P denotes the parent of the subtree being examined, LC is P's left child, and X is the new node added. Note that inserting X makes P have a balance factor of -2 and LC have a balance factor of -1. The -2 must be fixed. This is accomplished by doing a right rotation at P. Note that rotations do not mess up the order of the nodes given in an inorder traversal. This is very important since it means that we still have a legitimate binary search tree. (Note, too, that the mirror image situation is also included under subcase A.)

                              (rest of tree)
                                     |
                                    -2
                                     P

                                /        \

                             -1           sub
                             LC           tree   
                                           of
                           /   \         height
                                            n
                        sub     sub
                        tree    tree      
                         of      of
                       height  height
                          n       n
                         /
                        X

The fix is to use a single right rotation at node P. (In the mirror image case a single left rotation is used at P.) This gives the following picture.

                              (rest of tree)
                                     |
                                     0
                                    LC

                                /        \

                             sub           P
                             tree         
                              of         /    \ 
                            height       
                              n       sub     sub  
                             /        tree    tree
                            X          of      of
                                     height  height
                                        n       n

Consider the following more detailed example that illustrates subcase A.

                                    -1 
                                    80

                                /        \

                             -1            -1
                             30            100

                           /   \         /

                          0     0       0
                         15     40     90 
                        /  \       
                       0    0      
                      10    20     

We then insert 5 and then check the balance factors from the new leaf up toward the root. (Always check from the bottom up.)

                                    -2
                                    80

                                /        \

                             -2            -1
                             30            100

                           /   \         /

                         -1     0       0
                         15     40     90 
                        /  \       
                      -1    0      
                      10    20     
                     /
                    0
                    5

This reveals a balance factor of -2 at node 30 that must be fixed. (Since we work bottom up, we reach the -2 at 30 first. The other -2 problem will go away once we fix the problem at 30.) The fix is accomplished with a right rotation at node 30, leading to the following picture.

                                    -1 
                                    80

                                /        \

                              0            -1
                             15            100

                           /   \         /

                         -1     0       0
                         10     30     90 
                        /      /  \
                       0      0    0
                       5     20    40

Recall that the mirror image situation is also included under subcase A. The following is a general illustration of this situation. The fix is to use a single left rotation at P. See if you can draw a picture of the following after the left rotation at P. Then draw a picture of a particular example that fits our general picture below and fix it with a left rotation.

                              (rest of tree)
                                     |
                                    +2
                                     P

                                /        \

                              sub         +1
                             tree         RC
                              of
                            height      /    \
                              n
                                      sub     sub
                                      tree    tree      
                                       of      of
                                     height   height
                                        n       n
                                                 \
                                                  X
Subcase B:

This consists of the following situation, where P denotes the parent of the subtree being examined, LC is P's left child, NP is the node that will be the new parent, and X is the new node added. X might be added to either of the subtrees of height n-1. Note that inserting X makes P have a balance factor of -2 and LC have a balance factor of +1. The -2 must be fixed. This is accomplished by doing a double rotation at P (explained below). (Note that the mirror image situation is also included under subcase B.)

                              (rest of tree)
                                     |
                                    -2
                                     P

                                /        \

                             +1           sub
                             LC           tree   
                                           of
                           /    \         height
                                            n
                       sub       -1
                       tree      NP
                        of      /  \
                      height   sub  sub
                        n     tree  tree
                               n-1   n-1
                               /
                              X

The fix is to use a double right rotation at node P. A double right rotation at P consists of a single left rotation at LC followed by a single right rotation at P. (In the mirror image case a double left rotation is used at P. This consists of a single right rotation at the right child RC followed by a single left rotation at P.) In the above picture, the double rotation gives the following (where we first show the result of the left rotation at LC, then a new picture for the result of the right rotation at P).

                              (rest of tree)
                                     |
                                    -2
                                     P

                                /        \

                             -2           sub
                             NP           tree   
                                           of
                           /    \         height
                                            n
                        0        sub
                       LC        tree
                     /    \       n-1
                   sub    sub
                  tree    tree
                   of      n-1   
                 height    /
                   n      X

Finally we have the following picture after doing the right rotation at P.

                              (rest of tree)
                                     |
                                     0
                                    NP

                                /        \

                              0            +1 
                             LC             P
                                         
                           /    \         /    \
                                         
                        sub     sub      sub   sub
                       tree     tree    tree   tree
                        of       n-1     n-1    of
                      height     /            height
                        n       X                n

Consider the following concrete example of subcase B.

                                    -1
                                    80

                                /        \

                              0             0
                             30            100

                           /   \         /     \

                         -1     0       0       0
                         20     50     90      120
                        /      /  \
                       0      0    0 
                      10     40    60

After inserting 55, we get a problem, a balance factor of -2 at the root node, as seen below.

                                    -2
                                    80

                                /        \

                             +1             0
                             30            100

                           /   \         /     \

                         -1     +1      0       0
                         20     50     90      120
                        /      /  \
                       0      0    -1
                      10     40    60
                                  /
                                 0
                                 55

As discussed above, this calls for a double rotation. First we do a single left rotation at 30. This gives the following picture.

                                    -2
                                    80

                                /        \

                             -1             0
                             50            100

                           /   \         /     \

                        -1      -1      0       0
                        30      60     90      120
                       /  \    /
                      -1   0  0 
                      20  40  55 
                     /          
                    0           
                   10           

Finally, the right rotation at 80 restores the binary search tree to be an AVL tree. The resulting picture is shown below.

                                     0
                                    50

                                /        \

                             -1             0
                             30            80 

                           /   \         /     \

                         -1     0      -1       0
                         20     40     60      100
                        /             /       /   \
                       0             0       0     0
                      10             55      90   120

Example Program

This example program inserts some characters into an AVL tree, uses a print routine to see that the AVL tree is correct, and tries out other features such as the copy constructor, the Find function, etc.

The class AVLClass is derived by public inheritance from the class BSTClass. Since we have already implemented binary search trees and AVL trees are a form of specialized binary search tree, this allows considerable code reuse. The public functions provided are a constructor, a copy constructor, a destructor, an overloaded assignment operator, and a function to insert a new item. The following public functions are inherited from BSTClass and hence are also available: NumItems, Empty, Find, and Print. AVLClass also has numerous private functions to carry out the various rotations and the like.

Note that the Print function prints the binary search tree sideways, as it is much easier to do so. Note, too, that AVLClass is named as a friend of BSTClass so that it can have direct access to the private data fields of the latter class. This makes the coding simpler. There are no new data fields added in the derived class.

AVLNodeClass is derived by public inheritance from BSTNodeClass. In the BSTNodeClass note that the data fields are protected fields, instead of the usual private fields. (Public fields are also possible but rarely used since they violate the principle of information hiding.) A protected field is directly accessible in a publicly derived class, but is not accessible elsewhere, such as from our application program in the main function. This means that we do not need to make AVLNodeClass a friend of BSTNodeClass in order for the derived class to have direct access to the data fields. The AVLNodeClass only needs to add one data field to the inherited ones: a field to hold the balance number for this node.

One messy feature of the above example is that in many places a cast is required so that a pointer to a BSTNodeClass node is seen as a pointer to an AVLNodeClass node. For example, the CopySubtree function in AVLClass contains:


NewLeftPtr = CopySubtree(reinterpret_cast <AVLNodePtr> (Current->Left))

This is needed because the data field Left is a pointer to the wrong type of node, a BSTNodeClass node. Since CopySubtree expects a pointer to an AVLNodeClass node, we need the cast. One type of node is derived by inheritance from the other and is almost the same, but to the compiler they are different types. So a pointer to one type of node is not of the same type as a pointer to the other type of node. Hence we need the cast. Similar casts occur in the application code found in the main function, such as:


Result = reinterpret_cast <AVLNodePtr> (AVLTreeA.Find('E'))

This one is needed since we are using the inherited Find function on an AVL tree, which belongs to the derived class. Our Find function returns a pointer to the wrong type of node, so we fix it up with the cast. Code reuse helps a lot, but here it does add the nuisance of a cast.

Efficiency Concerns


Here is a summary of what Tenenbaum and Augenstein say about efficiency. See their text (noted below) for more details.

The maximum height of an AVL tree is 1.44 * lg n, which is an O(lg n) function. This means that in the worst possible case, a lookup in a large AVL tree needs no more than 44% more comparisons than a lookup in a completely balanced tree. Even in the worst case, then, AVL trees are efficient; they still have O(lg n) lookup times. On average, for large n, AVL trees have lookup times of (lg n) + 0.25, which is even better than the above worst case figure, though still O(lg n). On average, a rotation (single or double) is required in 46.5% of insertions. Only one (single or double) rotation is needed to readjust an AVL tree after an insertion throws it out of balance.

References

  • Data Structures Using Pascal, 2nd ed. Aaron M. Tenenbaum, Moshe J. Augenstein. Prentice-Hall (1986). See page 461 and following.
  • Data Structures with C++. William Ford, William Topp. Prentice-Hall (1996). See page 713 and following.

Related Item


Binary Trees

Back to the main page for Software Design Using C++

Authors: Br. David Carlson and Br. Isidore Minerd
Last updated: August 27, 2009