Software Design Using C++
What is a Tree?
It is actually hard to define what we mean by a tree, but the idea can be seen in the following picture of an example tree showing course prerequisites. A course with a line to a second course below it is a prerequisite for the second course. The left to right ordering means nothing. We would still have the same tree if 330 were to the right of 310, for example.
At this point it is helpful to introduce some terminology. The root node (or simply root) is the node at the top of the tree diagram. In our case it is the one containing 110. Note that in computer science trees have their root at the top and branch out as you go down! This is simply the customary way of drawing trees.
The parent of a node is the one directly connected to it from above. In our example, 111 is the parent of 350, 230 is the parent of 310, 110 has no parent, etc. A child is a node connected directly below the starting node. Thus 350 is a child of 111, 310 is a child of 230, etc. It is simply the reverse of the parent relationship. Nodes with the same parent are called siblings. So, 221, 230, and 350 are siblings in our example.
An ancestor of a given node is either the parent, the parent of the parent, the parent of that, etc. In our example 110 is an ancestor of all of the other nodes in the tree. The counterpart of ancestor is descendant. For example, 310 is a descendant of 111, but 310 is not a descendant of 221.
The leaves of a tree (sometimes also called external nodes) are those nodes with no children. In our example, 221, 350, 330, and 310 are leaves. Note that leaves do not have to be at the lowest level in the tree. The other nodes of the tree are called non-leaves (or sometimes internal nodes).
Note that each node of a tree can be considered to be the root of a subtree consisting of that node and all its descendants. For example, the subtree rooted at 230 is as follows:
A branch is a sequence of nodes such that the first is the parent of the second, the second is the parent of the third, etc. For example, in the above tree, the sequence 111, 230, 310 is a branch. The length of a branch is the number of line segments traversed (which is one less than the number of nodes). The above branch has length 2.
The height of a tree is the maximum length of a branch from the root to a leaf. The above tree has height 3, since the longest possible branch from root to leaf is either 110, 111, 230, 310 or 110, 111, 230, 330, both of which have length 3.
A more formal definition of a tree can be written as follows: A (general) tree consists of a set of nodes that is either empty or has a root node to which are attached zero or more subtrees. Of course, a subtree itself must be a tree. Thus this is a recursive definition. Note, too, that there is no left to right ordering of the subtrees.
A binary tree is similar to a tree, but not quite the same. For a binary tree, each node can have zero, one, or two children. In addition, each child node is clearly identified as either the left child or the right child. Thus there is a definite left-right ordering of the child nodes. Even when a node has only one child, that child must be identified as either the left child or the right child. As an example, here is the binary expression tree for the expression
Reversing the left to right order of any siblings gives a different binary
tree. For example, reversing the subtrees rooted at
What is a Binary Search Tree?
A binary search tree is a binary tree in which the data in the nodes is ordered in a particular way. To be precise, starting at any given node, the data in any nodes of its left subtree must all be less than the item in the given node, and the data in any nodes of its right subtree must be greater than or equal to the data in the given node. Of course, all of this implies that the data items can be ordered by some sort of less than relationship. For numbers this can obviously be done. For strings, alphabetical ordering is often used. For records of data, a comparison based on a particular field (the key field) is often used.
The following is a binary search tree where each node contains a person's name. Only first names are used in order to keep the example simple. Note that the names are ordered alphabetically so that DAVE comes before DAWN, DAVID comes before DAWN but after DAVE, etc.
How does one create a binary search tree in the first place? One way to do so is by starting with an empty binary search tree and adding the data items one by one. The first item becomes the root. The next item is placed in either a left child or right child node, depending on the ordering. The third item is compared to the root, we go left or right depending on the result of the comparison, etc. until we find the spot for it. In each case we follow a path from the root to the spot where we insert the new item, comparing the new item to each item in the nodes encountered along the way, going left or right at each node so as to maintain the ordering prescribed above.
Traversals of Binary Trees
There are several well-known ways to traverse, to travel throughout, a binary tree. We will look at three of them. The first is an inorder traversal. This consists of three overall steps: traverse the left subtree (recursively), visit the root node, and traverse the right subtree (recursively). When we "visit" a node we typically do some processing on it, such as printing out the contents of the node. For example, an inorder traversal of the above binary search tree gives us the names in this order:
Note that we got the data back in ascending order. This will always happen when doing an inorder traversal of a binary search tree. In fact, a sort can be done this way. One first inserts the data into a binary search tree and then does an inorder traversal to obtain the data in ascending order. Some people call this a tree sort.
Now, how exactly did we get the above list of data? Essentially, we did so by following the recursive definition for an inorder traversal. First we traverse the left subtree of the root, DAWN. That left subtree is the one rooted at DAVE. How do we traverse it? By using the same three-step process. We first traverse its left subtree, the one rooted at BETH. Of course, we then have to go through the three steps on the subtree rooted at BETH. We begin by traversing its left subtree, but it is empty, so we visit the root, BETH. That is the first data item printed. Then we traverse the right subtree, the one rooted at CINDI. We use the three-step process on it, but since its subtrees are empty, we simply print the root, CINDI, which is the second item printed. We then back up to where we left off with the subtree rooted at DAVE. We have now traversed its left subtree, so we go on to print the root, DAVE, and then traverse the right subtree. Since the right subtree itself has empty subtrees, we end up just printing its root, DAVID. We continue in a similar fashion for the rest of this binary search tree.The other two traversals that we will study are the preorder traversal and the postorder traversal. They are very similar to the inorder traversal in that they consist of the same three steps, but reordered slightly. The preorder traversal puts the step of visiting the root first. The postorder traversal puts the step of visiting the root last. Everything else stays the same. Here is an outline of the steps for all three of our traversals.
First we traverse the left subtree of the whole binary tree. This is the
subtree rooted at
For practice try a preorder traversal of the same binary expression tree
The natural conjecture, then, would be that the inorder traversal of a
binary expression tree would produce the infix form of the expression,
but that is not quite true. With the above expression it is true. However,
try the infix expression
As you can verify, an inorder traversal of this binary expression tree
Uses of a Binary Search Tree
A binary search tree can be a very useful data structure. We have already seen that it can be used to create a sort routine. Such a sort routine is normally pretty fast. In fact, it is
Another use of a binary search tree is in storing data items for fast
lookup later. In the average case it is pretty fast to insert a new item
into a binary tree, because in the average case the data is fairly random
and the binary tree is reasonably "bushy". (In such a tree it is known that
the height of the binary tree is
For example, to look up GINA in the binary tree above, one compares GINA to DAWN, the root. Since GINA is larger, move to the right child, MIKE. Now compare GINA to MIKE. Since GINA is smaller, move to the left child GINA. Now compare GINA to the item in the node, also GINA, and we see that we have a match. All lookups are like this. One starts at the root and follows a path from the root to the matching item (or to a leaf if no match is ever found).
A Pointer-Based Implementation of a Binary Search Tree
It is typical to create a binary search tree by using objects as the nodes and pointers to connect the nodes. This is the same approach that we used in implementing linked lists. The following files show the details:
Each node has an
There are two data fields,
As for the class functions, there are private helping functions and the
publicly available ones. The code for these is in the
The private helping functions include one to manufacture a new node (containing
a given data item and left and right pointers), a function to free up the
space used by a node, a function to clear out the space used by the nodes
of the binary search tree (leaving it empty), a function to free up the space
used by a subtree, and a function to try to find a given item in a subtree.
The latter function assists the public
Now examine the code for the class functions. The constructor fills in a
The code for the
Why did we use a postorder traversal and not one of the other traversals? Postorder is used because we can only get rid of a root node after we have gotten rid of all of the nodes in its subtrees. If we got rid of a root node before we got rid of one of its descendants, we would be likely to lose any way to reach this descendant via our pointers.
The next several class functions are very simple, so let's jump down to
This function uses a pair of pointers,
The first two cases in this function are the stopping cases for the recursion.
In much the same way that we created a list-based table, we can create a table that is based on a binary search tree. Both data structures allow us to insert data and retrieve it later. Once again, the new class (