Software Design Using C++
Linked Lists
What Is a Linked List?
A linked list is a sequence of items that can only be accessed in order
(that is, from first to last). Each data item is stored in a node
which also contains a pointer to the next node. Thus, a linked list can be
pictured as shown below. Note that each node is pictured as a box, while each
pointer is drawn as an arrow. A NULL pointer is used to mark the end of the list.
Working with linked lists can be fun because they are easy to draw, yet
give a useful way to organize data. In certain places lists can be tricky,
but a good drawing can go a long way toward clarifying such a situation.
Notice that the above list gives the names of people in alphabetical order.
You might wonder how this compares with storing an ordered list of names
in an array. The key difference is that an array gives "random access"
in that you can directly access any data item by its index number. This
allows one to use binary search instead of the slower sequential search,
for example. A linked list gives only sequential access to the data, so
that only sequential search is available.
However, there are other comparisons that could be made between the two.
Although, the array method allows the faster search to be used, if we want
to insert a new item after the one found, we are in trouble! There is
probably no room in the array to insert a new item at this point. Instead,
all items to the right of this location have to be slid over to the next
array location. This is a slow process but does make room to insert the
new item. As we will see more clearly when we code linked lists later,
it is fairly easy to insert a new item into an ordered linked list. In fact,
it is just a matter of changing a couple of pointers.
Deletion of an item from an ordered list is also a simple matter of changing
a pointer, whereas in an ordered array, all of the data to the right has to
be shifted left to fill in the hole left upon deletion. Thus, once again,
the linked list is faster and more flexible. In general, a linked list is
probably the better method if you are going to dynamically insert and delete
items a lot. (A linked list is one example of a dynamic data structure.
This means that it can easily grow, contract, and change
as needed. An array is a static data structure.)
Another comparison could be made based on the amount of space that the two
methods use. With the array scheme, you would need to set the size of the
array in advance, overestimating the amount of space needed. However, the linked
list method also wastes space, in that it needs room in each node for a pointer.
In general it is not possible to say which method wastes more space.
A Pointer Implementation
It is very common to implement a linked list using pointers and either
structures or objects for the nodes. We will choose objects for the nodes
and set up the class ListNodeClass as shown below.
class ListNodeClass
{
private:
ItemType Info;
ListNodeClass * Next;
public:
// First, the constructor:
ListNodeClass(const ItemType & Item, ListNodeClass * NextPtr = NULL):
Info(Item), Next(NextPtr)
{
};
void GetInfo(ItemType & TheInfo) const;
friend class ListClass; // very convenient to allow this
};
typedef ListNodeClass * ListNodePtr;
|
Note that the two data fields are private, as usual. The type ItemType
would have to be set up to handle whatever data we want to hold in
the list. The Next field is a pointer to another
ListNodeClass object, since the * refers to whatever
Next points to.
The most important function is the constructor. It takes a data item and
a pointer as its parameters and uses them to fill in the fields of the
object being constructed. Note that the pointer parameter has a default
value of NULL . Recall that the Info(Item), Next(NextPtr)
means to copy Item into Info and
NextPtr into Next . The GetInfo
function simply sends back a copy of the data from the object's Info field.
The ListNodeClass is also told that it has a friend class named
ListClass . Functions of the friend class have direct access
to the private data fields (Info and Next ) of
ListNodeClass objects. It is not necessary to allow this
direct access, but it is easier than the alternative: using get and set
class functions every time a ListClass function wants to
look at or change the private fields of a ListNodeClass object.
Some people refuse to use friend classes, however, because they give more
access than is really necessary, thus partially defeating the information
hiding that classes provide.
Finally, note that the above code sets up ListNodePtr as a
convenient type name for a pointer to a node. This means that we
won't have to use ListNodeClass * from here on in the rest of
the code, but can instead use ListNodePtr .
Next, let's set up a class declaration for list objects. Each list will have
three private data fields: a pointer to the first node on the list, a pointer to
the last node on the list, and a count of how many data nodes are in the list.
The Front pointer will point to what is called a dummy node.
This is a node containing no data. It is easier to code the insertion and deletion
operations if the first node on the list contains no data. Otherwise, the code to
insert a new first node or to delete the current first node must be different from
the code to handle the normal cases.
You might ask why we have included a pointer to the last node of the list.
This changes our original simple drawing of a linked list. However, it is
sometimes useful to have a quick way to get to the end of a list, so we
will include this feature. Similarly, the Count field is
a convenience. Whenever we need to know how many data items are in a list,
we just use this field instead of traversing the list and counting data nodes.
The following is the class declaration for ListClass . Helping
functions, which are only used by other ListClass functions
are made private, so that they cannot be accessed otherwise. (This is another
instance of information hiding.) There is a comment that the three
data fields are sometimes made into protected fields. The purpose of this is
that if we ever created a subclass (using inheritance), functions of the
subclass would be able to directly access these fields. With private fields,
the fields would be inherited, but not directly accessible by the subclass.
Since we do not plan to inherit a new class from ListClass ,
this is not a concern.
class ListClass
{
private:
ListNodePtr GetNode(const ItemType & Item,
ListNodePtr NextPtr = NULL);
void FreeNode(ListNodePtr NodePtr);
void ClearList(void);
// Next 3 are sometimes made into protected fields:
ListNodePtr Front, Rear;
int Count;
public:
// constructor:
ListClass(void);
// destructor:
~ListClass(void);
int NumItems(void) const;
bool Empty(void) const;
void InsertFront(const ItemType & Item);
void InsertRear(const ItemType & Item);
void InsertInOrder(const ItemType & Item);
ItemType RemoveFront(void);
ListNodePtr Find(const ItemType & Item) const;
};
|
The public functions above provide the usual linked list operations. The
above class declaration thus shows that, from an abstract data type
point of view, the list operations are list creation, list destruction,
getting the number of items in the list, checking if the list is empty,
inserting an item at the front of the list, inserting an item at the rear
of a list, inserting an item into an ordered list, removing an item from
the front of a list, and finding an item in the list. Certainly
other operations are possible. For example, why not have an operation to
remove the item from the rear of a list? (One practical reason not to do
so is that it is not easy to adjust the Rear pointer after deleting the
last node and to get a NULL into the Next field of the new last node.
The only way to do so appears to be traverse the list from the first to the
last node, before removing the final node, in order to get a pointer to the
next to the last node!)
A ListClass object, named List is pictured below. Functions
are not shown in the objects in order to keep the diagram from becoming
cluttered, but if they were, the private functions should be shown as
completely boxed in, whereas the public functions should be shown in open
boxes to indicate that they are available for public use. Note that a
ListClass object doesn't really contain the data in the list.
Rather, the data is contained in node objects that are outside of the
List object itself. The List object
simply contains pointers to the first and last nodes of the list.
The complete code files for this example are given below. The last file
contains a simple test program to try out several linked list operations.
This test program is straightforward and will not be commented upon here.
Read through these files to get the general idea, then go on to the notes that follow.
In the next to the last file above, you will note that the constructor
initializes a ListClass object to contain a Count
of zero, and to have Front and Rear pointers that
point to a dummy node. Of course, this means that it has to create a
dummy node. This is handled by the helping function GetNode .
In turn, GetNode uses new with the
ListNodeClass constructor to dynamically allocate a new node.
Since a ListClass object points to other objects outside of
it, it is important to have a destructor to clean up the space used by
these ListNodeClass objects. The destructor uses the
ClearList function to reclaim the space used by the data nodes
and then calls FreeNode(Front) to get rid of the dummy node.
The ListClass object itself is automatically removed.
The FreeNode function is trivial in that it just uses
delete to remove the node pointed to. The ClearList
function, however, is more interesting. Let's look at it in more detail since
it is a good example of a list processing function.
/* Given: Nothing.
Task: Clear out all nodes on the list except the dummy.
The list is thus set to an empty list.
Return: Nothing directly, but the implicit ListClass object is modified.
*/
void ListClass::ClearList(void)
{
ListNodePtr Current, Temp;
Current = Front->Next;
while (Current != NULL)
{
Temp = Current;
Current = Current->Next;
FreeNode(Temp);
}
Front->Next = NULL;
Rear = Front;
Count = 0;
}
|
In essence the idea is to traverse the list, freeing up each data node as it
is encountered. The Current pointer is used to point to the
successive nodes on the list. Note that it is initialized to point where
the dummy node's Next field points, that is, to point to the
first data node. Then we have a while loop. Inside of it,
the Current pointer is copied into Temp ,
Current is advanced by placing in it
a copy of the Next field of the node we have been pointing at,
and FreeNode is used to free up the node that Temp
points to. Notice that we can't simply free the node that Current
points to each time around the loop. If we did that we would lose our pointer
into the linked list and have no way to advance to the next node. The loop
stops when Current is advanced to pick up the NULL
value marking the end of the list. After the loop ends, the
ListClass object is adjusted so that the Front and
Rear pointers are NULL and the Count is zero.
In other words, the list object is changed into the typical picture of an empty list.
Let's look at a few more of the list manipulation functions, starting
with InsertFront .
/* Given: Item A data item.
Task: To insert a new node containing Item at the front of the implicit
ListClass object.
Return: Nothing directly, but the implicit object is modified.
*/
void ListClass::InsertFront(const ItemType & Item)
{
ListNodePtr NodePtr;
NodePtr = GetNode(Item, Front->Next);
Front->Next = NodePtr;
if (Count == 0)
Rear = NodePtr;
Count++;
}
|
It is clear that the GetNode function is being used to create
a node to hold the new data item. The second parameter to this function
is used to initialize the Next field of this new node so that
it points to the first data node. At this point we have the following picture:
To finish things off, the function adjusts the Next field
of the dummy node so that it points to the new node. The Count
field for the list also needs to be incremented. That finishes things for
the example in our picture.
Unfortunately, it is not always quite that simple. There is a special case
lurking in the background. What happens if we insert at the front of an
empty list? Remember that an empty list looks like this:
The steps we previously did to insert a new node at the front of the list
are still fine, but they leave one thing out: they do not adjust the
Rear pointer, currently pointing at the dummy node, to point to the
newly inserted node. That is handled by the if in the code above.
You may not have noticed it, but the ListClass functions are
directly accessing the private data fields of the ListNodeClass
objects. That would not normally be allowed, but it is legal here because
we made ListClass to be a friend of ListNodeClass .
Next, let's look at the InsertInOrder function. Note that it
assumes that the list is already in order.
/* Given: Item A data item.
Task: Insert Item into a new node added to the implicit ListClass object
(assumed to already be in ascending order based on the Info
field) so as to maintain the ascending order.
Return: Nothing directly, but the implicit object is modified.
*/
void ListClass::InsertInOrder(const ItemType & Item)
{
ListNodePtr Current, Last, Temp;
bool Proceed;
Temp = GetNode(Item);
Current = Front->Next;
Last = Front;
Proceed = true;
while (Proceed && (Current != NULL))
if (Item > Current->Info)
{
Last = Current;
Current = Current->Next;
}
else
Proceed = false;
Last->Next = Temp;
Temp->Next = Current;
if (Current == NULL) // item was inserted at rear of list
Rear = Temp;
Count++;
}
|
Suppose that we have an ordered list containing 4, 6, 9, and that we want
to insert 8 using the above function. Note that it moves a pair of pointers,
Current and Last , through the list, with
Last always pointing to the node before the one that Current
points to. Current starts out by being initialized to
point to the first data node, while Last points to the dummy
node. Each time around the while loop a check is made to see
if the 8 is greater than the item contained in the node that Current
points to. If so, the pointers are advanced so that each points to
the next node. If not, then the place to insert has been found, so the
loop is stopped by setting the Proceed flag to false .
The picture at the point where the insertion location
has been found is shown below:
All that remains to be done is to link in the new node, up the Count
by 1, and check for any special cases. It turns out that the only
unique case is when the new item ends up going at the very end of the list. In such
a case the Rear pointer must be adjusted (as shown in the code above).
The RemoveFront function is simpler. However, it exits with
an error message if there is no data item to remove. If not, it sets up
NodePtr to point to the first data node and extracts the
value from this node's Info field. It then adjusts the
Next field of the dummy node so that it points to the second data
node, skipping around the first one. The Count is decremented
and the first data node is freed up. Not surprisingly, there is a special
case. This occurs when the list only has one data node. After removing it,
the Rear field must be adjusted to point to the dummy node.
The Find function is similar to the ClearList and
InsertInOrder functions, but is simpler that the latter.
Therefore this function will be left to the reader to trace. It is suggested
that one draw pictures like those shown above to aid in tracing this function.
An Array Implementation
This type of implementation of a linked list is sometimes useful. It suffers
from the usual array problem of having to estimate the needed size of the
array in advance. There are some ways around this problem (such as using
a vector instead of an array, allocating space for the array dynamically,
or even allocating space for additional arrays as needed), but they have
their own complications. We will assume here that the maximum size needed
can be foretold. The array will hold a fixed number of records (structures),
each of which represents a node.
In this implementation, a linked list object contains an array of nodes,
perhaps called NodeArray , and four integers fields labelled
Avail , Front , Rear ,
and Count . Each
record in NodeArray contains an Info field
and an integer Next field. Avail ,
Front , Rear ,
and each Next field all contain fake pointers. What we use as
a fake pointer to a node is the array index of that node. To represent the
NULL pointer we can use an illegal array index, such as
-1. We will not start our list with a dummy node, though
it is possible to do so by making a few small changes.
The picture of an empty list is as follows:
The idea is that Front points to the first node of the list.
Since it is -1, our imitation NULL pointer,
this means that we have an empty list. The Count of zero
also indicates the same thing. Rear is -1, indicating that
there is no node at the rear of the list (as the list is empty).
The Avail field is intended
to point to a list of free nodes. The programmer must arrange to manage
free space within the array with this method! In out picture above, all
of the nodes are free and are linked together in a list with
Avail containing zero and thus
pointing at the first free node, NodeArray[0] . Then this
node has a Next value of 1, showing that it points to
NodeArray[1] , etc. Finally, NodeArray[4].Next
contains our imitation NULL pointer to mark the end of the list.
After items have been dynamically inserted into the list and perhaps items
have been deleted as well, we might end up with a picture such as the following.
Can you read what is on this list? All you do is follow the (fake)
pointers. Since Head contains 1, the first node contains
the string "Ann". This node contains a pointer to node 0, so the second
node contains "Sam". This node contains a pointer to node 4, so the third
node contains "Mary". Finally, this node has -1 in the Next
field, marking the end of the list. So the three data items in the list are "Ann",
"Sam", and "Mary". If you need a quick way to get to the last node on the
list, you use the 4 found in the Rear field.
The list of available (free) nodes contains node 2 and node 3.
Note that such a list object always contains two linked lists: the list of
data nodes and the list of available (free) nodes. It is left as an exercise
to the reader to implement a linked list class using this array-based scheme.
Variations on Linked Lists
Many variations on the basic idea of a linked list are possible. A few of
these are described in the sections that follow.
Doubly-Linked Lists
A doubly-linked list has two pointer fields in each node, often named
Left and Right . If the linked list is pictured
as a horizontal sequence of nodes, the Right pointers are used
to traverse the lists from left to right (that is, from beginning to end).
However, the Left pointers can be used to back up to the left
whenever that is desired. The following is one possible picture of a
doubly-linked list. Sometimes dummy nodes are added at the front and rear as well.
One can thus traverse a doubly-linked list in both the forward
and backward directions. This can make coding simpler, for example, when
writing a sequential search function to find a given node in the list, with
the plan that we will use this to find the node after which to insert a new
node. With a singly-linked list we needed to return both a pointer to the
node found and a pointer to the previous node. With a doubly-linked list
it suffices to return a pointer to the matching node, since we can always
follow its Left field to get at the previous node.
Circular Lists
A (singly-linked) circular list has the Next field of the last
node point to the first node of the list. This can be useful when there is
a need to access data in a circular (wrap-around) fashion. It is possible
to have a doubly-linked circular list as well. Below is a picture of a
singly-linked circular list.
In the drawing above, the list object contains only one pointer field, a
pointer to the node at the rear of the list. The convention used in this
scheme is that the next node is the first node of the list, which is followed
by the second node, etc. until we wrap around to get the last node once again.
Multi-Linked Lists
Nodes can be on several lists simultaneously. All that is needed is for
each node to have a pointer field for each list. Thus the first pointer
field in each node could be used for the first list, the second pointer
field in each node could be used for the second list, etc. The curious reader
is encouraged to draw a picture of an example.
More advanced information on lists can be found under
A List-Based Table in the advanced section of
these Web pages. That example uses inheritance and so is considerably
more involved than what we have done here.
Related Items
Back to the main page for Software Design Using C++
|