Software Design Using C++



Records (Structures)



Basics

A record, commonly called a structure in C++, is a collection of related data items. Unlike with an array, the data items in a record can be of different types. Also, with a record the data items are identified by field name, whereas with an array the data items are identified through the use of an index number.

For example, to create a simple record holding a part number and a price, we might use a typedef to create a record type (here called PartType) as follows:


typedef struct
   {
   int PartNumber;
   float Price;
   } PartType;

Another way to create the same type is the following:


struct PartType
   {
   int PartNumber;
   float Price;
   };

Dot notation is used to access a given field of a record. For example, the following assigns values to the two fields of a record named Part:


PartType Part;

Part.PartNumber = 123;
Part.Price = 9.95;

Similarly one would use the dot notation to look up the value in a field of a record. Thus we might use cout << Part.Price; to print the price contained in record Part.

One can picture a record as a box subdivided into sections for the various fields. The Part record above could be pictured as follows:

[drawing of a record]

Note that it is possible to assign one record into another record variable of the same type. This merely copies all of the data from the fields of the record on the right hand side into the corresponding fields of the record variable on the left hand side. Thus we could use something like the following:


PartType Temp;
Temp = Part;
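Record assignment is what makes the familiar three-step swap work for whole records. The following is only a small sketch: SwapParts is a hypothetical helper that does not appear in the example files, and PartType is repeated so that the fragment stands on its own.

```cpp
// PartType as defined earlier in this section.
struct PartType
   {
   int PartNumber;
   float Price;
   };

// SwapParts is a hypothetical helper, not part of the original examples.
// Each assignment copies every field of the record on the right-hand side
// into the record on the left-hand side.
void SwapParts(PartType & First, PartType & Second)
   {
   PartType Temp;

   Temp = First;
   First = Second;
   Second = Temp;
   }
```

After a call such as SwapParts(A, B), record A holds the part number and price that B held before the call, and vice versa.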

A Simple Working Example


Next, let's look at a complete example using records. This example requires you to create a project containing the following three files. Note that the #pragma once directive at the top of a header file is a way to prevent that header from being included more than once into the same source file.

We leave it to the reader to look at these three files and to try them out. Here we go on to a somewhat more complex version of the same program and describe it in detail.

A Somewhat More Involved Working Example


As an aside, note that it is often useful to divide a project up into several source code files. A division like that shown above is fairly common. The header file usually contains things like constants, typedefs, and function prototypes. Such a file is named so as to have a .h extension. The .cpp files contain C++ functions. Of course, only one of the .cpp files can contain a main function. See the two links below for outlines of a typical .h file and a typical .cpp file:
  • proto.h
    An outline of a typical header file.
  • proto.cpp
    An outline of a typical .cpp file containing a main function.
When you examine the employee.h file you will see that it sets up a type called EmployeeType for records containing a first name, a last name, an ID number, and a wage rate. The code for this is repeated below for convenience. Note that the FirstName and LastName fields are arrays of characters, that is, simple strings.


struct EmployeeType
   {
   char FirstName[NameMax];
   char LastName[NameMax];
   int ID;
   float WageRate;
   };

In the employee.cpp file are the functions that have been written for dealing with employee records of the type just shown. The ReadEmployee function is used to read data from the keyboard and place it in an employee record. The following shows the overall comments for this function. Note that the Employee formal parameter is a reference parameter. This is so that data can be sent back out of the function via this parameter. Also note that a return code is sent back in the function name. This code will be one of the constants which are set up in employee.h.


/* Given:  Nothing.
   Task:   To read from the keyboard data for one employee.
   Return: Employee   A structure containing the data just read.
           In the function name return FailFlag or OKFlag
           to indicate how input behaved.
*/
int ReadEmployee(EmployeeType & Employee)

In the actual code for the ReadEmployee function note that using something like cin >> Employee.LastName; is of limited usefulness in that it stops placing data in LastName at the first whitespace character. If this is a newline (from pressing the Enter key), great. However, it could be that the user of the program entered Barker III as the last name. In this case only Barker gets placed in the LastName field of the Employee record.

Also notice that the fail function is used after every data entry command to see if the user pressed CTRL z to indicate end of data (EOF). If so, we immediately exit from the ReadEmployee function by doing return FailFlag. (Of course, in Linux CTRL d is used instead of CTRL z to indicate the end of data entry. If you accidentally use CTRL z in Linux, it will suspend your program. In such a case you can most likely bring your program back into the foreground with the command fg 1 and then continue to run your program.)

Another function in the same file is PrintEmployee. As you can see below it is passed a record of employee information and prints its data on the screen. Since the Employee parameter is only being used to send data into the function you might wonder why it is set up to be a reference parameter. The answer is that a record can contain a fairly large amount of data. It is time-consuming to copy all of it, and it takes extra space on the run-time stack for this copy. So for the sake of efficiency we typically make record parameters to be reference parameters. Note, however, that const is used in front of this parameter. This indicates that we will not change anything in the Employee parameter, so that no data is sent back in this parameter.


/* Given:  Employee   A structure containing info on one employee.
   Task:   To print the data from Employee on the screen.
   Return: Nothing.
*/
void PrintEmployee(const EmployeeType & Employee)

The oneemp.cpp file contains the main function. It is a simple test program that reads in data for one employee into a record and then prints the data out onto the screen. It should be fairly clear how it works.

Using an Array of Records

An array of records is useful for temporarily storing a sequence of records. The two multiple source file projects below use just such an array to hold the employee data read in for a sequence of employees. Each program then finds and prints the average wage rate for the employees and uses the stored data to print the employee records for those employees who earned above the average. As above we show a simple version first and then a somewhat more complex one.

The Simpler Array of Records Example

As before we let the reader examine the code in this example and proceed to the more complex example, where we take a closer look.

The More Complex Array of Records Example

The first two files listed above are the same as the files by the same name used in the project whose main function was in oneemp.cpp. The new files include emparray.h, a header file containing a constant for the maximum number of employees that the array will be able to contain as well as a typedef setting up EmpArrayType as a type for such an array of records.


const int EmpMax = 50;   // Max number of employees
typedef EmployeeType EmpArrayType[EmpMax];

An array of type EmpArrayType, perhaps named EmpArray, can be pictured as a numbered sequence of records as shown in the picture below. This picture assumes that the array has been filled with data, but in some cases we might only fill part of it. To keep track of how much data we have in the array is, of course, the purpose of the EmpCount variable that you see in the main function of this program. For example, if EmpCount = 6, then you know that EmpArray has data at indices 0 through 5.

[drawing of array of employee records]

In the emparray.cpp file is found the code for the following LoadArray function. Note that it returns data via both parameters and that it sends an integer code back in the function name.


/* Given:  Nothing.
   Task:   To read a sequence of employee data from the keyboard,
           storing it in EmpArray.
   Return: EmpArray   The array of employee data.
           EmpCount   The number of employees just entered.
           In the function name return OKFlag or 
           TooMuchDataFlag to indicate how this function ended.
*/
int LoadArray(EmpArrayType EmpArray, int & EmpCount)

Internally, the LoadArray function calls our existing ReadEmployee function each time that it needs to read in the data for one employee. Note that the usual while loop pattern is used in placing this data into the array, with care taken to be sure that we do not run off of the end of the array (a common source of run-time errors). When this loop ends an if is used to decide how it ended. If ReadEmployee was unable to read data (undoubtedly because the user entered CTRL z to quit, or CTRL d in Linux), we use cin.clear() to clear the internal flags that are tested by the fail function. If we did not use this clear function we would be unable to get further input from the keyboard. This would be because we have already told the program that we are at the end of the data (EOF) by pressing CTRL z. (To see that we do actually do further input, check the main function in the aboveavg.cpp file.)
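The loop pattern just described can be sketched as follows. This is an assumption-laden illustration rather than the contents of emparray.cpp: the flag values and NameMax are invented, ReadEmployee is replaced by a compact stand-in without prompts, and a stream parameter is added so that the code can be run on prepared text.

```cpp
#include <iomanip>
#include <iostream>

const int NameMax = 32;          // Assumed; set in employee.h.
const int EmpMax = 50;           // Max number of employees, as in emparray.h.
const int OKFlag = 0;            // Flag values are assumed.
const int FailFlag = 1;
const int TooMuchDataFlag = 2;

struct EmployeeType
   {
   char FirstName[NameMax];
   char LastName[NameMax];
   int ID;
   float WageRate;
   };

typedef EmployeeType EmpArrayType[EmpMax];

// Compact stand-in for ReadEmployee (no prompts, assumed field order).
int ReadEmployee(std::istream & In, EmployeeType & Employee)
   {
   In >> std::setw(NameMax) >> Employee.LastName
      >> std::setw(NameMax) >> Employee.FirstName
      >> Employee.ID >> Employee.WageRate;
   return In.fail() ? FailFlag : OKFlag;
   }

// Sketch of the while loop pattern: read one record ahead, store it only
// if there is room, and afterward decide why the loop ended.
int LoadArray(std::istream & In, EmpArrayType EmpArray, int & EmpCount)
   {
   EmployeeType Temp;
   int Result;

   EmpCount = 0;
   Result = ReadEmployee(In, Temp);
   while ((Result == OKFlag) && (EmpCount < EmpMax))
      {
      EmpArray[EmpCount] = Temp;
      EmpCount++;
      Result = ReadEmployee(In, Temp);
      }

   if (Result == FailFlag)
      {
      In.clear();   // Reset the flags that fail tests so later input works.
      return OKFlag;
      }

   return TooMuchDataFlag;   // A record was read but the array was full.
   }
```

Because the test EmpCount < EmpMax is part of the loop condition, the assignment into EmpArray[EmpCount] can never write past the last slot of the array.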

The AverageRate function is also found in the emparray.cpp file. It gets data from its two parameters, EmpArray and EmpCount, and uses these to find the average wage rate, which it returns in the function name.


/* Given:  EmpArray  Array of employee data from index 0 to EmpCount-1.
           EmpCount  Number of records of data stored in EmpArray.
   Task:   To find the average wage rate for the employees in EmpArray.
   Return: In the function name, the average wage rate.
*/
float AverageRate(EmpArrayType EmpArray, int EmpCount)

At the start of this function you see that we have used the statement assert(EmpCount > 0); as a way to ensure that EmpCount is positive. We do not want to ever go on and divide by zero, giving a meaningless result or a run-time error! The assert statement checks whether the condition inside is true and if so allows the program to continue. If the condition is not true, the program is aborted with a message that this condition failed to be true. Although this is one way to avoid dividing by zero, it might be more graceful to print our own customized error message and then to exit the program. Assert statements are more commonly used in the debugging stage of program development. Once one is sure that things are working fine, those asserts are then removed or, more commonly, disabled by defining NDEBUG before including the header that declares assert.
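The two approaches mentioned above can be compared side by side. The first function below is a guess at what an assert-guarded AverageRate might look like; the second, TryAverageRate, is a hypothetical alternative that reports the problem to the caller instead of aborting. Neither is copied from emparray.cpp, and the value of NameMax is assumed.

```cpp
#include <cassert>

const int NameMax = 32;   // Assumed; set in employee.h.
const int EmpMax = 50;    // Max number of employees, as in emparray.h.

struct EmployeeType
   {
   char FirstName[NameMax];
   char LastName[NameMax];
   int ID;
   float WageRate;
   };

typedef EmployeeType EmpArrayType[EmpMax];

// Sketch: aborts with a diagnostic if EmpCount is not positive
// (unless NDEBUG is defined, in which case the check is skipped).
float AverageRate(EmpArrayType EmpArray, int EmpCount)
   {
   float Sum = 0.0f;

   assert(EmpCount > 0);
   for (int k = 0; k < EmpCount; k++)
      Sum += EmpArray[k].WageRate;

   return Sum / EmpCount;
   }

// Hypothetical alternative: returns false in the function name when there
// is no data, so that the caller can print its own error message.
bool TryAverageRate(EmpArrayType EmpArray, int EmpCount, float & Average)
   {
   if (EmpCount <= 0)
      return false;

   Average = AverageRate(EmpArray, EmpCount);
   return true;
   }
```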

In the same file one also finds the PrintAboveAvg function, whose job is to print the data for all employee records with a wage rate above the average. There is one new feature used inside the for loop in the code, shown below in simplified form. This is the comma operator. You already know that the three parts of the for line are separated by semicolons. Thus the initialization step consists of k = 0, NumPrinted = 0. The comma operator joins two expressions into one: the left expression is evaluated first and then the right one. If a value is needed from the combined expression, the value of the rightmost expression is used. As we are not using any value here, the effect is just to combine two initializations into one. Since both variables are being initialized to the same value, another way to achieve this would be to use k = NumPrinted = 0.


for (k = 0, NumPrinted = 0; k < EmpCount; k++)
   if (EmpArray[k].WageRate > AvgRate)
      {
      PrintEmployee(EmpArray[k]);
      NumPrinted++;
      }
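To see the comma operator in isolation, consider the two small functions below. Both are hypothetical illustrations that do not appear in the example files. The first uses the same for loop pattern on a plain array of numbers; the second shows that the value of a comma expression is the value of its rightmost part.

```cpp
// Hypothetical helper: counts the values in Data that are above Limit.
// The initialization step of the for loop sets two variables at once.
int CountAbove(const float Data[], int Count, float Limit)
   {
   int k, NumFound;

   for (k = 0, NumFound = 0; k < Count; k++)
      if (Data[k] > Limit)
         NumFound++;

   return NumFound;
   }

// Hypothetical helper: the parts of the comma expression are evaluated
// from left to right, and the value of the last part, a + b, is the one
// that gets stored in c.
int LastOfThree()
   {
   int a, b, c;

   c = (a = 1, b = 2, a + b);
   return c;
   }
```

The parentheses in the second function matter, because the comma operator has lower precedence than assignment.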

It is left to the reader to study the rest of this program example.

A Program to Look Up Records in an Array



A Software Engineering Example


Suppose that we want a program that gets the data for a bunch of employees from the keyboard and then allows the user to interactively look up employee data by entering the names of target employees. Let's use the software development life cycle to assist us with this problem.

Analysis

In the first step of the software development cycle we need to get a clear idea of what the program should do. This is usually done by specifying the inputs, processing, and outputs.

In our case we do need to say more precisely what the input data is. The data entered at the keyboard for each employee, let's say, consists of the employee's first and last name, ID number, and wage rate, exactly as we did in the two previous programs. (One reason to do this is so that we can reuse code from those programs. In particular, we would expect to be able to reuse EmployeeType from employee.h and perhaps other things from employee.h, employee.cpp, emparray.h, and emparray.cpp).

The processing seems to be fairly clear. We want the user to interactively enter the data for a series of employees and then be able to repeatedly look up employee data by entering the first and last name of each desired employee. Let's fill this out by deciding that the user is to use CTRL z to end the lookups (or CTRL d under Linux).

The output is also fairly clear. Other than simple prompts that ask the user to enter data, the program needs to present, for each lookup, all of the data on the employee (if a match is found). This data, of course, consists of the first and last names, ID number, and wage rate.

Design

Let's start by considering the data and what we need to hold the data. For data we have a series of employee records that we need to keep around while the user looks up individual records. Thus, this seems to be another problem in which to use an array of employee records. That should also allow us to reuse more code from the previous problem, such as the function to load the array with employee records and the function to print an employee record.

Next, let's design the functions that we hope to use. Suppose we initially decide to divide the problem up into functions as shown in the following structure chart:

[structure chart]

Some of the functions shown in this chart have already been written for use in the previous two problems. Thus we simply reuse the functions LoadArray, ReadEmployee, and PrintEmployee. For the other two functions, let's try the following design:


/* Given:   EmpArray  Array of employee records, already in order.
            EmpCount  Number of employee records in EmpArray.
   Task:    To allow the user to do repeated lookups for the data on
            an employee by entering the person's name.
   Return:  Nothing.
*/
void Lookups(EmpArrayType EmpArray, int EmpCount)

/* Given:   EmpArray   Array of employee records.
            Low        The low index of the range to search.
            High       The top index of the range to search.
            Target     Record containing name for which to search.
   Task:    To do a binary search for Target in the specified range of
            EmpArray.
   Return:  In the function name, return the index of where Target was
            found or -1 if it was not found.
*/
int BinarySearch(EmpArrayType EmpArray, int Low, int High,
   const EmployeeType & Target)

Note that we have decided to put the first and last names together into the Target record, as that is easier than having two separate parameters. That also brings to mind a problem: Even though we have written a binary search function before, it was not designed to handle records. How will we have the binary search compare two records to decide if they are equal (according to the last name and first name), if the first record is less than the second (alphabetically by the two names), etc.? A good way to do this is to have an EmpCompare function to do this for us. Thus we decide to add the following function. Since it has to do with individual records we will declare it in employee.h, with its code in employee.cpp.


/* Given:   First   An employee record.
            Second  Another employee record.
   Task:    To decide if, based on last name then first name, First
            is less than, equal to, or greater than Second.
   Return:  In the function name return:
               -1 if First < Second
               0  if First == Second
               1  if First > Second
*/
int EmpCompare(const EmployeeType & First, const EmployeeType & Second)

Prototyping

Instead of proceeding to code the whole project, let's first build a prototype. However, since a number of the helping functions have already been written, our prototype may be more complete than would normally be the case. Thus we can use the existing LoadArray and PrintEmployee functions. Let's assume that we have not yet written the code for the BinarySearch and EmpCompare functions, though if you look in our old employee.cpp and emparray.cpp files you will see that these two functions were already there.

Take a look at the prototype, recfind.cpp. In the Lookups function note that the call to BinarySearch has been commented out, since we are assuming that this function has not yet been written. Instead, we just give Result the arbitrary value of 0, so that the program will print out the record at index zero for every lookup. This isn't correct, of course, but it allows us to see that significant portions of the program are in working order. It also allows the user to try out the software to see if it appears that (when finished) it will do what is desired. Although the data input is rather clumsy for the target employee (as one has to enter a last name before being allowed to use CTRL z to halt), the program seems to work OK. Thus we move on to the coding phase.

Coding

Because of the clumsy data input noted above, let's decide to code a separate function to handle getting the first and last names for an employee to look up. Thus, we are cycling back to the design stage and modifying the design. The new function is outlined as follows:


/* Given:   Nothing.
   Task:    To prompt the user to enter the last and first name of an
            employee.
   Return:  Employee   Record containing these names, other fields unused.
            In the function name, return EOFFlag if CTRL z was entered,
            OKFlag otherwise.
*/
int GetTarget(EmployeeType & Employee)

The code for this function is found in our new recfind.cpp file. Of course, some of the Lookups function had to be modified to accommodate the GetTarget function. We now code the remaining functions as shown below:


int EmpCompare(const EmployeeType & First, const EmployeeType & Second)
   {
   int Value;

   Value = strcmp(First.LastName, Second.LastName);
   if (Value < 0)
      return -1;
   if (Value > 0)
      return 1;

   // If reach this line, LastNames are the same.
   Value = strcmp(First.FirstName, Second.FirstName);
   if (Value < 0)
      return -1;
   if (Value > 0)
      return 1;

   // If reach this line, both first and last names are same.
   return 0;
   }

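If you are not familiar with the strcmp function, be sure to look it up by using either the Visual C++ compiler's help system or a good reference book. This function compares two strings that are arrays of characters ending in a null character ('\0'). It returns a negative value if the first string comes before the second in character-code order, zero if the two strings are equal, and a positive value if the first string comes after the second.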


int BinarySearch(EmpArrayType EmpArray, int Low, int High,
   const EmployeeType & Target)
   {
   int Mid, Diff;

   while (Low <= High)
      {
      Mid = (Low + High) / 2;
      Diff = EmpCompare(EmpArray[Mid], Target);

      if (Diff == 0)
         return Mid;
      else if (Diff < 0)
         Low = Mid + 1;
      else
         High = Mid - 1;
      }

   return -1;   // If reach here, Target was not found.
   }

Testing and Debugging

As a first test we might try entering simple data such as the following, where each line shows a last name, first name, ID, and wage rate:

last name   first name   ID   wage rate
one         one          1    11.11
two         two          2    22.22
three       three        3    33.33
four        four         4    44.44

When we try to search for the employee with first and last name one, everything seems to work fine. Unfortunately, when we try to search for the employee with first and last name three that employee is reported as not found. What went wrong? You might want to try the debugger to trace what is happening as binary search tries to locate this employee. You might also draw the array of four records and try to trace the binary search by hand. Probably you now see the problem: The data is not in ascending order! Binary search requires this. Thus we realize that we made a mistake in the design phase: We need to sort the array of data by last name (and secondarily by first name).

We cycle back to the design phase to fix this. Let's use a SelectionSort routine. It can use the EmpCompare function to compare two records based on their names. At this point we redraw our structure chart to show all of the changes made since the original structure chart. Note that EmpCompare is shown twice in the chart since it gets called from two different functions. Do not try to draw a line from these two functions to a single box for EmpCompare as structure charts are commonly drawn as trees.

[structure chart]
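To make the fix concrete, here is one way a SelectionSort based on EmpCompare might be written. It is a sketch, not the code from the project files: the value of NameMax is assumed, EmpCompare is condensed, and BinarySearch is repeated from above so that the fragment is complete.

```cpp
#include <cstring>

const int NameMax = 32;   // Assumed; set in employee.h.
const int EmpMax = 50;    // Max number of employees, as in emparray.h.

struct EmployeeType
   {
   char FirstName[NameMax];
   char LastName[NameMax];
   int ID;
   float WageRate;
   };

typedef EmployeeType EmpArrayType[EmpMax];

// Condensed version of EmpCompare: last name first, then first name.
int EmpCompare(const EmployeeType & First, const EmployeeType & Second)
   {
   int Value = std::strcmp(First.LastName, Second.LastName);

   if (Value == 0)
      Value = std::strcmp(First.FirstName, Second.FirstName);
   return (Value < 0) ? -1 : ((Value > 0) ? 1 : 0);
   }

// Sketch: puts EmpArray[0] through EmpArray[EmpCount-1] into ascending
// order by name. Whole records are swapped using record assignment.
void SelectionSort(EmpArrayType EmpArray, int EmpCount)
   {
   for (int Start = 0; Start < EmpCount - 1; Start++)
      {
      int MinIndex = Start;

      for (int k = Start + 1; k < EmpCount; k++)
         if (EmpCompare(EmpArray[k], EmpArray[MinIndex]) < 0)
            MinIndex = k;

      if (MinIndex != Start)
         {
         EmployeeType Temp = EmpArray[Start];
         EmpArray[Start] = EmpArray[MinIndex];
         EmpArray[MinIndex] = Temp;
         }
      }
   }

// Same BinarySearch as shown in the Coding section.
int BinarySearch(EmpArrayType EmpArray, int Low, int High,
   const EmployeeType & Target)
   {
   int Mid, Diff;

   while (Low <= High)
      {
      Mid = (Low + High) / 2;
      Diff = EmpCompare(EmpArray[Mid], Target);

      if (Diff == 0)
         return Mid;
      else if (Diff < 0)
         Low = Mid + 1;
      else
         High = Mid - 1;
      }

   return -1;   // If reach here, Target was not found.
   }
```

With the four test records entered in the order one, two, three, four, a search for three returns -1 before sorting. After SelectionSort the order by name is four, one, three, two, and the same search returns index 2.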

We then have to code the SelectionSort function as well. This results in a completed program that appears to be error-free when we run additional tests. See the following files for the details:

Maintenance

If this program were actually put into real use at a company or sold as a product, there would undoubtedly be calls for maintenance. Perhaps what we thought to be an error-free program is not so at all, so that newly-discovered errors have to be fixed. Users might also call for new features, such as the ability to save the data between separate uses of the program. (This would require the use of files.)

Documentation

We have created a fair amount of documentation above. Look at all that has been written down in the above stages, as well as the structure charts that were drawn. Plus there is the internal documentation, the comments included in the code itself. This material would all be saved to assist the maintenance programmers who might work on this project. For a larger program a user's manual and/or a reference manual might also be written.

Author: Br. David Carlson with contributions by Br. Isidore Minerd
Last updated: September 24, 2017