
Software Design Using C++

Software Engineering

Introduction to Software Engineering

Who can detect all his errors? From hidden faults protect me.
(from Psalm 19)

Let's first ask why we should be concerned about the engineering of software. In part the answer is that there have been some notable software failures, many due to sloppy program design and implementation. To make matters worse, there is often little documentation, or poorly written documentation, so that maintenance programmers who try to fix problems with the software waste untold hours and sometimes have to simply give up.

Large computer programs are very likely to contain errors. As software engineer John Shore put it in his book The Sachertorte Algorithm (Viking, 1985), "the typical large computer program is considerably more likely to have a major, crash-resulting flaw than is the typical car, airplane, or elevator." For example, a few patients were killed by a cancer treatment machine, the Therac-25, whose faulty software on rare occasions delivered massive overdoses of radiation. Then there are reports of stealth aircraft that crashed during testing due to software error. (See Tom Ochs's article, "Lethal Software", in the October 1992 issue of Computer Language.)

Software companies usually include disclaimers with their products stating that they will not pay for any damages caused by their software. Some companies cut corners in the rush to beat the competition in getting a product to market. Most commercial PC software is known to contain errors. Often one can search through a long list at a software company's Web site for known bugs and possible solutions. In addition, many companies commonly fix errors by sending out an "interim release" or "maintenance release" of their software.

The two major types of software errors are design errors and coding errors. A design error means that the plan created for producing the software did not include how to handle a certain situation or that the plan was incorrect. A coding error means that errors were made when the plan was transformed into a program written in a computer language. (Other types of errors include data entry errors and hardware errors. There are also the trivial syntax errors that compilers find for you.)

This situation has been termed the software crisis. In general it refers to poorly written, hard to read, error-prone software that often lacks good documentation. Companies generally spend more money on software maintenance than on any other aspect of software development. Thus it makes good sense to try to do a better job the first time. This includes taking an engineering approach that begins with an overall abstract model that is refined as more and more detail is added, careful testing of the software once it is created, and writing good documentation (both that for the end-user and that intended for other programmers). Note that testing does not guarantee correctness, however! In a large piece of software it is usually impossible to test all cases. Thus one does not know if all of the errors have yet been found. Many companies release new software when the rate at which new errors are found drops below some small arbitrary level.

The Software Development Life Cycle

This life cycle outlines a careful, engineering approach to the development of software. There are many variations on this cycle, some of which are better than others. The following, however, is fairly typical.

Analysis

This step consists of understanding the problem and developing precise specifications (of the input, output, and processing). This step may involve interviewing the intended users of the software in order to better understand their needs.

Top-down Design

Do the overall steps first, then the details. (There are rare occasions when a bottom-up approach or some other method might be used.) Decisions on classes and data structures are made at this point. Typically one asks what kind of data the program will have to handle, what one will use to hold this data, and what functions (or classes) might be needed to process this data. Charts and diagrams are commonly used. For example, a data flow diagram might be produced to show the flow of data between various programs, files, etc. in a large project. One might make drawings of the classes that will be used. If procedural programming is used (that is, no user-defined classes), a structure chart might be drawn to show the overall decomposition of the problem into functions. An overall description for each function may be produced, perhaps using the "given, task, return" style recommended elsewhere in these Web pages. Rough pseudocode algorithms for the functions may be written out. (Pseudocode is a loosely-defined combination of English and programming language. It allows one to write the algorithm for a function without worrying about minor details of coding.) One may trace the execution of the algorithms by hand (desk checking) or apply mathematical proof techniques to try to verify the correctness of the algorithms.

Prototyping

A prototype stage might be included at this point. A prototype is a rough working version that the developer and perhaps the users can try out. In fact, it has been suggested that if you don't deliberately create a prototype, your program will not be quite what you want, so that you will have to begin again anyway and your first attempt becomes a prototype after all. The prototype may leave out a number of features and may even not try to compute correct answers. Part of the goal is to allow the potential users to see the data input screens and output screens, as well as to get a sense for the overall feel of the program. (See pages 69 and 70 of Data Structures and Program Design in C, 2nd ed., by Robert Kruse, C. L. Tondo, and Bruce Leung, Prentice-Hall 1997.)

Coding

This refers to writing the complete program out in C++. Too many beginning programmers try to start with this step. For anything non-trivial this is a formula for disaster, or at least for wasting a lot of time! This is the step where one worries about all of the details of C++ programming. If good design work has already been done, the coding stage can be relatively easy.

Testing and Debugging

Test as many different cases as possible. With large software, however, it is probably not possible to test all paths through the software. Test boundary values (such as grades of 0, 60, 70, 80, 90, and 100 in a grading program) and invalid data (such as grades of -1 and 101 in the same grading program) in particular. Modular testing is helpful. That is, test and validate each function separately, often by writing a test program that simply calls this one function. Stubs can be used for subsidiary functions so that the main function can be tested by itself. (A stub is a function that contains no code or something trivial like an output statement to say that the function has been called. A stub may also assign some nonsense values to any answers that the function is supposed to produce. Essentially the stub is a function whose real code has not yet been written.) Another good way to check a program is to include output commands at strategic spots to write out intermediate values. The debugger can also assist in finding where something is getting a wrong value. Be sure that you know how to use it.
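As a rough illustration of boundary testing and stubs, here is a minimal sketch. The LetterGrade function, its cutoffs (taken from the boundary values listed above), the '?' result for invalid data, and the ComputeClassAverage stub are all hypothetical names and choices made up for this example.

```cpp
#include <iostream>

/* Given:  Score   A numeric grade.
   Task:   To convert Score to a letter grade (hypothetical cutoffs).
   Return: The letter grade in the function name, or '?' if Score is
           invalid (below 0 or above 100).
*/
char LetterGrade(int Score)
{
   if (Score < 0 || Score > 100)
      return '?';
   if (Score >= 90) return 'A';
   if (Score >= 80) return 'B';
   if (Score >= 70) return 'C';
   if (Score >= 60) return 'D';
   return 'F';
}

/* Stub: stands in for a function whose real code has not yet been
   written. It reports that it was called and assigns a nonsense value.
*/
void ComputeClassAverage(float & Average)
{
   std::cout << "ComputeClassAverage called (stub)" << std::endl;
   Average = -1.0f;
}

/* Given:  Nothing.
   Task:   To test LetterGrade at each boundary value, just below each
           boundary, and with invalid data.
   Return: The number of failed checks in the function name.
*/
int TestLetterGrade()
{
   struct Case { int Score; char Expected; };
   const Case Cases[] =
   {
      {0, 'F'}, {59, 'F'}, {60, 'D'}, {69, 'D'}, {70, 'C'}, {79, 'C'},
      {80, 'B'}, {89, 'B'}, {90, 'A'}, {100, 'A'},
      {-1, '?'}, {101, '?'}      // invalid data
   };

   int Failures = 0;
   for (const Case & C : Cases)
   {
      if (LetterGrade(C.Score) != C.Expected)
      {
         std::cout << "Failed for score " << C.Score << std::endl;
         ++Failures;
      }
   }
   return Failures;
}
```

A small test program would call TestLetterGrade from main and report the number of failures. The stub lets a main function that needs a class average be compiled and tried out before the real averaging code exists.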

Maintenance

This involves fixing errors that show up after the software is in use, adding new features, changing to faster algorithms, etc. This is where companies tend to waste a lot of time and money, mostly because the previous steps were not done as well as they should have been or were not well documented.

Documentation

This step coincides with all the other steps; it should not be done last. At each stage of the software development cycle, important information is written down. Roughly speaking, documentation can be divided up as follows:

For users:
- tutorial
- reference manual
For the programming team and maintenance programmers:
- internal documentation (the given, task, return style is recommended)
- external documentation
  - specifications
  - pseudocode algorithms
  - various object-oriented design diagrams
  - structure chart
  - data flow diagram
  - record of testing
  - record of maintenance changes

Note that the software life cycle is not really a sequence of steps. It is a cycle in that at any step you may realize that you need to go back to an earlier step due to an error or incomplete information. Do not expect to go through the steps in order and be done. Expect to have to go back to previous steps, to have to make changes to what you have already done, etc.

Note too that in developing large programs, one person would probably not do all of these steps. The first two might be done by a systems analyst. Coding might be done by a team of programmers, where each programmer is responsible for certain classes and/or functions (hence the need for good documentation even at this point to aid in coordination). There may be a separate testing team and a separate maintenance group. There may also be a program librarian when there are a large number of source code and other files involved, specialized people to write documentation, etc.

Note that software users typically expect more than mere correctness of the programs that they use. They also want the software to be easy to use. This can be handled by developing a clear, consistent user interface, by including code to handle bad input data, etc.

Example

Let's do a simple example to illustrate the use of the software development life cycle. In other portions of these Web pages we will do more complex examples.

Suppose that we want an interactive program to help in figuring costs of purchases. Specifically, given the cost of an item and the quantity purchased, the program should calculate the total cost. Then it should allow us to figure out the cost of another purchase in the same manner, etc. At the end of a string of purchases the grand total cost of all of the purchases should be produced.

The previous paragraph gives a rough description of what the software should do. If we can nail down better the inputs and outputs we will have the analysis step pretty much completed. Suppose that we have the program prompt the user for the quantity and unit cost, then print the total cost, and ask the user whether or not to do another. If a yes answer, indicated by 'y', is given then the process should be repeated. If a no answer, indicated by an 'n', is given then the grand total cost of all purchases should be printed and then the program should end. This sums up the inputs, the processing, and the outputs.

In the design step, let's focus for a moment on the data. There is no need to save a lot of data. The only item to be kept track of throughout the program is the grand total of all of the costs. That can be kept in a simple float variable. The quantity, unit cost, and cost of an individual purchase can all be stored in simple variables. All will be floats except for the quantity, which can be an integer.

Now let's focus on the processing. This will help us to figure out what functions to use. The one chunk of processing that gets used over and over is that concerned with making one purchase. So let's consider making a function to handle this, perhaps called OnePurchase. Since there is little else in this program, that is about all that we need other than a main function. The OnePurchase function will need to receive the current grand total and will update it and send it back out. A reference parameter will then be used for the grand total. The function will need to ask the user for the quantity and unit cost, so we will use local variables for these. We can now write out the documentation on our proposed function as follows:


/* Given:  GrandTotal  The total cost of all purchases made thus far.
   Task:   To prompt the user for the info needed to make one purchase,
           printing the cost of the purchase on the screen, and updating
           GrandTotal.
   Return: GrandTotal  The updated total cost of all purchases made.
*/

We might then write out pseudocode for what the function is to do, but this example is so simple that it is not worth the effort. We might also draw a structure chart to show what function calls what other function(s), but our program has OnePurchase called from main, and that is it. Thus that drawing is also not worth writing down in this case.

Let's proceed to the coding step. Our first attempt gives us the purchase.cpp program. We remove any syntax errors and then test the program to see that it produces correct output. (It appears that this program does so.)

At this point we put the program into use and the maintenance phase begins. You might think that a simple program like this has no need of maintenance. Probably there are no errors in the program. However, the users soon start complaining that it is easy to make mistakes with the program because it doesn't check that the numbers entered are reasonable. For example, the user could enter zero or even a negative number for the quantity or unit cost. Essentially we are being asked to add new features to the software. In this case, what is needed is input checking. We should add input-checking while loops to force the user to enter positive numbers for quantity and unit price. This is left as an exercise for the reader.

Even this might not be enough. Maybe some users are horrible typists and even make mistakes with the y/n question about whether to do another purchase. So we might need to do input checking there as well. Also, the user might enter a 'y' or 'n' by mistake when a number should have been entered. That would cause our current version of the program to do strange things. (Try it.) Thus we might need to add code to check for this. (You will learn later that there is a fail function that can be used to tell you if the last input operation failed. We might be able to use it to detect such a problem. Whenever the problem occurs we could print an error message and prompt for a new value.)

What about the documentation "step" that takes place throughout all of the other steps of the software development life cycle? Well, we obviously have internal documentation inside of purchase.cpp itself. There is the comment section at the top of the file describing the overall operation of the program, and there is the comment section for the OnePurchase function. We also have external documentation in the form of an analysis and design document, essentially the paragraphs above on that subject. We may also have a record of testing and a record of any changes that were made (such as the addition of input checking).

Ethical Issues

There are ethical issues in software development. One good, well-known resource in this regard is the ACM Code of Ethics and Professional Conduct. We might well ask questions such as "Is it ethical and reasonable to produce and market a software product that you know has serious flaws?" and "Is it ethical and reasonable to develop a computer system that harvests private data from the users and then makes that private data available to others so as to produce a profit?" One code of ethics for software engineering has a principle that partly answers both of these questions: "Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests, and does not diminish quality of life, diminish privacy or harm the environment. The ultimate effect of the work should be to the public good." In regard to privacy, the ACM code mentioned above has a lot to say, including the following:

Computing professionals should establish transparent policies and procedures that allow individuals to understand what data is being collected and how it is being used, to give informed consent for automatic data collection, and to review, obtain, correct inaccuracies in, and delete their personal data.

Legal Issues

The disclaimer is a legal document used by software developers to protect themselves from liability. They realize that even the most carefully designed and tested software may contain errors. The disclaimer, which you might have to accept by clicking something when installing the software, essentially says that the software may contain errors and that the users assume the risk in using the software. It attempts to free the software developer of any responsibility for the failure of the software and for any damages that might result. Without such protection software developers would be legally responsible for almost anything their software does or fails to do.

However, the courts have held software developers responsible for their software especially in cases where injury to people results from software errors. (Medical software is one example of where this holds.) According to the Uniform Commercial Code, the software vendor can also be held accountable for software that does not meet its intended purpose in a significant way. Still, authors of programs that have no possibility of causing serious harm to people and that essentially fulfill their stated purpose are relatively safe from legal liability provided that a disclaimer is included with the software. See pages 331 to 342 of the text Fundamentals of Computing I, by Allen B. Tucker, W. James Bradley, Robert D. Cupper, and David K. Garnick (McGraw-Hill, 1992) for a more complete discussion of this issue.

A separate legal issue involves the use of the term engineering. In most states, professional engineering organizations object to the term software engineer. This is because part of their job is to govern who may and may not use the title of engineer. In February 1998, the Texas State Board of Professional Engineers announced its intention to recognize the discipline of software engineering. On June 17, 1998, the board formally approved this. This means that, to the best of our knowledge, Texas was the first state to license software engineers. This action, too, is somewhat controversial. The Association for Computing Machinery (ACM), for example, opposed the licensing of software engineers, in part because they believed that it would not effectively answer the problems of flaws in software.

Abstract Data Types

One of the tools sometimes used in the design stage of the software development life cycle is the abstract data type, or ADT for short. An ADT for a data type consists of the items of that type together with the operations that can be performed on those items. In one simple example the items might be fractions and the operations add, subtract, multiply, divide, print, etc. This is an abstract data model since we are not yet concerned about programming language details. Thus the ADT is very appropriate for the design stage.

Eventually, of course, one needs to implement the data type in a particular programming language. This is where details of a programming language, limitations of a particular computer (such as the range of integer values supported), etc. all become important. Several different implementations may be possible for the same ADT. This leads us to ask which is more efficient, which is easier to create, etc. We may even use an easy-to-write but inefficient implementation to create a prototype of the desired software and later replace it by a more efficient implementation for the final software. Well-written programs allow one to replace one implementation of an ADT by another implementation without changing the rest of the program. In C++ one would typically use a class to implement an ADT. For the above example we would create a Fractions class. Thus abstract data types lead naturally into the study of classes and object-oriented programming.

Note that the items of an ADT are items of a particular data type and the operations are implemented as functions (or operators). For example, in the fractions case, the fraction objects would all be objects with the same kinds of data fields: numerator and denominator. The operations (add, subtract, print, etc.) would be implemented as C++ functions or operators. Thus an abstraction becomes instantiated in a real programming language.

Programs that use an ADT should only access the variables of this type through the operations provided by the ADT. For example, in the fractions example, one could have the main program access the numerator and denominator directly (by making them public fields), maybe to print them out. This would be very poor practice, however. Instead, the numerator and denominator should be private fields and the public print function should be used to print a fraction object. Be careful to follow this rule. Beginning programmers often stumble on this one!

An implementation of an ADT should provide an interface entirely consistent with the operations specified in the definition of the ADT. (This allows one implementation of an ADT to be easily replaced by another.) Note that the class declaration essentially gives the definition of the ADT. The class implementation can vary even while keeping the same class declaration. You will learn the details of how to create classes later when studying object-oriented programming.

Author: Br. David Carlson with contributions by Br. Isidore Minerd
Last updated: March 02, 2021