Software Cost Management

Software and Process Automation

Managing anything well requires knowledge of some basic principles, and managing software costs is no different. Software costs depend on the following four factors:

Software is Cryptic
It is very difficult to tell what a program does without studying it in detail.
There are thousands of ways to write a computer program to perform the same task.

Software can be Cluttered
It takes many lines of code to accomplish even simple tasks.
Many computer programs in common usage are more than 1 million lines of code.
There are currently over 3,500 identifiable computer programming languages in common usage.

Software is Abstract
The numerical values that a computer program calculates can represent many different physical things.
Different problems can often be solved by the same computer program. This means that the user must make the connection between the numbers that flow through a computer program and the real world. Thus, the user must navigate another layer of abstraction to correctly interpret the results.

Software is Error Prone
There is a natural rate of error inserted by the software developer when he/she enters the code into a terminal.
Developers often find it difficult to correctly interpret ambiguous software requirements. This results in computer programs that do not represent the intentions of the specification writer or the customers.

1) Any software or development technique that reduces the impact of the four factors above will help manage the cost of developing your software.


Example 1 - Reducing Data:

Reducing clutter or the amount of code required to accomplish a given task:

Problem: Given a text file, which can be found here, extract all the words in the file and print a report listing the words in alphabetical order, with the frequency of each word's occurrence in the file next to it. The Awk script that solves this problem is 10 lines long, and it can be found here. The resulting report can be found here. An equivalent C/C++ or Java program might take 100 lines. Thus, the common general-purpose languages are more cluttered and no less cryptic, while the Awk solution is much less cluttered and therefore much less costly to implement.

The fact that all this work can be done in 10 lines of Awk suggests that Awk is the right programming language and development tool for this kind of task.
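The Awk script itself is only linked, not reproduced here. As a rough illustration of how compact the word-frequency problem can be in a high-level language, the following is a minimal Python sketch. It is not the author's script; the function names and the rule that a word is a run of letters are assumptions made for the example.

```python
import re
from collections import Counter

def word_frequencies(text):
    """Return (word, count) pairs in alphabetical order, ignoring case."""
    # Assumption: a "word" is any run of ASCII letters.
    words = re.findall(r"[a-z]+", text.lower())
    return sorted(Counter(words).items())

def format_report(text):
    """Build the report: one word per line, followed by its frequency."""
    return "\n".join(f"{word} {count}" for word, count in word_frequencies(text))

if __name__ == "__main__":
    # Stand-in for the contents of the input text file.
    sample = "the cat and the hat and the bat"
    print(format_report(sample))
```

The counting and sorting each take a single line, which is the kind of reduction in clutter the example is arguing for.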

Reduce clutter, crypticity, abstraction, and errors:

See: META-DATA and Automatic Code Generation

Specifying a program at a higher level of abstraction lets the computer generate the software that exhibits the indicated behaviors. Productivity gains can range from 10% to 1000%, depending on the problem and the tools available.

Example: Sum the squares of the first 10 integers. The Mathcad solution can be found here. The FORTRAN solution can be found here. The FORTRAN solution is clearly more difficult to read. Mathcad expresses solutions to mathematical problems at a higher level, so they are easier to read and therefore less expensive to maintain.
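Neither the Mathcad nor the FORTRAN solution is reproduced here. To give a feel for what a higher-level formulation looks like, the same problem can be sketched in Python, where the code reads almost like the mathematical statement. The function name is an assumption made for the example.

```python
def sum_of_squares(count):
    """Sum the squares of the integers 1 through count."""
    return sum(n * n for n in range(1, count + 1))

if __name__ == "__main__":
    print(sum_of_squares(10))  # prints 385
```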

Example 2 - Automating Manual Tasks

Automate manual tasks:

Reduce clutter, manual error insertion, the necessity of dealing with massive amounts of abstraction, and crypticity! Expect impressive cost savings.

Many legacy systems were written in software that was based on old technology. The old technology is no longer supportable, since the entities that developed and maintained the compilers and tools no longer exist. Moreover, newer technologies have capabilities that are not available in the old technologies. For example, GUI-based training menu systems based on X Windows and Motif are often rewritten in Java Swing to enhance portability and to take advantage of some of the newer Java Swing features.

Rewriting these systems from scratch can be prohibitively expensive. However, there is a solution:
It is possible to write a program that reads the legacy X Windows/Motif GUI source files and outputs the equivalent Java Swing files. This is accomplished by extracting from the old X Windows/Motif code the information needed to generate the new code, and then using this information to generate the equivalent Java Swing code. The converter program can also be written to embellish the new Java Swing code with new, more advanced features, as required.
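The extract-then-generate idea can be sketched in a few lines of Python. This is a toy under stated assumptions, not a working converter: the mapping table, the regular expression, and the function names are hypothetical, only three Motif creation functions are handled, and the widget name is reused as the Swing component's text. A real converter would need a proper parser and would also have to handle resources, callbacks, and layout.

```python
import re

# Hypothetical, deliberately tiny mapping from Motif creation functions
# to Swing classes; a real converter would cover many more widgets.
WIDGET_MAP = {
    "XmCreatePushButton": "JButton",
    "XmCreateLabel": "JLabel",
    "XmCreateTextField": "JTextField",
}

# Matches calls such as: XmCreatePushButton(parent, "ok", args, n)
CALL_PATTERN = re.compile(r'(XmCreate\w+)\s*\(\s*\w+\s*,\s*"(\w+)"')

def extract_widgets(motif_source):
    """Step 1: pull (swing_class, widget_name) pairs out of the legacy code."""
    widgets = []
    for function, name in CALL_PATTERN.findall(motif_source):
        if function in WIDGET_MAP:  # unknown widgets are skipped in this sketch
            widgets.append((WIDGET_MAP[function], name))
    return widgets

def generate_swing(widgets):
    """Step 2: emit one Java Swing declaration per extracted widget."""
    # Assumption: the Motif widget name doubles as the component's text.
    return [f'{cls} {name} = new {cls}("{name}");' for cls, name in widgets]

if __name__ == "__main__":
    legacy = 'ok = XmCreatePushButton(form, "ok", args, n);'
    for line in generate_swing(extract_widgets(legacy)):
        print(line)
```

Keeping extraction and generation as separate steps means new features can be added in the generator without touching the code that reads the legacy files.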

Together, the foregoing techniques can reduce a $1,000,000 job to a $50,000 job.

Example 3 - Reduce Crypticity

Reduce crypticity by making sure the developers also write an Application Programming Interface (API) document for the system under development. An example is the API documentation for the Java Programming Language (SE 6), which can be found here. Another example is the Eclipse 3.2 API documentation, which can be found here.

Having a well-maintained API document makes it much easier for your developers to understand the software, and they get up to speed much more quickly than they would if they had to reverse engineer the software when first assigned to a project.

Example 4 - Converting Information

Software converts information from the form it comes in to a more useful form. The first step when developing a new software system is to identify what information is available and where it is coming from. The second step is to identify what artifacts (files, programs, databases, etc.) need to be generated from the available inputs.

2) Testing

A robust test suite is necessary to avoid major problems after the first release of a software system. Testing is usually broken up into two phases: 1) Unit Testing, and 2) System Integration Testing.

The easiest place to define a set of System Integration test cases is at the system engineering level, where the focus is on what functions the software is required to perform. Each of the required functions should have one or more test cases to prove that the program meets its specifications.

The best place to define Unit tests is at the code development level, as the software is developed. For Java development, JUnit is a good automated tool for this.
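To show the shape of an automated unit test, here is a minimal sketch using Python's unittest module, which follows the same xUnit style as JUnit. The function under test, loaves_per_day, and the helper run_tests are hypothetical and exist only for this illustration.

```python
import unittest

def loaves_per_day(loaves_per_batch, batches):
    """Hypothetical unit under test: total loaves baked in a day."""
    if loaves_per_batch < 0 or batches < 0:
        raise ValueError("counts must be non-negative")
    return loaves_per_batch * batches

class LoavesPerDayTest(unittest.TestCase):
    def test_typical_day(self):
        self.assertEqual(loaves_per_day(24, 5), 120)

    def test_no_batches(self):
        self.assertEqual(loaves_per_day(24, 0), 0)

    def test_negative_input_rejected(self):
        with self.assertRaises(ValueError):
            loaves_per_day(-1, 5)

def run_tests():
    """Run the suite programmatically and report whether it passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(LoavesPerDayTest)
    result = unittest.TextTestRunner(verbosity=0).run(suite)
    return result.wasSuccessful()

if __name__ == "__main__":
    run_tests()
```

Writing such tests alongside the code, rather than afterwards, keeps each test close to the developer's understanding of what the unit is supposed to do.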

Automated testing. Testing should be as automated as possible. There are quite a few good automated test tools, including LoadRunner, WinRunner, Rational Functional Tester, JUnit, JTestCase, NUnit, NCover, and EMMA; open source tools like Grinder, WATIR, Canoo, and FitNesse; and bug tracking tools like Bugzilla and Mantis.

One can also write shell (UNIX) or cmd (Windows) scripts that run a number of test cases and save the results in a file on the first run. The output files are then visually inspected to ensure that the results are correct. For regression testing, the scripts are rerun and the new results are compared against the saved results to make sure the two are the same. This can be done with the diff utility on UNIX or the fc/comp utilities on Windows.
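The save-then-compare approach described above can also be sketched in Python rather than shell. The function name, the ".expected" file suffix, and the directory layout are assumptions made for the example; the comparison uses the standard difflib module in place of the diff utility.

```python
import difflib
from pathlib import Path

def check_against_baseline(name, actual_output, baseline_dir):
    """Compare output with the saved baseline; save it on the first run.

    Returns an empty list on the first run or when the outputs match,
    otherwise the lines of a unified diff.
    """
    baseline = Path(baseline_dir) / f"{name}.expected"
    if not baseline.exists():
        # First run: store the output so it can be inspected by hand.
        baseline.parent.mkdir(parents=True, exist_ok=True)
        baseline.write_text(actual_output)
        return []
    expected = baseline.read_text()
    return list(difflib.unified_diff(
        expected.splitlines(), actual_output.splitlines(),
        fromfile="saved", tofile="new", lineterm=""))

if __name__ == "__main__":
    differences = check_against_baseline("report", "a 1\nb 2\n", "baselines")
    print("regression detected" if differences else "output matches baseline")
```

The saved files still need the one-time visual inspection; the script only guarantees that later runs have not drifted from what was approved.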

Code can be instrumented with special statements, such as assertions in C/C++/Java, that produce tracebacks if the assertions are not met. C/C++ code can also be instrumented by tools like Parasoft Insure++ or Rational Purify to find:

Corrupted heap and stack memory.

Use of uninitialized variables and objects.

Array and string bounds errors on heap and stack.

Use of dangling, NULL and uninitialized pointers.

All types of memory allocations and free errors or mismatches.

All types of memory leaks.

Type mismatches in global declarations, pointers, and function calls.

Varieties of dead code (compile-time)
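Returning to the assertion statements mentioned before the list, the idea carries over to most languages. A minimal Python sketch follows; the function and its checks are hypothetical. A failed assert raises AssertionError, and the interpreter prints a traceback showing where the assumption was violated.

```python
def average_speed(distance_feet, elapsed_seconds):
    """Return speed in feet/second; the assertions document the assumptions."""
    assert elapsed_seconds > 0, "elapsed time must be positive"
    assert distance_feet >= 0, "distance must be non-negative"
    return distance_feet / elapsed_seconds

if __name__ == "__main__":
    print(average_speed(100, 4))  # prints 25.0
```

Note that Python skips assert statements when run with the -O option, so they suit development-time checks rather than validation of user input.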


Special semantic instrumentation can verify that properties (e.g., units, dimensions, etc.) of variables in a computer program are consistent. Examples:

If a variable expects to receive a value that has the property of "velocity in feet/second", any value that is stored in that variable should have the property of "velocity in feet/second". If it doesn't then a warning is generated.

If a computer program computes the number of loaves of bread output by a bakery on a given day, then any variables that expect to have the property of loaves of bread must be set to expressions that have the property of loaves of bread. If there is a mismatch, then a warning is generated.

The types of constraints imposed in the two examples above may seem unimportant, but this approach usually catches 90% of the programming errors. These errors include both typos and logical errors.
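One way such a property check might look is sketched below in Python. This is an illustration only, not a description of any particular tool: the class names, the plain-string unit labels, and the store method are all assumptions. A value carries its unit with it, and storing it in a variable that expects a different unit generates a warning.

```python
import warnings
from dataclasses import dataclass

@dataclass(frozen=True)
class Quantity:
    """A number tagged with a unit label, e.g. 'feet/second' or 'loaves'."""
    value: float
    unit: str

class UnitMismatchWarning(UserWarning):
    """Issued when a stored value's unit differs from the expected unit."""

class CheckedVariable:
    """A variable that expects values carrying one particular unit."""

    def __init__(self, name, expected_unit):
        self.name = name
        self.expected_unit = expected_unit
        self.quantity = None

    def store(self, quantity):
        # Warn, but still store, so the program keeps running as described.
        if quantity.unit != self.expected_unit:
            warnings.warn(
                f"{self.name} expects {self.expected_unit}, got {quantity.unit}",
                UnitMismatchWarning, stacklevel=2)
        self.quantity = quantity

if __name__ == "__main__":
    velocity = CheckedVariable("velocity", "feet/second")
    velocity.store(Quantity(88.0, "feet/second"))  # consistent, no warning
    velocity.store(Quantity(12.0, "loaves"))       # mismatch, warning issued
```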