Cherry Owen
Ph.D. Student
Texas Tech University
owen_c@utpb.edu
Evaluating the Infrastructure of Software Applications

Keywords
Infrastructure, interaction, filters, exception handling, return codes, COTS

INTRODUCTION AND THEORETICAL BACKGROUND
With the rapid increase in distributed computer applications, the interactions among components making up the applications are becoming increasingly difficult to manage.  Even more difficult is managing the interactions between the end user applications and the infrastructure on which they depend.  Software problems can result from defects in the end user software, but can also result from defects in the infrastructure software.  The infrastructure defects can cause unexpected messages to be sent as input to the user programs.  Most developers focus on user requirements and corresponding inputs and outputs of the application.  However, very little consideration is given to the specific inputs to the application that come from the operating system, network applications, and other parts of the infrastructure.  In other words, many developers just assume the infrastructure will always work perfectly and do not expect or plan for unusual behavior or interference from non-human inputs [1].

The purpose of this research is to handle “hidden” interfaces better during development.  In particular, the interfaces between user applications and infrastructure applications will be studied.  Because the infrastructure is built specifically to hide the cumbersome details of managing the operating system and networking functions, the developers of user programs usually do not know all of these details.  They must depend heavily on the documentation to understand the interface to the operating system and other system software.  When the rules of the interface are violated, problems occur.  However, it is not easy to provide a well-defined interface and at the same time hide the details of the operating system.  Sometimes the hidden details are important to understanding the interaction.

The plan is to identify a specific class of interface problems complicated by information hiding, and then write filters to address these problems.  The filters should make interfacing with the infrastructure more manageable.

PREVIOUS RESEARCH IN THE AREA
In theory, software development productivity can be significantly increased by using commercial off-the-shelf products [2].  The common belief is that writing the interfaces to integrate the COTS products is much less time consuming than writing the application from scratch.  However, in reality, many reuse projects have huge cost, effort and schedule overruns.  One of the major problems with reusing software is the ability to interface with the software that is already written.  The reusable components might be developed by different vendors and might not always be designed to interoperate with each other.  They can have conflicts in use of resources such as shared memory and registry entries.  The interface documentation might not accurately show all of the hidden interfaces.  Resulting conflicts in the use of hidden resources can cause numerous hours of debugging by developers and numerous hours of problem solving and unnecessary rework by users.  Since the operating system and other infrastructure software consist of many COTS products, interfacing with the infrastructure is a primary target for improvement.

In a much-publicized method of handling interfacing, connectors are used to connect the components [3][4].  In theory, the connectors contain all of the code for interfacing, and each component only has to interface to one or more “known” connectors.  A component should never be required to interface to an unknown entity.  All of the uncertainty of interfacing to external sources should be handled by the connectors.  This solution yields much more stable components that require little maintenance as the environment changes.  However, buggy software and hidden interfaces are not necessarily eliminated by this solution.  In practice, the defined interface between the connector and the component might not include all interfaces with the component.  For example, the component might call a system function without going through the connector.  If the system function returns an unexpected value and the component does not handle it, then problems exist.
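
The hidden interface described above can be illustrated with a minimal C++ sketch.  The routine name file_size and its error-reporting convention are hypothetical and are not taken from any of the cited connector frameworks.  The sketch only shows a component that calls the C runtime directly, bypassing any connector, and therefore has to check every value that comes back itself.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical component routine that calls the C runtime directly
// instead of going through a connector.  Returns true and stores the
// size in `size` on success; returns false and fills `error` otherwise.
bool file_size(const char* path, long& size, std::string& error) {
    assert(path != nullptr);  // precondition of this sketch

    std::FILE* f = std::fopen(path, "rb");
    if (f == nullptr) {
        // Unexpected value from the infrastructure: without this check
        // the calls below would operate on a null stream.
        error = std::string("fopen failed: ") + std::strerror(errno);
        return false;
    }

    long result = -1;
    if (std::fseek(f, 0, SEEK_END) == 0) {
        result = std::ftell(f);  // -1L signals failure
    }
    std::fclose(f);

    if (result < 0) {
        error = "fseek/ftell failed";
        return false;
    }
    size = result;
    return true;
}
```

A caller would test the boolean result before using the size.  Omitting the null check after fopen is exactly the kind of unhandled return value that no connector can intercept, because the call never passes through one.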

Another methodology separates policy from the protocol implementing the policy [5][6].  This methodology is used to control interactions among autonomous components that interact through a shared network and middleware.  The policy states what the rules are and the protocol implements those rules.  The protocol can use various methods for the implementation as long as the rules are enforced.  But, for the policies to be effective they must be implemented correctly.  If undocumented inputs somehow get to a component without going through the protocol, then the methodology is less useful.

Microsoft Visual C++ provides an exception filter option that allows developers to write filters to check for return codes and throw exceptions to handle the problems specified by the return codes [7][8].  Hundreds of different error codes can be returned by the Windows operating system alone, and each infrastructure product adds to the number of possible codes.  The filter option can be used to handle some of these return codes, but it is impossible for an application program to check for every possible error.  For this reason, a thorough investigation into the sources of errors is needed to determine which error codes must be handled to prevent the major problems.
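
The general idea of a filter that turns return codes into exceptions can be sketched in portable C++.  This is not the Microsoft filter option itself, which is compiler specific; it is an analogous pattern, and the names filter_rc and delete_file are hypothetical.  The sketch assumes the common C convention that a return code of zero means success and that errno describes the failure.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdio>
#include <system_error>

// Hypothetical filter: passes a successful return code through and
// converts any nonzero code into an exception carrying errno.
inline int filter_rc(int rc, const char* what) {
    assert(what != nullptr);  // precondition of this sketch
    if (rc != 0) {
        throw std::system_error(errno, std::generic_category(), what);
    }
    return rc;
}

// Example use: std::remove returns 0 on success and nonzero on failure.
void delete_file(const char* path) {
    filter_rc(std::remove(path), "remove");
}
```

With this pattern a failure cannot be silently ignored by the caller, but the filter still only covers the calls that are routed through it, which is why the choice of error codes to handle matters.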

CURRENT STATUS
The literature search has produced insight into the problems of interfacing among modules and applications.  The Microsoft Knowledge Base [9] has been searched for known errors that might give insight into the most common problems with interfacing to the infrastructure.  The most common problems related to interfacing with the infrastructure have been classified into two categories: access violations and problems that cause the application to stop responding.  However, access violations can be caused by a variety of situations.  Likewise, many different problems can cause the application to stop responding.  We now need to dig deeper into each problem in each category to find the root cause.  The root causes for the different situations will be grouped into classes.  Then one or more classes of root causes will be selected for further research.  Finally, filters will be written to address the selected root causes and to make interfacing with the infrastructure more manageable.

The decision has been made to concentrate the investigation on C and C++ functions used both in the infrastructure and in application programs.  Since many operating systems and other infrastructure software are written in C/C++, the research will apply to a wide range of software environments.  An experiment is currently in progress to look at C and C++ functions and determine whether they produce expected results as specified in the documentation.  Uncaught exceptions are of particular interest, and are being found and documented to help identify the problem areas.
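
A check of this kind can be sketched as a small table-driven harness.  The sketch below is only an illustration and is not the harness used in the experiment; the names Case, count_mismatches, and documented_cases are hypothetical.  It uses std::strtol as the function under test, with expectations taken from the C standard: the converted value is returned, and on overflow the function returns LONG_MAX or LONG_MIN and sets errno to ERANGE.

```cpp
#include <cassert>
#include <cerrno>
#include <climits>
#include <cstdlib>
#include <vector>

// One documented expectation for std::strtol in base 10.
struct Case {
    const char* input;
    long expected_value;
    int expected_errno;
};

// Returns the number of cases whose observed behavior differs from
// the documented expectation.
int count_mismatches(const std::vector<Case>& cases) {
    int mismatches = 0;
    for (const Case& c : cases) {
        assert(c.input != nullptr);
        errno = 0;  // strtol reports range errors only through errno
        char* end = nullptr;
        long value = std::strtol(c.input, &end, 10);
        if (value != c.expected_value || errno != c.expected_errno) {
            ++mismatches;
        }
    }
    return mismatches;
}

// Expectations taken from the C standard's description of strtol.
std::vector<Case> documented_cases() {
    return {
        Case{"42", 42L, 0},
        Case{"  -7", -7L, 0},      // leading whitespace is skipped
        Case{"12abc", 12L, 0},     // conversion stops at the first non-digit
        Case{"99999999999999999999999999", LONG_MAX, ERANGE},
        Case{"-99999999999999999999999999", LONG_MIN, ERANGE},
    };
}
```

A nonzero count from count_mismatches(documented_cases()) would indicate a function that does not behave as documented on the platform under test.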

The problem is being worked from both ends.  At the lower programming language level, problems are being found with the interactions among functions in C and C++.  At the application level, problems are being found that cause access violations and “hanging” processes.  Now we need to work toward the middle from both ends to match the function interface problems to the larger application problems that they can cause.  Then we can identify specific variables that can be controlled to make working with the infrastructure a more predictable activity.  The short-term solution is the use of filters to provide the control.  However, the knowledge from this research can also be used in future infrastructure improvement projects.

INTERIM CONCLUSIONS
A search of hundreds of problem reports in the Microsoft Knowledge Base [9] shows that the same problems are recurring in successive versions of infrastructure applications.  In other words, a single problem sometimes applies to Microsoft Windows 98, Windows NT, Windows 2000, and Windows XP.  Even though there is often a fix for the problem for each Windows version, the fix is not always incorporated into the next version.  A service pack might fix the problem in the current version, but then the problem occurs again in a subsequent version.  The time-to-market pressures seem to prevent a thorough investigation of the problems and their relationships.  Even though most developers and testers do everything reasonably possible to produce quality software, they usually do not have enough time to dig deep enough to find the sources of all problems.  Therefore, the research community has a responsibility to investigate different classes of problems and methods of providing practical long-term solutions.

Early results in the experiment to analyze C and C++ functions have shown that at least 7% of the simple functions we looked at do not work as documented.  If these functions are used in the infrastructure, they could be sending unexpected values to the application programs.  More investigation is needed to determine whether this is in fact happening.

OPEN ISSUES
Filters will be written for a class of root causes of interface problems.  However, the exact problems and root causes have not been isolated yet.  Once the problems have been isolated, the filters need to be designed.  A choice must be made to either expand on the Microsoft filter option currently available or to design something different.

CURRENT STAGE IN THE PROGRAM OF STUDY
All course work for the Ph.D. has been done.  The qualifying exam has been successfully completed.  Admission to candidacy for the Doctorate of Computer Science in the College of Engineering at Texas Tech University has been granted.  The dissertation proposal has been presented and accepted.  However, the first three chapters of the dissertation are still in progress.  The expected date of graduation is July 2002.

WHAT CAN BE GAINED FROM PARTICIPATING IN THE DOCTORAL CONSORTIUM
Participation in the 2001 Doctoral Consortium was very helpful, and opened doors that I didn't even know existed.  The direction of my research changed, partially because of the feedback from the discussants and other participants.  I've made significant changes, and would like to attend the 2002 Doctoral Consortium to get feedback and discussion on the new ideas.

BIBLIOGRAPHIC REFERENCES

  1. Whittaker, James A., “Software's Invisible Users,” IEEE Software, May/June 2001
  2. Boehm, Barry and Abts, Chris, “COTS Integration: Plug and Pray?” IEEE Computer, January 1999, pp. 135-138.
  3. Fuentes, Lidia and Troya, Jose Maria, “Coordinating distributed components on the web: an integrated development environment,” Software—Practice and Experience, 2001: 31:209-233, John Wiley & Sons, Ltd. 2001.
  4. Rakic, Marija and Medvidovic, Nenad, “Increasing the Confidence of Off-the-Shelf Components: A Software Connector-Based Approach,” 2001 Symposium on Software Reusability (SSR 2001), Toronto, Canada, May 2001.
  5. Minsky, Naftaly H. and Ungureanu, Victoria, “Law-Governed Interaction: A Coordination and Control Mechanism for Heterogeneous Distributed Systems,” ACM Transactions on Software Engineering and Methodology, Vol. 9, No. 3, July 2000, pp. 273-305.
  6. Astley, Mark; Sturman, Daniel C. and Agha, Gul A.; “Customizable Middleware for Modular Distributed Software,” Communications of the ACM, Vol. 44, No. 5, May 2001, pp. 99-107.
  7. “Writing an Exception Filter,” MSDN Library, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vccore98/HTML/_core_writing_an_exception_filter.asp.
  8. Schmidt, Robert, “Handling Exceptions in C and C++,” MSDN Library, http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dndeepc/html/deep051099.asp.
  9. Microsoft Knowledge Base, http://support.microsoft.com/default.aspx?scid=fh;rid;kbinfo.