Design Patterns for Real-Time Software
| Content Provider | Semantic Scholar |
|---|---|
| Author | Gullekson, Garth; Selic, Bran |
| Copyright Year | 1996 |
| Abstract | Design patterns are practical ways of solving common software design problems. This paper describes fundamental design patterns that apply to all real-time systems, specifically approaches to clearly separating the functionality of the software from its “control” mechanisms (that is, those aspects that deal with system initialization or shutdown, failure detection and recovery, etc.). With this approach, consistent control policies can be defined and modified independently of the primary functionality of the system. This results in reduced software complexity and increased developer productivity. The paper demonstrates how an object-oriented approach to software design, as provided by the Real-Time Object-Oriented Modeling (ROOM) method, can be used to significantly simplify the implementation of these patterns. AUTHOR BIOGRAPHY Garth Gullekson is a Vice President and founder of ObjecTime Limited. He worked for Bell-Northern Research for 14 years, primarily in the design and management of real-time telecommunications software for voice call processing, multi-media applications, and ATM terminal adapters. He is a co-author, with Bran Selic and Paul T. Ward, of the book, “Real-Time Object-Oriented Modeling”. Garth has presented at various conferences including the inaugural Communication Design Engineering Conference. Bran Selic is the Vice President of Research & Technology at ObjecTime Limited. He has over 20 years of commercial real-time software development and management experience (in telecommunications, aerospace, and robotics), and over 8 years of experience with object-oriented design and programming. He is the principal author of the book, “Real-Time Object-Oriented Modeling”, co-authored with Garth Gullekson and Paul T. Ward (co-developer of the Ward-Mellor method). Bran has lectured at leading real-time and object-oriented conferences. 
INTRODUCTION Real-time software development is becoming more complex as larger and more feature-rich projects are undertaken. Because schedules are tight, developers often focus on the specified features of the system, to the detriment of control aspects such as system initialization and fault detection. Control may not be as well specified (or glamorous) as the primary system functionality, but it is critical to system reliability and maintainability. This paper describes generic design approaches to systematically address control issues. These approaches can be defined and modified independently of the specific system functionality. The first sections outline the significance of control and describe the issues that control addresses. A simple communications protocol design is described to illustrate control issues. The principles for dealing with control, and a reference model that embodies these principles, are then described. The paper concludes by showing how developers can use an object-oriented approach to software design, as illustrated by the Real-Time Object-Oriented Modeling (ROOM) method, to apply these principles. In particular, developers can exploit the capabilities of encapsulation, hierarchical state machines, and inheritance to solve control problems. SIGNIFICANCE OF CONTROL Control is the set of activities and mechanisms required to define a desired operational state for a system and to sustain it in that state in the face of various disruptions (such as component failures, restarts, operator interventions, and so on). Function refers to those aspects that address the primary functionality or usage of the system (often captured in the “feature list” for the system). To illustrate the significance of control, let us examine the challenges of telecommunications software design. Telecom software is a form of real-time software where control issues are often dominant.
It is natural and proper for developers to first focus on the main use of the telecom system, namely its information transfer issues (e.g., communication protocol design). However, there is another set of issues intimately connected to the first. For example, a telephone switch needs to be supplied with information about its customers (their telephone numbers, services required, etc.). To provide service, the switch must first be brought into an operational state (often involving complex synchronization procedures). When equipment fails, the system needs to take remedial action. Such control capabilities are not an end in themselves, but are essential to supporting the main system functions. Control addresses the “care and feeding” of the system. Because control issues address the pragmatics of making systems work, at first glance developers may consider them an implementation concern. This is not so. In the telephone switch example, the need to provide and manage customer data or the need to recover from equipment failures exists independently of any implementation. Even when control is viewed as a design issue rather than an implementation issue, it is often considered seriously only after the basic functionality is fully designed. However, the software of any complex and highly reliable real-time system may have a very significant portion of the design ultimately devoted to control. This is especially true in highly concurrent and distributed real-time systems. Control issues are non-trivial and therefore require the same care and systematic treatment as software addressing the primary functions of the system. This paper proposes a framework for handling control, within which functional concerns are also addressed.
CONTROL ACTIVITIES AND MECHANISMS Control includes the following activities and mechanisms: on-line installation (loading) of new hardware and software; system activation (start up) and deactivation; failure detection and recovery; preventive maintenance; performance monitoring and statistics gathering; and synchronization with external control systems. On-line installation is typically required for systems with high availability requirements that cannot be taken out of service while the installation is being done. For example, it is particularly difficult to install new software while the old software is still running. System activation is the process of bringing a software system to its operational state. It includes systematic activities such as obtaining operational (configuration) data and synchronizing with other components before starting full operation. For example, after adding a new node into a computer communication network, the node may first have to synchronize with other nodes to obtain its routing data. Deactivation of the system or individual components also needs to be done systematically. For example, the various resources used by a component must be returned when the component is deactivated. Failure detection and recovery involve isolating the exact source of a problem and taking remedial action. This can be particularly difficult in a concurrent system since a single failure can quickly lead to many other failures. The fault isolation and recovery mechanism must then decide which of the many failures is the original source of the problem. Preventive maintenance can include very complex diagnostic procedures and background audits intended to detect problems before they become critical. A diagnostic procedure is typically invoked in response to an apparent problem. An audit runs periodically to ensure that the system is in a consistent state.
Performance monitoring and statistics gathering are a form of status feedback to higher-level control systems (automated or human). This data is used to detect whether a system is experiencing some type of load imbalance, such as overload. Finally, synchronization with external control systems is required if the system is part of a larger system. Distributed systems often have complex synchronization requirements. AN EXAMPLE OF CONTROL ISSUES A simple communications protocol, the alternating bit protocol as detailed in [PROTOCOL], illustrates some of the above control issues. The basic protocol is described by the message sequence chart in Figure 1. Input packets from a Client are sent to a Sender component. The Sender forwards each packet to a Receiver (as pkt1) and, as a flow control mechanism, waits for an acknowledgment (ack1) before sending the next packet (pkt0). The Client is informed of each acknowledgment. The Receiver component delivers packets to the destination Client, and expects the Client to acknowledge them. This simple protocol will resend a packet if an acknowledgment is not received within some timeout. It is also able to handle the potential duplicate reception of packets as a result of such a retry. A common way of designing this protocol is through a pair of simple communicating finite-state machines as shown in Figure 2. Figure 1. Alternating Bit Protocol Message Sequence Chart |
| File Format | PDF HTM / HTML |
| Alternate Webpage(s) | http://www.objectime.com/otl/products/patterns.pdf |
| Alternate Webpage(s) | http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.41.7137&rep=rep1&type=pdf |
| Language | English |
| Access Restriction | Open |
| Content Type | Text |
| Resource Type | Article |
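
The alternating bit protocol described in the abstract above can be illustrated with a minimal Python sketch of the two communicating state machines. This is not the paper's ROOM model: the class names `Sender` and `Receiver`, the `transfer` helper, and the `drop` set used to simulate lost messages are all hypothetical, and the Client-side acknowledgments are left out for brevity. A timeout is modelled simply as "no matching acknowledgment arrived for this attempt".

```python
from dataclasses import dataclass, field


@dataclass
class Sender:
    """Sender state machine: at most one unacknowledged packet at a time."""
    bit: int = 0            # sequence bit of the current packet (pkt0 / pkt1)
    pending: object = None  # payload awaiting acknowledgment, None when idle

    def send(self, payload):
        if self.pending is not None:
            raise RuntimeError("previous packet not yet acknowledged")
        self.pending = payload
        return (self.bit, payload)

    def on_timeout(self):
        # Resend the same payload with the same bit.
        return (self.bit, self.pending)

    def on_ack(self, ack_bit):
        # Accept only an acknowledgment that matches the outstanding packet.
        if self.pending is not None and ack_bit == self.bit:
            self.pending = None
            self.bit ^= 1
            return True
        return False  # stale or duplicate acknowledgment is ignored


@dataclass
class Receiver:
    """Receiver state machine: delivers each payload once, in order."""
    expected: int = 0
    delivered: list = field(default_factory=list)

    def on_packet(self, packet):
        bit, payload = packet
        if bit == self.expected:
            self.delivered.append(payload)
            self.expected ^= 1
        # Acknowledge the bit received, so a duplicate is acknowledged again.
        return bit


def transfer(payloads, drop=frozenset()):
    """Send payloads in order over a simulated lossy link.

    `drop` (hypothetical test hook) holds ("pkt", n) or ("ack", n) entries
    naming the transmission attempts whose packet or acknowledgment is lost.
    Returns the delivered payloads and the number of attempts used.
    """
    sender, receiver = Sender(), Receiver()
    attempt = 0
    for payload in payloads:
        packet = sender.send(payload)
        while True:
            lost_packet = ("pkt", attempt) in drop
            lost_ack = ("ack", attempt) in drop
            attempt += 1
            if not lost_packet:
                ack = receiver.on_packet(packet)
                if not lost_ack and sender.on_ack(ack):
                    break
            packet = sender.on_timeout()  # no acknowledgment: retry
    return receiver.delivered, attempt
```

Calling `transfer(["a", "b", "c"], drop={("pkt", 0), ("ack", 2)})` loses the first packet and the acknowledgment for the second payload. The retry after the lost acknowledgment reaches the Receiver as a duplicate, which is acknowledged but not delivered twice, matching the duplicate handling the text describes.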