Communicating Process Architectures - 2000
Sunday 10th. September (evening) through Wednesday 13th. September (lunchtime)
Invited Talk:Concurrency in Industry (Wot, no CSPs?)
Johan P.E. Sunter (Philips TASS, Eindhoven, The Netherlands)
Abstract: Within an international company such as Philips, systems are made from the very small to the extremely large. And although parallelism is an important aspect of system architecture, it is often overlooked. Due to the wide variety of systems, parallelism presents itself in many different shapes, requiring different design approaches. What approaches have been chosen in the past, and do they have something in common? Can we learn from them how to do it better? What do we need in the future?
Theory:Formal Analysis of Concurrent Java Systems
Peter H. Welch (Computing Laboratory, University of Kent)
Jeremy M.R. Martin (Oxford Supercomputing Centre)
Abstract: Java threads are synchronised through primitives based upon monitor concepts developed in the early 1970s. The semantics of Java's primitives have only been presented in natural language -- this paper remedies this with a simple and formal CSP model. In view of the difficulties encountered in reasoning about any non-trivial interactions between Java threads, being able to perform that reasoning in a formal context (where careless errors can be highlighted by mechanical checks) should be a considerable confidence boost. Further, automated model-checking tools can be used to root out dangerous states (such as deadlock and livelock), find overlooked race hazards and prove equivalence between algorithms (e.g. between optimised and unoptimised versions).
Practice:CSP: Arriving at the CHANnel Island (an Industrial Practitioner's Diary: in Search of a New Fairway)
Oyvind Teig (Navia Maritime AS, division Autronica, Norway)
Abstract: This paper is a non-academic hands-on log of practical experiences of software engineering problems, where the process leading to the decision to use CSP to program real-time systems is on trial. A new and hopefully objective decision process is instigated. What we have previously learnt by using CSP in embedded real-time process control is used as subjective basis (or bias?) for the new process. The conclusion seems to be that CSP should be sufficiently future-proof to justify its use even in new projects. The term "CSP" is here used as shorthand for both the CSP language proper and different implementations of subsets.
Hardware Infrastructure:Synchronisation in a Multithreaded Processor
Shondip Sen, Henk Muller and David May (Computer Science, University of Bristol)
Abstract: A multithreaded architecture exploits instruction level parallelism by interleaving instructions from disjoint thread contexts. As each thread executes within its own instruction stream with private data (the context registers), there is no interdependency between instructions from different threads. This allows high resource utilisation of a super scalar pipelined processor at a very low cost, in terms of complexity and silicon area. A new synchronisation mechanism for a multithreaded architecture is outlined. Two new instructions have been introduced to perform one to one and n-way synchronisation. The operation allows synchronisations to be requested and actioned efficiently on chip in as little as four clock cycles. Barriers and CSP style channels can easily be constructed with this new synchronisation instruction. A brief examination of performance of this multithreaded architecture shows that the optimum number of contexts per multithreaded processing element is four, based on test programs.
Abstract: Multithreaded architectures have been developed as a way to hide latencies in memory access, communication, and long pipelines. Caches have been developed to hide latencies and reduce memory bandwidth requirements. Caches do not work well in multithreaded environments, because threads unintentionally evict each others data and instructions. To enable effective use of caches in a multithreaded environment (giving high execution speed even in the context of high memory latencies), we propose to use a cache architecture where the cache can be divided into partitions. Each thread is assigned a set of partitions which are used to cache a view of data structures, or part of the instruction stream. The partition assignment is completely automated in the compiler. With our compiler and architecture, all forms of interference are eliminated and predictable execution of multithreaded programs is achieved in moderately sized caches.
Abstract: This paper shows how the parallel occam code for a graphics application has been compiled to run on a Field Programmable Gate Array (FPGA). This application has stringent timing constraints, and many optimisations to the sequencing of operations and occam constructs have been employed to meet them. In particular, two crucial pipelined processes are shown to operate with no control overhead despite containing parallel and looping constructs and channel communications.
Software Infrastructure:Distributed Computing using Channel Communications in Java
A. Ripke, Alastair R. Allen and Y. Feng (Department of Engineering, University of Aberdeen, Scotland)
Abstract: Various methods of connecting CSP channels to external software systems are examined. The aim is to facilitate the implementation of distributed heterogeneous systems using channel communication. The paper concentrates on TCP/IP connections in a Java environment. The approaches used are: custom protocol; object serialization stream protocol; Remote Method Invocation (RMI); Common Object Request Broker Architecture CORBA). Practical systems are described and compared.
Abstract: This paper describes the implementation of an extended version of Linda in Java. The extensions have been made with a view to increasing the efficiency of the underlying communication mechanisms and the flexibility with which data may be accessed, particularly in the area of distributed multimedia applications. The system is compared with two other recent implementations of Linda for Java: JavaSpaces from Sun Microsystems, and TSpaces from IBM. The comparison is performed both qualitatively, comparing and contrasting the features of the systems, and quantitatively, using a simple communication benchmark program and a ray-tracing program to assess the performance and scalability of the different systems for networks of workstations.
Abstract: In this paper the behavior of conditional communication in the presence of priority will be described. In the theory of CSP, conditional communications are expressed by a (external) choice construct (also known as the alternative construct) by which the choice of one communication out of a set of communications is non-deterministic. In practice, some communication can be more important than others, so that, the choice is deterministic. Therefore, one can distinguish two prioritized alternative constructs; a symmetric alternative construct (ALT) and an asymmetric alternative construct (PRIALT). Current formal semantics and implementations of the ALT and PRIALT are based on the priorities of communication and not related to the surrounding priorities of communicating processes. This can result in a structural mismatch that can cause performance problems. In this paper a practical solution for realizing fair and unfair conditional communication in the presence of the PAR and the PRIPAR will be discussed.
Abstract: Java provides support for concurrent and parallel programming through threads, monitors, and its socket and Remote Method Invocation (RMI) classes. However, there have been many concerns expressed about the way in which this support is provided -- e.g. [Brinch-Hansen][Welch], citing problems such as an improper implementation of monitors and the difficulty of programming with threads. Hoare's Communicating Sequential Processes (CSP) model fully describes thread synchronization and is based on processes, composition and channel communication. It provides a mathematical notation for describing patterns of communication using algebraic expressions and contains formal proofs for analyzing, verifying and eliminating undesirable conditions, such as race hazards, deadlocks, livelock, and starvation. Two independent research efforts provide a CSP based process-oriented design pattern for concurrency implemented in Java: Java Communicating Sequential Processes (JCSP) and Communication Threads in Java (CTJ). In this paper, we compare these two packages, looking at the philosophy behind their development, their similarities, their differences, their performance and their use.
Abstract: This paper describes an extension to Kent Retargetable occam Compiler (KRoC), which enables the execution of a blocking call, without blocking the occam-kernel. This allows a process to make a blocking system call (eg, read, write), without blocking processes running in parallel with it. Blocking calls are implemented using Linux clones which communicate using shared memory, and synchronise using kernel level semaphores. The usefulness of this is apparent in server applications with a need to handle multiple clients simultaneously. An implementation of an occam web-server is described, which uses standard TCP sockets via an occam socket library. The web-server comes with the ability to execute CGI scripts as well as dispensing static pages, which demonstrates some level of OS process management from within occam.
Abstract: An experimental version of KRoC has been written that implements recursion in occam, using Brinch Hansen's algorithm for allocating activation records. It shows that efficient parallel recursion is possible in occam.
Abstract: A simple post-mortem debugging facility has been added to KRoC, to identify and locate run-time errors, including deadlock. It has been implemented for the SPARC/Solaris and i386/Linux versions of KRoC.
Abstract: The JCSP library provides a superior framework for building concurrent Java applications. Currently, CSP is a collection of classes that uses the standard Java Threads mechanism to provide low-level facilities such a process scheduling and synchronization. The overheads of using Java Threads can be quite large though, especially for synchronization and context switching.
Abstract: A C library is described which exists to support synchronisation and communication in the CSP model using the Posix Threads API as its basis. This differs from other notable approaches (e.g. [Moores:1999]) in that it uses Posix Threads. Whilst this approach has drawbacks, its main achievement is being easily adopted by people already familiar with Posix Threads and being useful to a wide range of target platforms.
Tools:The Automated Serialization of Concurrent CSP Scripts using Mathematica
Weiyang Zhou and G.S. Styles (Electrical and Computer Engineering, Utah State University)
Abstract: This report discusses the design and implementation of a package of Mathematica-based tools for serializing scripts describing the behavior of concurrent systems in CSP. Under some conditions, serialization can yield code that is more efficient than multi-process versions and can better meet processor constraints. Under most conditions concurrent design is easier and more reliable so we prefer to design concurrently and only implement serially when necessary. The tools include a set of definitions and procedures to automatically change concurrent code to serial code. Conversions of the following step laws have been implemented thus far: external choice, generalized parallel, and alphabetized parallel; additional laws will be implemented in the next month or so. The tools support the input of the more formal typeset notation of CSP, which makes the script writing intuitive, and can convert the typeset version to the machine-readable version required by software such as FDR. The correctness of the conversions has been checked with the FDR package. This package should be of use to both those regularly working on concurrent systems and those learning CSP.
Abstract: Conventional model-checking techniques for analysing concurrent systems for deadlock or livelock are hampered by the problem of exponential state explosion: the overall number of global states that needs to be checked may grow exponentially with the number of component processes in the system. The state-of-the-art commercial tool FDR provides deadlock and livelock analysis but it is limited at present to analysing systems of up to one hundred million global states.
Abstract: The CSP paradigm is known as a powerful concept for designing and analysing the architectural and behavioural parts of concurrent software. Although the theory of CSP is useful for mathematicians, the programming language occam has been derived from CSP that is useful for any engineering practice. Nowadays, the concept of occam/CSP can be used for almost every object-oriented programming language. This paper describes a tree-based description model and prototype tool that elevates the use of occam/CSP concepts at design level and performs code generation for the level of implementation in Java, C/C++, and machine-readable CSP. The tool is a kind of browser that is able to assist modern workbenches (like Borland Builder, Microsoft Visual C++ and 20-SIM) with coding concurrency. The tree description model can be used to browse through the generated source code. The tool will guide the user through the design trajectory using supporting messages and several semantic and syntax rule checks. The machine-readable CSP can be read by FDR, enabling more advanced analysis on the design. Early experiments with the prototype tool show that the browser concept, combined with the tree description model, enables a user-friendly way to create a design using the CSP concepts and benefits. The design tool is available from our URL, http://www.rt.el.utwente.nl/javapp.
Applications:A Self-Configuring Distributed Kernel for Satellite Networks
Scott Cannon and Larry Denys (Space Software Laboratory, Utah State University)
Abstract: The Space Software Laboratory is developing a self-configuring distributed kernel to be used on future satellite missions. The completion of this system will allow a network of heterogeneous processor nodes to communicate and broadcast in a scalable, self-configuring manner. Node applications software will be transparent to the underlying network architecture, message routing, and number of network nodes. Nodes may halt or be reset and later rejoin the network. The kernel will support in-flight programming of individual nodes.
Abstract: Computational steering is the ability to visualise the data from a computation in progress and to modify the future behaviour of the computation in response to this. It is often perceived as being something very difficult to implement, especially for parallel computations. However, given a good visualisation environment, we have found that this is not necessarily the case. We have sought to dispel this myth, using a very simple model which makes it easy to 'wire-up' an existing MPI parallel program for steering. New insights may quickly be gained by continually monitoring and guiding the progress of computational simulations, that were perhaps previously analysed only in their final state.
Abstract: This paper summarises the results of the realisation of a Cruise Control system in occam using an implementation of the Kent Retargettable occam Compiler (KRoC) on the Philips 8051 microcontroller.
Page last modified on 20th March 2001
Pages © WoTUG, or the indicated author. All Rights Reserved.
Comments on these web pages should be addressed to: www at wotug.org