2001 Proceedings details
Title: Communicating Process Architectures 2001
Editors: Alan G. Chalmers, Majid Mirmehdi, Henk Muller
Publisher: IOS Press, Amsterdam
ISBN: 1 58603 202 X
Papers:
Title: |
Efficient Execution of Process Networks
|
Authors: |
T. Basten, J. Hoogerbrugge |
Abstract: |
Kahn
process networks (KPNs) [1] are a popular modeling technique for media-
and signal-processing applications. A KPN makes parallelism and
commu-nication in an application explicit; thus, KPNs are a modeling
paradigm that is very suitable for multi-processor architectures. We
present techniques for the efficient ex-ecution of KPNs, taking into
account both execution time and memory usage. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Copying, Moving and Borrowing Semantics
|
Authors: |
David May, Henk Muller |
Abstract: |
In
this paper we discuss primitives for mobilising code and
communications. We distinguish three types of semantics for mobility:
copying (where an identical copy is created remotely), moving (where
the original is destroyed), and borrowing (where the original is moved
to the target and back to where it came from at defined moments). We
discuss these semantics for mobile code and mobile channels. We have
implemented Icarus, a language that uses borrowing semantics for mobile
code (the on-statement) and moving semantics for mobile channels (first
class channels). |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Files: | PDF |
Title: |
Parallel Genetic Algorithms to Find Near Optimal Schedules for Tasks on Multiprocessor Architectures
|
Authors: |
M. Moore |
Abstract: |
Parallel
genetic schedulers (PGS) are applied to a combinatorial optimisation
problem, the scheduling of multiple, independent, non-identical tasks.
The tasks are functionally partitioned and must be distributed over a
multicomputer or multiprocessor system. As each task completes
execution, a result message must be communicated. Communication occurs
over a shared bus. This problem is known to be NP-complete [1]. The PGS
execute on a shared memory multiprocessor system and on a simulated
SIMD torus. Schedules produced by the PGS are compared to each other,
to those found by an exponential-time optimal branch and bound
algorithm, and to those found by a linear-time opportunistic algorithm.
The PGS produce extremely accurate schedules very quickly. When the PGS
are executed with increasing numbers of processors, near linear
speedups are obtained with no decrease in the quality of the resulting
schedules. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Adapted OS Link / DS Link Protocols for Use in Mutliprocessor Routing Networks
|
Authors: |
S. Triger, Brian C. O'Neill, S. Clark |
Abstract: |
Inter-processor
communications play a vital role in the performance of distributed
parallel networks. This document analyses various protocol options for
high speed, serial, inter-processor communications for use in embedded
multiprocessor systems. The protocol is used to pass information in a
fault tolerant routing network. The work has resulted in a custom
processor interface, which forms the gateway between the processing
node and the distributed communications network, building on previous
work. The role of the interface is to provide maximum hardware support
for communications between the network and the processor. This
alleviates the processor from having to oversee communications
transactions and allows it to concentrate on program execution. The
research focuses on protocol alterations aimed at improving the fault
tolerant aspects of the design when dealing with various forms of
network failure. By improving the fault tolerance of the network,
communications bottlenecks can be avoided and data throughput can be
maximised. Previously, communications across the network had used the
OS Links protocol and, experimentally, DS Links. These were analysed
and adapted to provide the current communications protocol, which is
compared with these protocols. This protocol is more efficient, and
helps to provide many features to ensure data integrity. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Successes and Failures: Extending CSP
|
Authors: |
Adrian E. Lawrence |
Abstract: |
Standard
CSP, timed or untimed, does not include a general treatment of
priority, although the PRI ALT constructor is an essential part of
occam and hardware compilation languages based upon occam. This is a
revised version of the original paper which introduced CSPP, an
extension of CSP incorporating priority. CSPP is defined by a novel
denotational semantics, Acceptances, based on Successes rather than the
usual Failures. The idea is to characterise a process by what it
successfully accepts, rather than by what it refuses to do. In the
light of experience, it might better have been called 'Responses'. The
original Acceptances was exploratory, and tried to avoid constraining
the sorts of systems, particularly circuits, that could be described.
Experience has shown that it can be substantially simplified at very
little cost. A new notation makes it much easier to follow, especially
for the non specialist. This revision of the original introduction
presents the simplified CSPP while retaining most of the motivational
material. It is intended to have something of a tutorial flavour: three
other papers, are more condensed, and deal with more technical matters.
But the core semantics is common to all four. CSPP provides a rigorous
comprehensible and simple foundation for compositional
hardware-software codesign. HCSP is a further extension which includes
extra facilities needed to describe certain circuits. And a further
radical extension lifts the usual restrictions of timed CSP, and
describes continuous analogue phenomena. CSPP was first presented
informally at the Twente WoTUG--20 technical meeting. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
CSPP and Event Priority
|
Authors: |
Adrian E. Lawrence |
Abstract: |
CSPP
is an extension of CSP which includes priority as used in standard
occam. The occam community has often discussed whether those notions
are really adequate, a particular concern being the difficulties
associated with priority inversion. One idea is to give priority to
sets of events considered independently of the processes that perform
them. We call this 'event priority'. Event priority is static in this
presentation. But it is possible to handle dynamic priority using a
global synchronisation when the 'event priority' changes. Obviously
there is no problem for a software system with a central scheduler, but
the theory here is addressing a far wider class of systems, in
particular massively parallel, widely distributed, implemented in
either hardware or software or both. It may be that some higher level
of abstraction should replace priority: priority is a mechanism for
achieving certain properties, often relating to time and limited
resources. Here we content ourselves with finding a formal description
of a language including event-priority. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Infinite Traces, Acceptances and CSPP
|
Authors: |
Adrian E. Lawrence |
Abstract: |
There
is a long standing problem when infinite traces are included in
denotational semantic models of CSP. Full models fail to be Complete
Partial Orders under refinement. This paper introduces a novel, but
entirely natural, way of representing infinite behaviour in which
refinement is a Complete Partial Order when the alphabet of events is
finite. Acceptance semantics also solves the problem of infinite
behaviour with an infinite alphabet. That requires a different
construction based on a metric space and will be described elsewhere. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
The `Uniform Heterogeneous Multi-Threaded' Processor Architecture
|
Authors: |
D. Towner, David May |
Abstract: |
Multi-threaded
processor architectures are capable of concurrently execut-ing multiple
threads using a shared execution resource. Two of their advantages are
their ability to hide latency within a thread, and their high execution
efficiency. Un-fortunately, single thread performance is often poor. In
this paper we present a simple model of a multi-threaded processor, and
show how an occam-like language may be compiled into fine grained
threads suitable for executing on this processor. These fine grained
threads allow all but the most serial programs to be compiled into
multiple threads. Thus, poor single thread performance is avoided by
ensuring that sufficient threads are always available, even at the
instruction level. We call this technique ‘uni-form heterogeneous
multi-threading’ (UHM). A compiler implementing UHM has been built,
along with a cycle accurate simulator of a UHM processor. We
demon-strate that the processor is capable of good performance, whilst
being simple to design and build. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Event-Based Design of Concurrent Programs with Java Implementation
|
Authors: |
H. Rischel, H. Sun |
Abstract: |
A
systematic design approach to safety-critical systems is introduced by
means of the Production Cell case study. The design is documented using
CSP-style processes, which allow verifications using formal techniques,
as well as programming in Java using the JCSP library. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Using Two-, Four- and Eight-Way Multiprocessors as Cluster Components
|
Authors: |
Brian Vinter, Otto J. Anshus, Tore Larsen, John Markus Bjørndalen |
Abstract: |
This
work considers the pros and cons of different size SMPs as nodes in
clusters. We investigate the Intel SMP architecture and consider the
potential of and some problems with larger node-sizes in clusters of
multiprocessors. Six applications that represent different classes of
parallel applications are developed in versions that support both
shared and distributed memory. Performance measurements are done on
three different clusters of multiprocessors, with the purpose of
identifying how the number of processors in each SMP node impacts the
cluster performance. Our results show that clusters using higher order
SMPs do not give a clear performance benefit compared to clusters using
two-way SMPs. Off the bench mark-suite of six applications, the
performance of two turn out to be independent of node-size, two show an
advantage of larger node-sizes, as much as 34 percent improvement of
eight-way nodes over a dual-system, while the remaining two show an
advantage of dual-processor nodes as big as 11 percent over an
eight-way cluster. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Guarenteed Message Delivery Time on Real-Time Distributed Systems
|
Authors: |
T. -Y. Yang, G. S. Stiles |
Abstract: |
Real-time
systems require guaranteed timely delivery of messages; failure to
deliver a message on time may result in failure of the system and
possible damage to life and property. We describe here an extension and
implementation of an algorithm developed by Kandlur et al. for
guaranteed message delivery. We extend this by adding two-phase
randomized routing in the channel establishment procedure; this scheme
requires each route to go first to a randomly chosen intermediate node,
and only then to its actual destination. This scheme balances the load
demonstrably better than direct routing, with respect to the likelihood
of acceptance of each individual channel. Given certain constraints on
the generation and size of messages, it is possible to schedule those
messages such that messages arrive within their desired deadlines.
Experiments on an 8-node network demonstrate the feasibility of the
approach, and provide verification of Kandlur's algorithm. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
A Programming Language for Hardware/Software Co-Design
|
Authors: |
D. R. Watt, David May |
Abstract: |
We
have developed a programming language that allows programs to be
expressed as single specifications in which any number of processes may
be tagged for hardware compilation and the rest are compiled into
software. We introduce a number of novel transformations that may be
arbitrarily applied to an occam process in order to decompose it into
two semantically equivalent concurrent processes. Our compiler targets
hardware by compiling one of these processes into a field programmable
gate array and the other into x86 object code. Furthermore, the
compiler integrates a specialised communications protocol between the
two programs that consists of a full-duplex channel implementation,
multiplexor and buffers that are dependent on the program structure and
that guarantee all external communications are free from deadlock. We
demonstrate the elegance of our language and the power of our compiler
on a small benchmark program. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
A Reconfigurable Host Interconnection Scheme for Occam-Based Field Programmable Gate Arrays
|
Authors: |
Roger M. A. Peel |
Abstract: |
This
paper reports on the development of an interconnection scheme for
field-programmable gate arrays (FPGAs). These FPGAs may be programmed
in the Occam parallel programming language. Now, not only may the
inter-process communication channels provided by Occam be used on-chip,
but they may also be extended to a host processor using the ubiquitous
Universal Serial Bus (USB). Bidirectional channels of BYTEs are carried
along this bus to a host processor (running Linux) where they are
presented to application code using a device driver that provides
similar capabilities to the standard B004 card link driver. A
unidirectional end-to-end throughput between Linux processes and FPGA
processes, across USB, has been measured as high as 1025 kbytes/sec,
although this rate is only achieved in favourable circumstances.
Similarly, 410 kbytes/sec may be transferred in both directions
simultaneously. Unidirectional transmission rates of more than 600
kbytes/sec, and bidirectional rates of 175-300 kbytes/sec in each
direction may be achieved in a wide range of circumstances. The paper
presents a range of performance figures, explaining which are limited
by the underlying characteristics of the USB bus and which are caused
by the current implementation. By implementing a transputer OS-Link in
the FPGA, it is possible for a USB- enabled computer to communicate
with a network of transputers, providing a convenient - and potentially
faster - alternative to previous methods. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
A 40 Gbit/s Network Processor Design Platform
|
Authors: |
R. McConnell, P. Winser |
Abstract: |
As
the Internet evolves, the rapidly increasing demand for bandwidth is
matched by a greater need for more intelligence with which to manage
and meter the flow of that data to sustain economic growth.
Conventional processing architectures and hardwired point solutions are
not suited to these conflicting demands; there is an emerging need for
a new approach for this data flow processing problem. This paper
presents ClearSpeed's integrated Network Processor design platform that
embodies many different levels of parallel processing. Designed to
balance the bandwidth needs with programmability we introduce the MTAP
architecture. An area and power efficient, fine-grained, scalable,
multi-threaded parallel processor, designed with a 'bandwidth-centric'
architecture and programmed in C. Based on the ClearConnect™ bus, an
SoC communication architecture with VCI compliant interfaces, a
high-bandwidth system architecture including a number of hardware
accelerator units is also described. An example 40Gbit/s programmable
and scalable classifier/forwarder is presented, embodying the concepts
of the platform. To complete the picture, a comprehensive suite of
software and hardware development tools is described. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Working Towards the Agreement Problem Protocol Verification Environment
|
Authors: |
James S. Pascoe, Roger J. Loader, V. S. Sunderam |
Abstract: |
This
paper proposes the Agreement Problem Protocol Verification Environment
(APPROVE) for the automated formal verification of novel solutions to
agreement problems in group communication systems. Agreement problems
are characterized by the need for a group of processes to agree on a
proposed value and are exemplified by group membership, consensus and
fault-tolerance scenarios. Due to their fundamental role, it is
important that the correctness of new agreement algorithms be verified
formally. In the past, the application of manual proof methods has been
met with varying degrees of success, suggesting the need for a less
error prone automated approach. An observation concerning previous
proofs is that often a significant amount of effort is invested in
modeling themes common to all such proofs, albeit using different
formalisms. Thus, the APPROVE project aims to address these issues, its
envisaged culmination being a usable software framework that exploits
model re-use wherever possible. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Working towards a successor to occam
|
Authors: |
Ian R. East |
Abstract: |
occam
[1] offers features and attributes that make it unique among
programming languages, particularly in the ease and security with which
one may program concurrency. After a brief summary of occam's
strengths, possible additional features are discussed, including
recursion, source code modularity, exception response, and the
automatic avoidance of deadlock. Consideration is then given to the
inclusion of passive ('data') objects and the possibility of their
movement between processes. Transfer primitives are proposed, alongside
assignment and communication. Discussion is presented with regard to
the potential for a new programming language, building on occam, while
preserving its security and simplicity. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Mobile Data, Dynamic Allocation and Zero Aliasing: An occam Experiment
|
Authors: |
Frederick R. M. Barnes, Peter H. Welch |
Abstract: |
Traditional
imperative languages (such as C) and modern object-oriented languages
are plagued by uncontrolled resource aliasing problems. Add in
concurrency and the problems compound exponentially. Improperly
synchronised access to shared (i.e. aliased) resources leads to
problems of race-hazard, deadlock, livelock and starvation. This paper
describes the binding into occam (a concurrent processing language
based on CSP) of a secure, dynamic and efficient way of sharing data
between parallel processes with minimal synchronisation overheads. The
key new facilities provided are: a movement semantics for assignment
and communication, strict zero-aliasing, apparently dynamic memory
allocation and automatic zero-or-very-small-unit-time garbage
collection. The implementation of this mechanism is also presented,
along with some initial performance figures (e.g. 80ns for mobile
communication on an 800 MHz Pentium 3). With occam becoming available
on a variety of microprocessors for GUI building, internet services and
small-memory-footprint embedded products, these capabilities are
timely. Lessons are drawn for concurrency back in OO languages - and
especially for the JCSP (CSP for Java) package library. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Files: | PS, PDF |
Title: |
tranx86 -- An Optimising ETC to IA32 Translator
|
Authors: |
Frederick R. M. Barnes |
Abstract: |
This
paper describes tranx86, a program which converts Extended Transputer
Code (ETC) from a modified Inmos occam compiler, into IA32 code for
execution on the Intel i386 family of processors within the KRoC/Linux
system. Several optimisations are employed in an attempt to maximise
performance on this family of processors, including optimisations in
the CCSP run-time kernel. These include a graph-colouring type register
allocation scheme and various inlining of code. While tranx86 is mostly
architecture dependent, effort has been made to allow the use of
arbitrary schedulers, although currently CCSP is the only fully
supported one. Various benchmark programs are used to compare the
performance of this translator with the old system, giving significant
time wins in some cases. For the commstime benchmark program on an 800
MHz Pentium-3, the old KRoC/Linux system gave 233 ns per communication
(2 context switches); the new one, with optimisations and inlining,
gives 67 ns per communication -- more than a 3-fold reduction in
overheads. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Files: | PS, PDF |
Title: |
From Safe Concurrent Processes to Process-Classes? PLUSSING New Code by ROLLING out and Compile?
|
Authors: |
Øyvind Teig |
Abstract: |
This
article expands a concurrent language to support implementation
inheritance by making block structures of the super process-class
pluggable, and to interface inheritance by making the language's
protocol inheritable. The parallel 'object-based' concurrent language
occam 2 has been used as a catalyst for the concept, causing the
language in fact to become (almost?) 'object-oriented' (OO). The result
is white-box reuse between a 'process-class' and its sub process-class,
and black-box reuse seen from the client side. Since occam is aliasing
free and considered a 'safe' concurrent language, the expansion we
discuss here keeps those properties - somewhat unusual for an OO
system. This new language should be well suited in safety critical
systems, since it has inherited the static (compile-time) and
analysable properties from occam proper. Basically, two new keywords
are defined: PLUSSING and ROLLING. The language feature suggestion is
on sketch level only and therefore not complete, no BNF description
exists and no compiler has been written. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
CHANnels to Deliver Memory? MOBILE Structures and ALTing over Memory?
|
Authors: |
Øyvind Teig |
Abstract: |
Memory
objects are assigned to processes over a CHANnel like construct. This
way one can wait for an object indefinitely, or with timeout in an ALT
construct - coexisting with CHANnel inputs. The run-time SYSTEM will
handle requests. Alternatively, a user memory handler process may use
the underlying SYSTEM and serve other clients. Occam 2 is used as
catalyst language. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Protocol Verification in Millipede
|
Authors: |
Jan B. Pedersen, A. Wagner |
Abstract: |
In
this paper we present the MOPED module of the Millipede debugging
system. Millipede is a multi-level debugging sytem for parallel message
passing programs. MOPED allows the user to specify a protocol to which
the communication of the program should adhere, and authomatically have
all the messages sent in the system checked against the protocol. The
specification language is small and easy to use, yet powerful enough to
specify a wide range of protocols. Program variables can be passed
easily to the verification module, allowing the construction of mode
dynamic protocol specifications. Protocols can be specified
incrementally, starting out very general working towards a more complex
specification. Finally, the verification module can be run either
online, that is, while the application is executing, or offline, using
log files generated when the application was executed. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
Title: |
Towards a Viable Alternative to OO -- Extending the occam/CSP Programming Model
|
Authors: |
Tom Locke |
Abstract: |
Object
orientation has become the de facto standard for large scale, general
purpose software engineering. In this paper various aspects of object
orientation that are against good software engineering practice are
highlighted. It is then argued that a communicating process model
provides a better platform for component based programming without the
discussed pitfalls of OO. At the same time, current CSP based
programming technology is shown to be seriously lacking when measured
against certain aspects of object oriented languages. This paper is
chiefly a discussion of ideas, ideas about extensions to the occam/CSP
programming model that could advance the paradigm to the point where it
provides a viable alternative to object orientation for general
purpose, large scale software engineering. Specifically, three ideas
are discussed: mobile processes, polymorphism and routable variant
channels. |
Bibliography: |
Web page:BibTEX,
Refer
Plain text: BibTEX,
Refer
|
|