WoTUG - The place for concurrent processes

Refer Proceedings details

%T TROS: A Real Time Kernel for a Fault\-Tolerant Multi\-Processor Computer Based on Argument Flow
%A Eric Verhulst, R. Lauwereins, R. Cuyvers, J. Peperstraete
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X The design and realisation of a fault tolerant load
   balancing real time kernel for a multi\-transputer system is
   considered. The system will be able to recover from software
   as well as from hardware failures. This is made possible by
   applying the argument flow program organisation. Argument
   flow programs are a mixture of normal control flow at the
   lowest level and of data flow at the higher levels. Hence,
   load balancing can be executed automatically.In the next
   part, the architecture of the runtime kernel and the use of
   argument flow is considered.

%T Dynamicity through Occam and TDS
%A D. Millot, J. Vautherin
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X A parallel program running on a parallel machine involves a
   logical network of processes and a physical network of
   processors. When both networks are known at design time, a
   static mapping of the logical network on the physical one
   can take place. Dynamicity arises when one of the networks
   is not entirely determined before execution starts. This can
   happen for instance when the logical network involves
   dynamic creation of processes, or when the physical topology
   cannot be defined at design time (this is the case in a
   multi\-user context, as the configuration a user gets is
   affected by the other users\[rs] computations, or when the
   programmer wants to develop a generic program that can be
   executed on any physical topology).When using Occam to
   design a network of processes that has to be mapped on
   transputers, things have to be decided before execution
   starts :\- the network of processes should be composed of
   an explicitly bounded number of processes due to the fact
   that, unlike CSP, Occam does not allow recursion,\- the
   topology of the transputer network should be known, for
   processes are explicitly mapped onto processors. One has to
   investigate the physical topology in order to write the
   configuration statements, and a change in the topology
   entails modifications in these statements. An application is
   therefore dedicated to a configuration. It would
   nevertheless be very attractive to design software that
   could run on an unidentified topology, trying to make the
   best of available processors. Such software has to realize a
   dynamic placement and therefore includes a phase supposed to
   investigate the configuration, load the different codes on
   the appropriate processors, then dynamically start their
   execution.Our aim is to develop suitable tools to write that
   kind of application. As a first step, we investigate the
   potentiality of Occam and TDS in this field.

%T A CASE Tool for Designing Deadlock\-Free OCCAM Programs
%A W. D. Crowe, R. Hasson, P. E. D. Strain-Clark
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X This paper describes a Computer Aided Software Engineering
   (CASE) tool aimed at producing correct OCCAM programs. We
   use variants of CSP and OCCAM2 notation , a strong form of
   protocols on channels and standard forms for processes in
   order to lessen the combinatorial problems which arise while
   investigating deadlock. The CASE tool is graphical in nature
   and two versions of it are currently being implemented. One
   version is written in PROLOG and runs on Mackintosh
   computers while the another is written in a mixture of
   PROLOG and C and runs on Sun workstations.

%T A Deadlock Detection Tool for Occam
%A Wouter Joosen, Pierre Verbaeten
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X A problem that occurs frequently during the development of
   parallel programs is deadlock.In the domain of transputer
   technology (and of parallel computing in general) run time
   debugging software still is a problem. This increases the
   value of verification tools based on static analysis, even
   if the functionality is sometimes limited.In this paper we
   present our approach to static analysis. The analyzer
   reduces an occam program to the relevant actions hi the
   context of this problem (communication, possibly through
   guards: PAR constructions...), and subsequently examines the
   program, reporting possible problems that could occur during
   real execution. The tool goes beyond the purpose of
   detecting deadlocks only: other infinite wait situations are
   also reported.

%T Towards a Software Architecture for Solid Modelling Systems on Processor Networks
%A J. R. Davy, Peter M. Dew, Nick Holliman, D.P. Mallon, Alan de Pennington
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X Our experience programming transputer\-based systems for the
   visualisation of solid models has highlighted the need to
   separate the issues of program decomposition, task and data
   scheduling and low level communication. In this paper we
   propose a three level software architecture for
   multicomputer to support geometrical computations. The
   architecture takes into account current and emerging
   hardware to support communications and scheduling.

%T An Irregular Distributed Simulation Problem with a Dynamic Logical Process Structure
%A Ming Q. Xu, Stephen J. Turner, Nie Pin
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X This paper describes a modeling problem which exhibits many
   features of more advanced distributed simulation: the
   simulation of biological population dynamics —
   "Host\-parasite interactions". It is a
   dynamical simulation in which certain species of hosts and
   parasites live, move randomly, breed and (as far as
   parasites are concerned) infect the hosts in a
   two\-dimensional ocean. Apart from its relevance to
   realistic biological studies, this simulation program does
   serve to illustrate many crucial ideas in dynamic time and
   event driven simulations. Our approach to the parallel
   implementation of this simulation requires the simulation
   objects (in this case, hosts and parasites) to be organised
   as LPs (Logical Processes) which can be created and
   destroyed dynamically at run time to reflect the birth/death
   of these simulation objects. In addition to their dynamical
   features, LPs must preserve the temporal aspects of the real
   world system. In other words, a global ordering of LP
   interactions (referred to here as actions) must be ensured
   to preserve the causality principle [1]. Also, the
   computational load can become imbalanced because the real
   world system or rather, the distribution of hosts and
   parasites in the underlying space is changing with time. The
   methods for counteracting this dynamical load imbalance will
   be described.We shall begin with the specification of the
   dynamical rules governing the behaviour of the hosts and
   parasites during the simulation.

%T A Generally Configurable Multigrid Implementation for Transputer Networks
%A Osama El-Giar, Tim Hopkins
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X This paper describes the performance of a multigrid method
   implemented on a transputer\-based architecture. We show
   that the combination of fast floating\-point hardware, local
   memory and fast communication links between processors
   provide an excellent environment for the parallel
   implementation of multigrid algorithms. The gain in
   efficiency obtained by increasing the number of processors
   is shown to be nearly linear and comparisons are made with
   published figures for a parallel multigrid Poisson solver on
   an Intel iPSC 32\-node hypercube.

%T Self\-Adjusting Mapping: A Heuristic Mapping Algorithm for Mapping Parallel Programs onto Transputer Networks
%A Hong Shen
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X The problem of mapping parallel programs onto multiprocessor
   system is a fundmental problem of great significance in
   parallel processing, but it is NP\-hard in general. In this
   paper we propose a fast heuristic algorithm to solve this
   problem on transputer networks. Our mapping algorithm
   consists of three modules: grouping, placement and routing,
   where grouping groups processes in the program into tasks
   which can be placed onto processors in the transputer
   network in a way of one\-to\-one, placement places the
   grouped tasks onto the processors and routing produces
   physical communication paths for logical communication
   requirements. The three modules work co\-operatively in a
   way of progressive self\-adjusting, and finally produce a
   satisfactory solution for the mapping problem.

%T The Investigation of Communications Patterns in Occam Programs
%A Rosemary Candlin, Qiangyi Luo, Neil Skilling
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X The performance of a concurrent computation running on a
   multiprocessor system may depend critically on the way in
   which the program is decomposed and placed on the machine.
   In order to exploit the potential of parallel processors, it
   is necessary to balance the advantage of spreading the
   computational load as thinly as possible over the
   processors, with the disadvantage that increased
   communication delays may slow down the computation. In
   general, there is no satisfactory theoretical model of the
   complex interaction between the amount of computation
   carried out by the individual processes, their frequency of
   communication and the topology of the underlying machine.
   For many programs, it is not easy to see in advance how
   computation will interact with communication, and placement
   strategies which depend only on a static analysis of the
   program structure may not be sufficient. The work described
   here is an attempt to provide useful tools for the occam
   programmer which can be used to investigate communications
   patterns, and to explore different configurations rapidly.We
   think that this approach will be particularly valuable for
   programs which can be decomposed in a natural way into a
   fairly large number of top\-level occam processes, so that
   the preliminary parallelization arises out of the nature of
   the application, and the main problem is to place these
   processes on a smaller number of physical processors. This
   is often the case for programs which model real\-time
   systems, and we have taken as an example an application from
   chemical engineering. In programs like this, there is
   natural concurrency in the real world which can be easily
   represented in terms of occam processes. At the moment, we
   do not attempt to extract parallelism automatically, or
   handle shared data, though there are a number of systems
   that have tackled these problems (see, for example [1] and
   [2]). Our main aim at this stage is to provide a programmer
   with profiling tools, and see to what extent they can help
   to produce an efficient implementation of the program.

%T A System Configuration for very large Database Problems [Extended Abstract]
%A Alan G. Chalmers, Derek J. Paddon
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X In the past many applications have ensured success by
   restricting the size of the application, or by increasing
   the number of processors and memory size to enable the full
   database to be supported. Here, we specify that databases of
   arbitrary sizes should be supported and not be restricted by
   the memory size of individual processors.The ability to cope
   with very large databases was easily achieved in many of the
   early MIMD systems by using a shared memory model. However,
   the transputer and Occam process model restricts us from
   using this approach, instead we may share data [7].Unlike
   shared memory systems, we cannot globally address data in a
   message passing system. However, if data items carry unique
   identifiers, we can share single or multiple copies of those
   data items across many processors. Indeed, adopting this
   system of shared data reference allows us the same memory
   flexibility for read\-only data, as would be obtained in a
   shared memory system, without the bus contention problems
   associated with that class of processor. In its degenerate
   form, a shared data system has only private data, which is
   never available at any other processor. The simple processor
   farm of May and Shepherd [8] is a typical example, where
   data and tasks are assigned to specific processors without
   the need for data to migrate to other processors. In many
   applications, such as the ray tracing of very complex
   computer images, a static allocation of data is
   inappropriate. Here, a database is managed at each node in a
   similare manner to a cache memory. Shared data systems for a
   tree based system architecture, and for very large data base
   problems are described by Green, Paddon and Lewis [7], and
   Green and Paddon [3, 4, 5, 6], where these systems were
   applied to image synthesis using the ray tracing method.

%T A Comparison of Parallel Implementations of Flux Corrected Transport Codes
%A Jing-ming Jong, G. S. Stiles
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X We present the results of comparing implementations of the
   Flux Corrected Transport (FCT) method on transputers and
   several other parallel and sequential machines. FCT is a
   finite difference scheme used to solve fluid dynamics
   problem which may involve steep gradients or shocks; it has
   proven useful for both one\- and two\-dimensional problems
   in plasma physics, atmospheric sciences, and detonation
   studies. The method vectorizes very well and hence runs
   quickly on supercomputers. Since the calculations at each
   point involve only a small number of neighbors, the method
   can also be efficiently implemented on multi\-processor
   systems. We have run one\- and two\-dimensional problems on
   Transputers and several other systems, including a VAX 8650,
   a SUN 4/280, a four\-processor Ardent Titan, an
   eight\-processor Alliant FX/8, and a four\-processor Silicon
   Graphics 240GTX. We shall also compare our results to those
   obtained by Gustafson (1988) on the NCube/ten.If, in the
   1\-d problem, we consider the speed of a single T800 to be
   1.0, the SUN 4/280 ranks at 3.8, the VAX 8650 at 4.0, 8
   TSOOs at 7.9, the Silicon Graphics 240GTX at 27.0, the FX/8
   at 56.9, and the Titan at 64.4. On the 1\-d problem, again
   taking one T800 to have a speed of 1.0, the SUN comes in at
   3.6, 16 NCube nodes at 4.0, the 8650 at 4.3, 8 TSOOs at 7.7,
   the Titan at 65.3, and the FX/8 at 101.4. The transputer
   ranks highest if we calculate the cost\-effectiveness of the
   various systems by dividing the relative speed by the
   approximate cost. If we assume the 8 TSOOs have a
   cost\-effectiveness of 1.0 on the 1\-d problem, the Titan is
   second at 0.52, followed by the 240GTX at 0.17, the FX/8 at
   0.094, the SUN at 0.081, and the VAX at 0.021.

%T Simulating Neural Networks in a Distributed environments
%A Jukka Vanhala, Kimmo Kaski
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X Two artificial neural network models are compared. They are
   the Hopfleld neural network model and the Sparse Distributed
   Memory model. Distributed algorithms for both of them are
   designed and implemented. The run time characteristics of
   the algorithms are analyzed theoretically and tested in
   practise. The storage capacities of the networks are
   compared. Implementations are done using a distributed
   multiprocessor system.

%T Attribute Evaluation on a Network of Transputers
%A Matthijs F. Kuiper, Atze Dijkstra
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X The structure and construction of parallel compilers that
   run on a network of transputers is discussed. The compilers
   are automatically generated from an attribute grammar
   definition of the source language. This work illustrates
   that parallelism can also be used in non\-numeric
   computations. As part of implementing parallel compilers we
   have constructed a general message passing kernel for
   transputers. This kernel can also be used in other
   applications. First results indicate that parallel compiling
   on transputers is feasible and that 4 to 16 transputers can
   be used in parallel compilers.

%T An Object Oriented Style for the Computing Surface
%A Matthew Chalmers
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X During the development of a ray tracer on a Meiko Computing
   Surface, problems with poor flexibility of configuration and
   slow software development were encountered. In order to
   overcome these difficulties, and in order to facilitate
   experimental programming on the Meiko, a system to support
   an object\-oriented style for Occam\[rs] programming was
   developed. The aim was to create a set of library modules
   that would allow user code to be quickly developed and
   integrated into existing programs, to support better
   debugging facilities than were currently available, and to
   allow program design to be based on a more flexible and
   dynamic model of concurrency than the process modelThis
   system has been rewritten in order to introduce new features
   and to take advantage of the availability of C. The new
   system is described, with the emphasis on how experience
   with the system influenced its redesign, and on the details
   of newer elements such as the improved facilities for
   monitoring and debugging.

%T C_NET A C++ Based Language For Distributed And Real Time Programming
%A Jean_Marc Adamo
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X C_NET is a high level C++ based language devoted to
   multiprocessor architecture programming. It has been
   designed so as to offer concepts of object\-oriented
   programming, communicating processes and exception handling,
   all within the same language. The purpose of this paper is
   to describe how merging these concepts into C_NET has been
   organized. Consequently, the paper is divided into three
   parts. The first is concerned with discussing the different
   roles that the notions of class object and process are
   intended to play within the language. It is argued that
   these roles are in fact orthogonal, since the first two
   notions are primarily concerned with data structuring,
   encapsulation and inheritance, whereas the last one is
   mainly concerned with threads of control and
   synchronization. The second part is devoted to describing
   the exception handling system, which makes it possible to
   derive process preemption mecanisms by combining exceptions
   with parallelism. Process preemption raises some atomicity
   problems, which are discussed at the end of the second part.
   Finally the last part provides information on the state of
   the project development and on future perspectives.

%T Real\-Time Transputer Models of a Low\-Level Primate Vision
%A Andrew B. Smith, Peter H. Welch
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X It is believed from psychophysical experiments that human
   vision operates in two stages: a parallel preattentive stage
   which extracts simple visual features and a sequential
   attentive stage in which local features of the scene are
   analysed. A processing model of the early preattentive stage
   has been developed. This model is computationly intensive
   making it unsuitable for implementation on sequential
   computer architectures. The development of a real\-time
   parallel transputer vision system based on this processing
   model is explained.The current implementation performs edge
   filtering over four separate resolution/field\-of\-view
   levels from 256 by 256 monochrome images. Eight T800\-20
   transputers deliver over 40 frames per second. The software
   — hardware architecture is scalable so as to support
   higher resolutions and additional features (such as
   auto\-focusing, movement detection, and tracking) through
   the addition of extra transputers, whilst maintaining at
   least camera frame rates.

%T ICR: A Transputer\-Based Intelligent Character Reader
%A Francis Wong F.S., Koh Liang Seng
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X This paper presents the design of a transputer\-based
   Intelligent Character Reader (ICR). The implementation of
   the ICR is based on two algorithms. The Filtered Projection
   Technique (FPT) is a modification of the conventional
   projection profile techniques \- it filters out and projects
   the structural information of a visual pattern into several
   components along several directions, such that some salient
   structural information can be extracted from these
   components easily. The Structural Complexity Index (SCI)
   algorithm determines the structural complexity of a visual
   pattern and produces the corresponding index to indicate its
   relative complexity. The ICR is capable of recognizing
   printed characters, such as the Chinese characters printedon
   the local newspapers, accurately, despite the presence of
   noise due to printing, scanning, slight font variations and
   misalignment. The ICR prototype is implemented on Occam II
   and C, and run on an array of transputers.

%T Solving Partial Differential equations via Cellular Automata: A Binary and Statistical Approach
%A A. Cosnuau, F. Desbois, Y. Morchoisne
%E J. Wexler
%B OUG\-11: Developing Transputer Applications
%X Cellular automata are used to solve partial differential
   equations (PDE) discretized on a non\-regular grid. Boolean
   representations of real numbers are introduced and logical
   intrinsic computer functions are used to achieve algebraic
   operations of the finite difference algorithm. In a first
   step the method is briefly explained and then implementation
   is described. For diffusion or convection problems,
   preliminary computations on a Cray\-XMP18 and on a network
   of T800 Transputers were done.

If you have any comments on this database, including inaccuracies, requests to remove or add information, or suggestions for improvement, the WoTUG web team are happy to hear of them. We will do our best to resolve problems to everyone's satisfaction.

Copyright for the papers presented in this database normally resides with the authors; please contact them directly for more information. Addresses are normally presented in the full paper.

Pages © WoTUG, or the indicated author. All Rights Reserved.
Comments on these web pages should be addressed to: www at wotug.org

Valid HTML 4.01!