WoTUG - The place for concurrent processes

CPA 2014 Programme

The CPA 2014 programme comprises two full day and one half day sessions (three Keynote presentations, refereed papers and mini-Workshops – new this year). There will also be two open Fringe sessions on Sunday and Monday evenings. The conference dinner will be on Tuesday evening.

The provisional programme is listed below. Each item in the timetable links to abstracts (below the timetable). Papers and abstracts are grouped into keynotes, accepted papers, workshops and fringe presentations. Within each group, items are ordered alphabetically by surname of first author/proposer. The workshop abstracts are really short position papers – please note that some of them link to longer versions towards the end, which those attending should read.

Details may change, so please come back to this page from time to time. Proposals for Fringe presentations may, as usual, be made right up to the start of any session. We encourage delegates to do so in advance though, so we can schedule and announce them on this page – just mail us at cpa2014@wotug.org.

Scheduling of the workshops is open to the delegates – the one given here is just an example. If you are a delegate you can express your preferences by completing the form here. Some will need to run in parallel (e.g. workshop sessions 3 and 5 in the example below). Those without a sufficient quorum of support will not run – but we trust those interested will pursue them independently.

Schedule

Sunday, August 24th, 2014
19:00 Dinner (St Anne's College)
20:30 Fringe Session 1
The Need for Prioritised Alternation in a Programming Language (Abstract)
Ian East, Open Channel Publishing Ltd., UK
T42 – Transputer Design in FPGA (Abstract)
Uwe Mielke (a), Martin Zabel (b) and Michael Bruestle (c)
(a) Electronics Engineer, Dresden, Germany
(b) Institut für Technische Informatik, Technische Universität Dresden, Germany
(c) Electronics Engineer, Vienna, Austria
Occam (Raspberry) Pi (Abstract)
Richard Miller, Miller Research Ltd., Oxford, England
21:30 Workshop Session 1
Why is CSP Needlessly Difficult to Learn and so Easy to Forget, and Why are its Semantics Upside Down? (Abstract)
Peter H. Welch, School of Computing, University of Kent, UK
23:00 Bar closes
Monday, August 25th, 2014
09:00 Registration/Welcome (Department of Computer Science)
09:30 Keynote Address 1
The COMPASS Modelling Language: Timed Semantics in UTP (Abstract) (Preprint)
Jim Woodcock (a), Jeremy Bryans (b), Samuel Canham (a) and Simon Foster (a)
(a) Department of Computer Science, University of York, UK
(b) School of Computing Science, Newcastle University, UK
10:30 Tea/coffee
Session 1
11:00 A Service-oriented Scalable Dictionary in MPI (Abstract) (Preprint)
Sarwar Alam, Humaira Kamal and Alan Wagner, Department of Computer Science, University of British Columbia, Canada
11:30 FDR into The Cloud (Abstract) (Preprint)
Thomas Gibson-Robinson and Bill Roscoe, Department of Computer Science, University of Oxford, UK
12:00 Process Discovery in Highly Parallel Distributed Systems (Abstract) (Preprint)
Jon Kerridge, School of Computing, Edinburgh Napier University, UK
12:30 Lunch
Session 2
14:00 Hardware Ports - Getting Rid of Sandboxed Modelled Software (Abstract) (Preprint)
Maarten M. Bezemer and Jan F. Broenink, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands
14:30 Performance of Periodic Real-time Processes: a Vertex-Removing Synchronised Graph Product (Abstract) (Preprint)
Antoon H. Boode and Jan F. Broenink, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands
15:00 Towards Millions of Processes on the JVM (Abstract) (Preprint)
Jan Bækgaard Pedersen and Andreas Stefik, Department of Computer Science, University of Nevada, Las Vegas, USA
15:30 Tea/coffee
16:00 Workshop Session 2
Fine-Grain MPI for Extreme-Scale Programming (Abstract)
Alan Wagner, Department of Computer Science, University of British Columbia, Canada
18:00 Workshop ends
19:00 Dinner (St Anne's College)
20:30 Fringe Session 2
CSP In Orbit - a Recent Satellite Application (Abstract)
Roger Peel, Barry Cook and Paul Walker, 4Links Ltd., UK
Adding Concurrency to an Evidence-based Programming Language (Abstract)
Andreas Stefik and Jan Bækgaard Pedersen, Department of Computer Science, University of Nevada, Las Vegas, USA
A Real-time Garbage Collection Technique for Concurrent Flash Storage Architecture (Abstract)
Muhammed Ziya Komsul and Alistair A. McEwan, Department of Engineering, University of Leicester, UK
A Personal Perspective on The State of Parallel Processing on the Transputer's 30th Anniversary (Abstract)
Ruth Ivimey-Cook, eLifeSciences Publications Ltd., Cambridge, UK
23:00 Bar closes
Tuesday, August 26th, 2014
09:00 Keynote Address 2
De-skilling CSP: Why FDR Needs Lazy Compilation (Abstract) (Preprint)
Bill Roscoe, Department of Computer Science, University of Oxford, UK
10:00 Tea/coffee
10:30 Workshop Session 3
Communicating Process Architectures - by Example (Abstract)
Jan F. Broenink, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands
Making JCSP Faster: Re-implement with Atomic Operations and Verify its Correctness (Abstract)
Peter H. Welch, School of Computing, University of Kent, UK
12:30 Lunch
Session 3
14:00 Synchronous Message Exchange for Hardware Designs (Abstract) (Preprint)
Kenneth Skovhede and Brian Vinter, Niels Bohr Institute, University of Copenhagen, Denmark
14:30 N Queens on an FPGA: Mathematics, Programming, or Both? (Abstract) (Preprint)
Jan Kuper and Rinse Wester, Computer Architecture for Embedded Systems, CTIT Institute, University of Twente, The Netherlands
15:00 Deriving Stencil Hardware Accelerators from a Single Higher-Order Function (Abstract) (Preprint)
Rinse Wester and Jan Kuper, Computer Architecture for Embedded Systems, CTIT Institute, University of Twente, The Netherlands
15:30 Tea/coffee
16:00 Workshop Session 4
Communicating Process Architectures: a Personal View of the Past, Present and Future (Abstract)
Barry Cook, 4Links Ltd., UK
18:00 Workshop ends
18:45 Conference dinner (University College)
Wednesday, August 27th, 2014
09:00 Keynote Address 3
Parallel Systems from 1979 to 2014: 35 Years of Progress? (Abstract)
Roger Shepherd, Chipless Ltd., Bristol, UK
10:00 Tea/coffee
10:30 Workshop Session 5
CSP mechanisms for the Qt Framework (Abstract)
Ruth Ivimey-Cook, eLifeSciences Publications Ltd., Cambridge, UK
Verification Extensions for Programming Languages (Abstract)
Peter H. Welch, School of Computing, University of Kent, UK
12:00 CPA AGM and awards
12:30 Lunch
14:00 End of CPA 2014

Keynote Presentations

De-skilling CSP: Why FDR Needs Lazy Compilation
Bill ROSCOE, Department of Computer Science, University of Oxford, UK

Abstract. Since FDR was first developed in 1991, it has had a huge impact on the way CSP is written. FDR optimises processes that are the parallel combinations of a number of individual processes, each of which is significantly smaller than the size of the overall system we might contemplate checking. Some of the most cunning pieces of CSP coding, most notably the cryptoprotocol intruder, have been created specifically to get around the restrictions this optimal model imposes. In this lecture I will look at the problem of modelling state machines with a few control states but many variables, and will show how if FDR were, instead of compiling all of component processes ab initio, it did so by need, or lazily, we might expect get much simpler CSP scripts for modelling such systems as well as examples such as the intruder.

Preprint (PDF)

Parallel Systems from 1979 to 2014: 35 Years of Progress?
Roger SHEPHERD, Chipless Ltd., Bristol, UK

Abstract.  [Ed: Roger's presentation is based on the Invited talk he gave at the 13th. International Symposium on Parallel and Distributed Computing (Marseille, France, June 24-27 2014). Please follow this link for his Abstract, which appears in the ISPDC proceedings.]

The COMPASS Modelling Language: Timed Semantics in UTP
Jim WOODCOCK (a), Jeremy BRYANS (b), Samuel CANHAM (a) and Simon FOSTER (a)
(a) Department of Computer Science, University of York, UK
(b) School of Computing Science, Newcastle University, UK

Abstract. We describe the denotational semantics of a subset of the COMPASS Modelling Language (CML), using Hoare & He's Unifying Theories of Programming. The subset consists of rich state and operations based on VDM, concurrency and communication, based on CSP, and discrete time, based on Timed CSP. Other features of CML not treated here include object orientation, pointers and object references, and mobile processes and channels; extensions planned for the future include priority, probabilistic choice, and continuous time. A rich collection of language features such as this presents significant challenges when building a formal semantics, so the approach taken in CML is compositional: each feature is given a separate semantics and a domain-specific language is then composed from whichever features are required for the job in hand. Composition is achieved from the use of Galois connections. In this paper, we describe the semantics for the timed, imperative process algebra subset of CML. We adopt a semantic domain suggested by Lowe & Ouaknine — timed testing traces — as the basis for our UTP semantics. We include an example CML specification taken from industry: a leadership election protocol for a system of systems.

Preprint (PDF)

Accepted Papers

A Service-oriented Scalable Dictionary in MPI
Sarwar ALAM, Humaira KAMAL and Alan WAGNER, Department of Computer Science, University of British Columbia, Canada

Abstract. In this paper we present a distributed, in-memory, message passing implementation of a dynamic ordered dictionary structure. The structure is based on a distributed fine-grain implementation of a skip list that can scale across a cluster of multicore machines. We present a service-oriented approach to the design of distributed data structures in MPI where the skip list elements are active processes that have control over the list operations. Our implementation makes use of the unique features of Fine-Grain MPI and introduces new algorithms and techniques to achieve scalable performance on a cluster of multicore machines. We introduce shortcuts, a mechanism that is used for service discovery, as an optimisation technique to trade-off consistency semantics with performance. Our implementation includes a novel skip list based range query operation. Range-queries are implemented in a way that parallelises the operation and takes advantage of the recursive properties of the skip list structure. We report the performance of the skip list on a medium sized cluster with two hundred cores and show that it achieves scalable performance.

Preprint (PDF)

Hardware Ports - Getting Rid of Sandboxed Modelled Software
Maarten M. BEZEMER and Jan F. BROENINK, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands

Abstract. Model-Driven Design (MDD) techniques can be used to design control software for cyber-physical systems, reducing the complexity of designing such software. Most MDD tools do not provide proper support to interact with the environment of the models, i.e. the actuators and sensors of a cyber-physical system, resulting in the models being sandboxed. So-called Hardware Ports are proposed in this paper to get rid of this sandboxed model issue. These hardware ports are designed in a modular way, preventing a completely separated implementation for each type of piece of hardware. Even though the concept of hardware ports is generally applicable, in this paper the TERRA MDD tool and the LUNA concurrent runtime environment are used as examples to clarify the design and implementation. The paper concludes with a reflection on the usability of the hardware ports and on the planned future work to further improve the interaction between modelled software and the hardware it controls.

Preprint (PDF)

Performance of Periodic Real-time Processes: a Vertex-Removing Synchronised Graph Product
Antoon H. BOODE and Jan F. BROENINK, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands

Abstract. In certain single-core mono-processor configurations, e.g. embedded control systems, like robotic applications, comprising many short processes, process context switches may consume a considerable amount of the available processing power. For this reason it can be advantageous to combine processes, to reduce the number of context switches. Reducing the number of context switches decreases the execution time and thereby increases the performance of the application. As we consider robotic applications only, often consisting of processes with identical periods, release times and deadlines, we restrict these configurations to periodic real-time processes executing on a single-core mono-processor. These processes can be represented by finite directed acyclic labelled multi-graphs. The vertex-removing synchronised product of such graphs gives graphs that represent processes which have less context switches. To reduce the memory occupancy, the vertex-removing synchronised product removes vertices that are not reachable; i.e. represents states that can never occur. By means of a lattice, we show all possible products of a set of graphs, where the number of products is given by the Bell number. We finish with heuristics from which a set of graphs can be calculated that represents a set of processes that will not miss their deadline and which fits in the available memory.

Preprint (PDF)

FDR into The Cloud
Thomas GIBSON-ROBINSON and Bill ROSCOE, Department of Computer Science, University of Oxford, UK

Abstract. In this paper we report on a successful extension to the CSP refinement checker FDR3 that permits it to run on clusters of machines. We demonstrate that it is able to scale linearly up to clusters of 64 16-core machines (i.e. 1024 cores), achieving an overall speed-up of over 1000 relative to the sequential case. Further, this speed-up was observed both on dedicated supercomputer facilities, but more notably, also on a commodity cloud computing provider. This enhancement has enabled us to verify a system of 10^12 states, which we believe to be the largest refinement check ever attempted by several orders of magnitude.

Preprint (PDF)

Process Discovery in Highly Parallel Distributed Systems
Jon KERRIDGE, School of Computing, Edinburgh Napier University, UK

Abstract. In distributed data processing systems it may happen that data arrives for processing for which the appropriate algorithm is not available on the specific node of the distributed system. It is however known that the required algorithm is available somewhere within the distributed system. A mechanism for achieving the migration of algorithms within such a system is described. An implementation of the mechanism is presented that utilizes the concept of mobility of processes over TCP/IP networks.

Preprint (PDF)

N Queens on an FPGA: Mathematics, Programming, or Both?
Jan KUPER and Rinse WESTER, Computer Architecture for Embedded Systems, CTIT Institute, University of Twente, The Netherlands

Abstract. This paper presents a design methodology for deriving an FPGA implementation directly from a mathematical specification, thus avoiding the switch in semantic perspective as is present in widely applied methods which include an imperative implementation as an intermediate step. The first step in the method presented in this paper is to transform a mathematical specification into a Haskell program. The next step is to make repetition structures explicit by higher order functions, and after that rewrite the specification in the form of a Mealy Machine. Finally, adaptations have to be made in order to comply to the fixed nature of hardware. The result is then given to CλaSH, a compiler which generates synthesizable VHDL from the resulting Haskell code. An advantage of the approach presented here is that in all phases of the process the design can be directly simulated by executing the defining code in a standard Haskell environment. To illustrate the design process, the N queens problem is chosen as a running example.

Preprint (PDF)

Towards Millions of Processes on the JVM
Jan Bækgaard PEDERSEN and Andreas STEFIK, Department of Computer Science, University of Nevada, Las Vegas, USA

Abstract. In this paper we show how two previously published rewriting techniques for enabling process mobility in the JVM can be combined with a simple non-preemptive scheduler to allow for millions of processes to be executed within a single Java Virtual Machine (JVM) without using the built-in threading mechanism. The approach is tailored toward efficient execution of a large number of (CSP style) processes in Java bytecode running on the Java Virtual Machine. This may also prove useful for languages that, like ProcessJ, target the JVM as an execution platform and which need a much finer level of threading granularity than the one provided by the Java programming language system's threading mechanism.

Preprint (PDF)

Synchronous Message Exchange for Hardware Designs
Kenneth SKOVHEDE and Brian VINTER, Niels Bohr Institute, University of Copenhagen, Denmark

Abstract. In our 2013 paper, we introduced the idea of modeling hardware with PyCSP. Encouraged by our initial success we started a master’s project where two students continued our work towards a fully detailed processor built in PyCSP. The two students succeeded, but also identified a number of reasons why PyCSP is not well suited for modeling hardware. Their conclusion was that since the hardware is synchronous, communication is frequently based on broadcast and external choice is never used. This means that PyCSP does not provide the mechanisms that are needed, and the strength of PyCSP is never utilized. In this work we introduce a new messaging framework, Synchronous Message Exchange, SME, which greatly simplifies hardware modeling, and is also easily used for other strictly synchronous applications, such as a subset of computer games. We describe the SME framework, and show how it has a rather simple equivalence in CSP so that the properties that are associated with CSP based applications are maintained, except rendezvous message exchange.

Preprint (PDF)

Deriving Stencil Hardware Accelerators from a Single Higher-Order Function
Rinse WESTER and Jan KUPER, Computer Architecture for Embedded Systems, CTIT Institute, University of Twente, The Netherlands

Abstract. Stencil computations can be found in a lot of scientific and engineering applications. Parallelization of these applications becomes more and more important in order to keep up with the demand for computing power. FPGAs offer a lot of compute power but are considered hard to program. In this paper, a design methodology based on transformations of higher-order functions is introduced to facilitate the parallelization process. Using this methodology, efficient FPGA hardware architectures are derived achieving good performance. Two architectures for heat flow computations are synthesized for FPGA and evaluated. To show the general applicability of the design methodology, several applications have been implemented.

Preprint (PDF)

Workshops

Communicating Process Architectures - by Example
Jan F. BROENINK, Robotics and Mechatronics, CTIT Institute, University of Twente, The Netherlands

Aims

The aim is to show benefits and advantages of the CPA approach through the development of practical demonstrators. Examples range from essential, basic cases towards the challenging ones. Also, benchmark cases – or cases that are used so often that they grow towards a benchmark, like the commstime example – will be developed.

Action

The work of the workshop can be approached from two directions:

  1. The challenge approach: here rather complex problems are tackled to show the advantages of the CPA methods, techniques and tools in which the participants are proficient, and maybe have a presentation about in the regular conference (or at an earlier CPA). This approach is (comparable to) the CPA2012 (Dundee) challenge, where we had one massive parallel problem and one embedded problem to solve, such that participants could pick the one that fits best to their application area.
  2. The compendium approach: here several essential problems are tackled to show the advantages of the CPA methods, techniques and tools in which the participants are proficient. For the more benchmark-like problems, the CPA-type approach can be compared with other (competing?) approaches, to show the specific benefits of the CPA approaches. The problems in this approach should be reported in the compendium from basic & essential towards extensive & complex.

The envisaged outcome of this workshop is a set of well-documented problems / cases, which have elegant solutions. Benefits of the CPA approach are shown, but not necessarily pushed forward. If possible, comparison with other approaches can be made.

Rough ideas / problems (challenge approach)

  • Urban area traffic monitoring system, presenting congestion, and predicting congestion, thus advising effective routes for traffic units.
  • Mobile device, running several apps like mail, location-aware info (augmented reality?), controlling your home (domestics?)
  • Mobile robot (e.g. lego mindstorms), with controller functionality organised in layers.

Rough ideas / problems (compendium approach – the usual suspects)

I think a lot of these were used as examples / cases in CPA / WoTUG papers ... and give me the inspiration for this list ...

  • basic producer – consumer
  • commstime
  • dining philosophers
  • service provider, like a webserver...
  • game like: Pacman
  • MMORPG / MMOG – massively multiplayer online variants of Pacman or others.

See massively multiplayer online (Wikipedia) for further inspiration, or to understand that making these kind of applications is quite some work.


Communicating Process Architectures: a Personal View of the Past, Present and Future
Barry COOK, 4Links Ltd., UK

Background

CPA is the continuation of a long series of conferences related to a computer development in the 1980s. It appears that this conference now has some difficulty attracting papers and, perhaps to a lesser extent, attendees. I am wondering why this is. Could it be:

  • The subject matter is dead?
  • It is unclear what the subject matter is?
  • There is no place for a CPA-style conference?
  • Something else?

This provokes further questions:

  • Is this the end-of-the-line for CPA as it stands?
  • Can it be run differently and live again?
  • Is it time to say that CSP is a good idea but no-one wants it?
  • Or is there some practical way forward that is worth the effort?

Action

I will present some thoughts on these questions. I do not necessarily have a view on how things should go.

The purpose of the workshop is to provide some time and a place where answers can be floated and discussed. With luck, effective ways forward will emerge to which many (all?) of us will commit to work.

Final note

In many ways, things are depressing – it is very unlikely we can continue in the same way, even for one more time. However, there are great opportunities to do something worthwhile – if only we can work what to do and find some way to do it. Let's analyse the situation and either stop for good reason or carry on with a realistic plan that leads to results with which we can be pleased to be part.

[These notes are extracted from a 5 page position paper available here.]


CSP mechanisms for the Qt Framework
Ruth IVIMEY-COOK, eLifeSciences Publications Ltd., Cambridge, UK

Background

The Qt Framework is the foundation of a huge amount of C++ software, including the KDE desktop environment, several mobile phones, and numerous individual applications. It has now in its 5th major version and has been very well thought out and put together. Qt is unusual in using a preprocessor (called moc) to extend the C++ language with meta-information and an efficient event system, which can be used both within and between threads. Part of the core of Qt for a long time has been a threads implementation for concurrency which is used, and has seen some work recently, though it is still at the periphery of the framework.

Action

Identify how to include a set of CSP-inspired constructs into Qt that usefully extend the framework and provide good building blocks for further work.

This could be done in conjunction with the Qt maintainers, but there is plenty of history of third-party libraries extending Qt – for example the Qxt library providing graphing tools.

There is scope for several people – both theorists and programmers – getting involved in this. C++ experience for the programmers would be needed, and preferably exposure to Qt.

It would be nice if the CCSP library could be brought into the mix, but I don't see that as an absolute requirement.

It is unlikely that the workshop will complete this work, so it would be nice if those who worked on it were able to follow up afterwards. If not, then the group doing it should aim to leave their offering in a state that it could be picked up by others, e.g. on Github.

Benefit

Qt is a major framework and a good implementation of CSP would have major exposure if done well. Getting to that point will not be trivial but if this conference's legacy was just to do that, I would be ecstatic. Qt is in my view a good target for this as it already includes a number of features which make it amenable to this treatment.


Fine-Grain MPI for Extreme-Scale Programming
Alan WAGNER, Department of Computer Science, University of British Columbia, Canada

Background

Scalable computing is needed to meet the challenge of computing on ever-increasing amounts of data. Multicore and many-core adds more processing cores but still cannot keep pace with the demand for more computation. There is need for technologies to scale, not only to multiple cores within one chip but also to scale across multiple chips in one machine, and across machines in a data center.

Process Oriented programming is the design of programs as a collection of processes that communicate using message-passing. Process-oriented programs execute concurrently either by: (a) logical concurrency: sharing one processor core and taking turns executing, or (b) physical concurrency: executing in parallel on separate cores in one or more machines. Extreme scale programming is the term we use for the design of programs that use thousands and millions of such processes.

Action

Fine-Grain MPI (FG-MPI) extends the MPICH2 runtime to support execution of multiple MPI processes inside an OS-process. FG-MPI supports fine-grain, function level, concurrent execution of MPI processes by minimizing messaging and scheduling overheads of processes co-located inside the same OS process. FG-MPI makes it possible have thousands of MPI processes executing on one machine or millions of processes across multiple machines. FG-MPI is available online from our UBC website. The purpose of the workshop will be a practical hands-on introduction to FG-MPI: how to create FG-MPI programs and run them using ``mpiexec''. I will discuss the added flexibility in executing programs and limitations. I will discuss applications and tools we have started to develop and potential extensions.

For the practical part of this workshop, it will help to have downoaded the latest FG-MPI release, together with the code examples on the download page. These are available from the "Downloads" tab on the home page. However, the work can still be followed with just pencil and paper.

[A longer version of the above is available here.]


Making JCSP Faster: Re-implement with Atomic Operations and Verify its Correctness
Peter H. WELCH, School of Computing, University of Kent, UK

Background

JCSP is a substantial library for Java providing the occam-pi subset of CSP, together with pi-calculus mobility for channel-ends and processes. In fact, it supports more CSP than occam-pi, having pioneered the development of a fast implementation of external choice over multiway events – which also yields output guards – and full advantage of multicore processors. These extras integrate with the library in a way that imposes no overhead on choice, or committed channel communication, when multiway events are not involved. JCSP also provides higher level synchronisations (e.g. variant CALL channels), lots of teaching material and many demonstrations.

The current JCSP implementation builds upon the standard Java model of threads and monitors (i.e. synchronized methods and blocks, wait, notify and notifyAll). Its algorithms are based on the very low-level concurrency instructions implemented by Transputer microcode (whose implementation via high-level monitor constructs has always seemed peculiar, though we had no alternative when that work was done). The correctness of JCSP channel communication and choice, with respect to CSP channel communication and choice, has been verified through the construction of a CSP model of Java monitors (as just noted, a high-level synchronisation mechanism and, therefore, somewhat complex), CSP modelling of the JCSP algorithms that use them and the FDR model checker. This verification has been a vital factor in the stability of JCSP since 1998 and was reported at CPA 2000 (Formal Analysis of Concurrent Java Systems).

Action

Re-implement the core JCSP methods for channel communication, barriers and choice (alting) using Java atomics, semaphores and (possibly) locks. These should be able to follow the Transputer microcoded algorithms in a simpler way than the current JCSP implementation using Java monitors.

Verify the correctness of the new implementation through:

  • CSP modelling of the atomics, semaphores etc. used – this should be far simpler than the CSP modelling of monitors;
  • translating the new JCSP algorithms into CSP (using the models described in the previous bullet);
  • following the route from the CPA 2000 paper to put the right questions to FDR3.
An implementation without such verification will lead, sooner or later, to tears.

Completing these actions may be beyond the time available in this workshop. A realistic ambition may be to implement and verify 1-1 channels, excluding alting, and to make a simple demonstration (e.g. CommsTime) from which timing performance can be observed. Because the implementation will still rest upon Java threads, which in turn rest upon the underlying operating system, we should not expect a startling improvement in performance (e.g. towards occam-pi levels) – but there should be something worthwhile. And we might be surprised!

Members attending this workshop should come armed with pencil and paper and some familiarity with the JCSP algorithms and CSP modelling presented in the CPA 2000 paper – at least the Java code in section 3.2 and, if we get around to alting, section 7. Looking at the slides supporting the paper may help, especially for the CSP modelling. Hard core engineers might also bring a laptop loaded with JDK (1.7, at least), JCSP (1.1 rc4) and FDR3, together with the FDR script verifying the current JCSP implementation.

[A longer version of the above is available here.]


Verification Extensions for Programming Languages
Peter H. WELCH, School of Computing, University of Kent, UK

Background

As the world becomes dependant on computer systems to sustain and advance its way of life, the correctness and, therefore, safety of those systems becomes essential. Yet the reliability of the software controlling our computers has barely improved over the past few decades, even as the speed with which they can go wrong has risen exponentially.

Software verification is not part of the normal practice of software engineering, probably because it is perceived to be too difficult and is, currently, so time consuming compared to the practice of generating code ("with no obvious deficiencies", C.A.R.Hoare). Instead, verification is reserved for software whose failure would certainly kill or injure people (e.g. certain elements of flight avionics) or lose lots of money (e.g. banking services).

The difficulties of verification must therefore be removed. It is the responsibility of mathematicians and engineers with the rare specialist skills needed (today) for verification to find ways to do themselves out of this particular line of work. They can do this by building those skills into the tools used daily to construct code: most notably, programming languages.

History

At CPA 2011, a proposal was made to start addressing the above (Adding Formal Verification to occam-pi). The proceedings (and the link just given) only hold an abstract for that presentation, but details are in the accompanying slides. More information, however, is in two sets of slides updated from those at CPA 2011 for talks given later that year: Self-Verifying Concurrent Programming (PPT version) and Self-Verifying Dining Philosophers (PPT version).

The above slides give details for the translation from occam-pi into CSP-M, allowing the programmer to make FDR-like assertions and set the level of abstraction in the translated model (e.g. which variable values are tracked) sufficient for the assertions to hold (or fail), but not so light that the state space is too vast even for FDR to explore. The examples from the slides were verified using FDR2, with hand translation from occam-pi to CSP-M using the rules given.

Action

Following CPA 2011, I was hopeful that we would be able to produce a working demonstrator within a year. This would require extending the occam-pi compiler to generate CSP-M scripts, feeding the scripts through FDR and re-presenting the results to the programmer in terms of the occam-pi sources. Earlier work on a prototype translator from raw occam to CSP was reported by Fred Barnes in a fringe presentation at CPA 2008 (Towards Guaranteeing Process Oriented Program Behaviour). Unfortunately, modern academic life is such that compiler effort at Kent has stalled.

The aims of this workshop are therefore:

  • re-consider the CPA 2011 proposal;
  • change it in the light of relevant happenings since then;
  • change it because someone has better ideas;
  • identify the tasks for building (at least) a demonstrator implementation;
  • size those tasks (time and effort);
  • find people, including end-users, willing to take this further.
Finally, we note (and understand some of the reasons for) the low level of interest remaining for occam. The ideas of this workshop are not confined to occam, though they were inspired by it and it would be easiest (technically) to start from there. The workshop should also consider re-shaping all these ideas for other languages, though they would need to offer a concurrency model fairly closely compatible with CSP. Candidates might include Go, Java (with JCSP) and Scala (with Bernard Sufrin's CSO, first presented at CPA 2008).

[A longer version of the above is available here.]


Why is CSP Needlessly Difficult to Learn and so Easy to Forget, and Why are its Semantics Upside Down?
Peter H. WELCH, School of Computing, University of Kent, UK

Background

There has been a gap between the ways CSP and occam developed that's troubled me for a long time. My problem is that I think "occam" and that leads me to make mistakes in "CSP". Of course, my answer is that CSP must be doing it wrong and needs fixing. And, of course, I am probably very wrong!

Nevertheless, I shall attempt to explain my difficulties with CSP, with respect to simplicity, programming concepts and an old-fashioned (but arguably more natural) view of semantics.

Outline of Difficulties

  • CSP has two operators for sequence ("-->" and ";") whereas occam has one ("SEQ"). CSP needs this because events are not processes. If they were (as in occam), simplifications follow.

  • The single global space of event names in CSP requires a hiding operator ("\"). This is so that networks can be constructed whose processes privately synchronise – but they have to be built with global events that are hidden post-construction. This causes problems – e.g. for recursively defined networks. In occam, name spaces follow traditional block structure. Networks with internal synchronisations do not use global events that must be hidden later. This is a simpler idea and avoids the problems mentioned.

  • In CSP, channel arrays are identical to a single channel carrying the index with the message. This is odd as they describe rather different things.

  • Complex events – e.g. d?x?y!f(x,y) – are more complex and less general than the two-way, or session, protocols proposed for occam-pi. The lack of direction specification for their components leads to unnecessary mistakes. This also applies to simple events, when those events are channel messages.

  • Modern CSP forsook the original idea (in Hoare's CSP) of associating event alphabets with processes. Instead, they are associated with a range of parallel operators. I think this is so as to give flexibility, when building networks, for specifying whether processes synchronise on events or interleave on them. The downside is that alphabets are not constructed once for each process; they have to be constructed each time each process is combined in parallel with another and each time any combinations are combined. Getting all these alphabets correct can become complex quite quickly, leading to a tyranny of alphabets.

  • occam alphabets are derived automatically and associated with processes. There is only one parallel operator ("PAR") and it is trivially associative and commutative. Associativity does not come easy for the parallel operators of modern CSP. occam-pi enables great flexibility for processes to declare whether they synchronise or interleave on events. CSP could do the same, with gains in simplicity. It also seems more natural for the process designer, rather than process user, to specify these things.

  • Finally, why does the semantics encourage design by starting from too much behaviour and then cutting it down, by refinement, to what is wanted. Everything refines CHAOS, but that's not a good place to start. Most programmers – well, me anyway – like to build behaviour incrementally. Start from bottom, the process with no behaviour (STOP), and then add – don't cut!

Action

Discuss the merits of the above. Formalise it (if worthy) into a new syntax for CSP. Produce useful examples comparing the current syntax with the proposed. Consider the semantic issues. Propose further work.

[A longer version of the above is available here.]


Fringe Presentations

The Need for Prioritised Alternation in a Programming Language
Ian EAST, Open Channel Publishing Ltd., UK

Abstract. A programming language embodies a model for the abstraction of systems behaviour. The domain of software application has changed dramatically since the early days of programming, and continues to evolve. While there has recently been an awakening to the need to express concurrency, little respect has been paid to either prioritisation or pre-emption, despite their increasing relevance to modern applications. Arguments are presented for the incorporation of a prioritised alternation when construct, previously proposed by the author (CPA 2004), in any programming language intended for the description of embedded systems. (This reflects the operation of hardware prioritised vectored interruption.) It is further argued that prioritisation must apply to events alone and must be declared in advance as the context for an alternation. Comments will be made, and discussion invited, with regard to the semantics of both prioritisation and its composition.

A Personal Perspective on The State of Parallel Processing on the Transputer's 30th Anniversary
Ruth IVIMEY-COOK, eLifeSciences Publications Ltd., Cambridge, UK

Abstract. This year is 30 years since the Transputer's birthday in 1984. This talk reviews some of the achievements in parallel processing that are down to the Transputer and some that are not. It is born out of a review paper, currently in preparation on this topic, which will be available in draft form at the talk. The aim, when complete, is to cover developments and acceptance of parallel processing ideas both for hardware (from, for example, the CDC 6600) and software (where occam lies somewhere in the middle ... or the future). Key papers and technologies will be referenced to highlight progress over the decades – bearing in mind the question mark at the end of the title of Roger Shepherd's Keynote presentation!

Collaborators are actively sought to expand on those areas this author is not well versed in, with a view to publication as co-authors in the not too distant future.

A Real-time Garbage Collection Technique for Concurrent Flash Storage Architecture
Muhammed Ziya KOMSUL and Alistair A. MCEWAN, Department of Engineering, University of Leicester, UK

Abstract. Flash-based storage systems offer high performance, robustness, and reliability for embedded applications; however flash memory has limitations to its usage in high reliability applications. In previous work, we have developed RAID architectures and associated controller hardware that increase the reliability and lifespan of these storage systems by maintaining strict control over the lifespan of each chip in the array. However, flash memory needs regular garbage collection in order to reclaim its invalid spaces. When a garbage collector is active, the chip cannot be used by the application layer. This causes non-deterministic response time which is a problem for systems that require real-time guarantees. Recent techniques have been proposed to address the non-deterministic issue at single chip level; however effects of these techniques have not been investigated with concurrent flash architectures such as our RAID mechanism. We have applied a real time garbage collection technique over our RAID mechanism and observed that it affects strict control over the lifespan of the chips in the array. Thus, we have proposed an optimised real time garbage collector for our RAID mechanism. We evaluate our technique with a set of traces running on a software-based flash storage system simulator.

T42 – Transputer Design in FPGA
Uwe MIELKE (a), Martin ZABEL (b) and Michael BRUESTLE (c)
(a) Electronics Engineer, Dresden, Germany
(b) Institut für Technische Informatik, Technische Universität Dresden, Germany
(c) Electronics Engineer, Vienna, Austria

Abstract. This fringe paper is discussing the current status of a still ongoing T42x compatible 3 stage pipelined Transputer design in FPGA. The intention of this small project is to provide the VHDL as open source, so that many cores can be loaded into a FPGA board for education or exploration purposes. To evaluate the public attraction of such a Transputer design today, a performance outlook of our T42 design and potential successors is projected into a target corridor and compared with other available softcores in FPGA. Our conclusion is: w/o a future design roadmap the T42 will remain academic, but once improved in architecture it can compete with any RISC softcore CPU. This is challenging goal ... any support is welcome.

Occam (Raspberry) Pi
Richard MILLER, Miller Research Ltd., Oxford, England

Abstract. The Raspberry Pi is a small, inexpensive ARM-based computer for students and embedded hardware tinkerers. The KRoC implementation of occam runs on the Pi, thanks to the highly portable Transterpreter, but performance is limited compared to native compilation. Runtime profiling of the virtual machine shows a majority of time spent in instruction decode and dispatch, even in communication-intensive benchmarks. A bit of hand tuning concentrated in this area gives a significant speedup to the VM, which is further improved (at a cost of portability) by reimplementing the 15 primary transputer opcodes in ARM assembler. Decode and dispatch are eliminated altogether by a just-in-time translation from bytecodes into a directly executable form, but it's not possible in all cases to find a correct translation without more semantic information than the Transterpreter bytecode file currently provides.

CSP In Orbit - a Recent Satellite Application
Roger PEEL, Barry COOK and Paul WALKER, 4Links Ltd., UK

Abstract. We have used the philosophy of CSP to guide the design of a successful range of products for the Space industry. All revolve around a development of the Transputer communication link – SpaceWire – based on the T9000 link with changes (not all are improvements) to suit this sector. SpaceWire is used increasingly, worldwide (UK, USA, EU, Japan, Russia, etc.), for on-board data communications in a wide range of satellites.  [The Transputer has been used in successful missions - some still functioning – and is even being designed into new spacecraft!]

Spacecraft are hard to repair after launch and on-board systems contain multiple, redundant, copies of important units with a switch-over algorithm to ensure continued operation. Network, rather than traditional bus, topologies allow Fault Detection, Isolation and Recovery (FDIR) algorithms considerably more freedom and are subject to ongoing research. We are currently part of a project funded by the European Space Agency (ESA) and led by Astrium (now Airbus) delivering a system of units, each of which provides a hardware simulation of an on-board unit. Astrium/Airbus will implement a proposed FDIR algorithm that will be run in real-time with faults introduced and the resulting behaviour observed and checked.

Although we are only now delivering first units to Astrium/Airbus, we took a small system to the Farnborough International Airshow with a very simplified FDIR algorithm and were able to show correct recovery from deliberately introduced failures. We will demonstrate the system in operation – an example of a real, industrial, application of CSP.

Adding Concurrency to an Evidence-based Programming Language
Andreas STEFIK and Jan Bækgaard PEDERSEN, Department of Computer Science, University of Nevada, Las Vegas, USA

Abstract. In the last five years, programming language designers have increasingly investigated data-driven methods for studying the impact of competing designs. This approach has led to an imperfect, but objective, methodology for studying the long-standing programming language wars. To-date, a variety of questions have been increasingly studied using this approach. For example, there is now scientific, peer-reviewed, evidence that static typing improves human productivity (under known conditions). Further, differences in syntax appear to impact humans much more than was once thought.

While data-driven methods have provided an objective filter for evaluating language design, how to apply them to concurrency is not clear. On the one hand, syntax might have an impact in this area, but to whom is unclear. On the other, fundamental design differences may also be impactful, like software transactional memory vs. threads or process-oriented approaches, but questions about paradigm can often be too broad to test easily in a small number of experiments. In this talk, we discuss data-driven methods in language design and how they can potentially be applied to concurrency. The goal is not to establish a particular approach, but to garner feedback from the community on potential experimental designs, with an eye toward focusing efforts toward questions the community would find valuable.


Pages © WoTUG, or the indicated author. All Rights Reserved.
Comments on these web pages should be addressed to: www at wotug.org