Newsgroups: comp.parallel From: dejan@tesla.osf.org (Dejan Milojicic) Subject: Book on load distribution Keywords: task migration, load distribution, Mach Organization: Open Software Foundation Date: Wed, 15 Jun 1994 18:04:30 GMT Message-ID: Dear Colleagues! My thesis, entitled "Load Distribution, Implementation for the Mach Microkernel" has been published as a book by Vieweg Verlag, Wiesbaden, Germany. The thesis was defended at the University of Kaiserslautern, Germany, in December 1993. Below I include the book forward, written by my advisor prof. Juergen Nehmer, followed by four reviews written by David Black, Fred Douglis, Alan Langerman and Brent Welch. These are the people (in alphabetical order) who have followed my work and helped me on a number of occasions. All of them have considerable experience in the field of distributed systems and also with process and task abstractions. Therefore they are competent to review my book and my work. The book is of practical character and describes my experience while developing load distribution on top of the Mach microkernel. It also surveys the field and contains an extensive bibliography. The book is available through Vieweg representatives worldwide. Addresses and phones of Vieweg representatives as well as table of contents are included at the end of this email. Dejan S. Milojicic, OSF Research Institute, Cambridge, Massachusetts. ***************************************************************** The Book Foreword by Prof. Juergen Nehmer University of Kaiserslautern Load distribution is a very important concept for distributed systems in order to achieve better performance, resource utilization and response times. Providing efficient mechanisms for the transparent support of load distribution has proven to be an extremely difficult undertaking. As a matter of fact, there is no commercially available system which provides transparent load distribution right now. The monograph by D. Milojicic presents a novel load distribution scheme based on modern microkernel architectures. The remarkable results of D. Milojicic's approach show evidence for his hypothesis that load distribution is feasible even under strong efficiency constraints if built upon microkernel architectures. Based on a complete implementation using the NORMA-version of Mach, D. Milojicic shows that substantial performance improvements of his load distribution scheme on top of Mach result from the dramatic reduction of state information to be managed in course of a task migration. For readers not familiar with the topic, the monograph gives a good survey of the load distribution problem and puts existing approaches into perspective. ***************************************************************** Review by David Black Load Distribution Implementation for the Mach Microkernel by Dejan Milojicic This monograph describes the implementation of a load distribution system for an environment based on the Mach microkernel. This work incorporates a task migration mechanism, load information management support and a distributed scheduler. A major innovation (and contrast with previous work on process migration) is that the migration mechanism operates on the microkernel task abstraction rather than the (higher level) process abstraction of an operating system such as Unix. In a similar fashion, the load information is based on activity at the microkernel level (processing, IPC, and memory faults). The overall result is a prototype system that demonstrates by example the feasibility of applying migration at the task level. The simplicity that results from migrating tasks rather than processes (and hence not having to deal with process-specific state) is both an advantage and a disadvantage. It is advantageous for its relative ease of implementation and because the resulting mechanism is applicable to the multiple operating system personalities based on Mach. The disadvantage is that it is possible to migrate a task away from its provider(s) of operating system services, and hence incur remote communication costs for accessing those services. This issue and means for addressing it are discussed (under the heading of "Lessons Learned"). Overall, this monograph describes some impressive work. The reader should understand that the described work is intended to prove that the concepts and techniques are viable, and does not claim to be a complete commercially usable system. As a proof of concept, it is quite convincing, and I recommend this monograph to those interested in distributed operating systems, process/task migration, and related areas. There is also an excellent review of related work at appropriate points in the text; the review and bibliography constitute a good starting point to further explore this field. David L. Black, OSF Research Institute ____________________________________________________________ Dr. David Black is Research Scientist at the Cambridge office of the OSF Research Institute. He holds a Ph.D. degree from Carnegie Mellon University. Among his many research interests are microkernel technology, real-time, parallel and distributed computing. At Carnegie Mellon and OSF, he was one of the key developers of the Mach operating system. He served as a program chair for the third Usenix Mach Symposium and is a member of the program committee for the Usenix OSDI symposium. ***************************************************************** Review by Fred Douglis Load Distribution Implementation for the Mach Microkernel Dejan Milojicic Dejan Milojicic's monograph entitled "Load Distribution Implementation for the Mach Microkernel" is an important contribution to the field of load distribution and process migration. The book is important for a number of reasons. First, it provides a separation between the low-level kernel abstraction of a task (an address space, threads, and communication channels) and the higher-level abstraction of a process (with signals, files, a current working directory, etc.) The task migration facility can be used to implement process migration for multiple OS personalities such as OSF/1 or BSD UNIX. It has support for shared memory, using the underlying Mach NORMA facilities, unlike many other implementations of migration. Second, the mechanism described in the book is implemented in Mach, which is a commercially successful operating system. This contrasts to other migration facilities, such as the one I built for Sprite, which have failed to achieve wide-spread acceptance. Furthermore, by continuing his work under the auspices of the OSF Research Institute, Milojicic has the potential to move his research fully into the commercial domain. Third, the book describes a load distribution facility using extended information such as statistics about interprocess communication and shared memory. Milojicic has extended previous work in the area to provide a more comprehensive, higher-performance facility. Finally, the book provides a comprehensive survey of existing research in process migration and load distribution. It goes into greater detail than papers in the literature or other theses of which I am aware. The bibliography is extensive and will serve as an excellent starting point for others interested in this area. The book does have a couple of drawbacks that should be noted. One is that pre-publication editing did not fully address the language problems that can face a non-native speaker of English. Missing articles are commonplace and there are a few other idiosyncratic phrases that stand out. The book is still easily readable, but the grammar can sometimes be distracting, and the reader should be prepared. The other drawback, on a more technical level, is that some of the measurements in the book are based on a smaller testbed (3 hosts) than would be desirable for experiments in load distribution. The author notes this in one place, observing that with a random placement policy (shown for comparison purposes), the first host has a 50% chance of correctly selecting from between the two other hosts. In general, I would expect somewhat different results when considering multiple scheduling algorithms across a large number of machines than across just a few. The limited testbed was due to the resources available during the author's dissertation work at the University of Kaiserslautern. All told, I recommend the book to researchers interested in the area of load distribution, and especially to anyone considering load distribution and/or process migration in a UNIX or UNIX-like environment. - Fred Douglis, Matsushita Information Technology Laboratory (These views do not necessarily reflect those of my employer.) _________________________________________________________________ Dr. Douglis is a computer scientist at the Matsushita Information Technology Laboratory. Prior to that he was a visiting professor at the Vrije Universiteit, Amsterdam, after obtaining his Ph.D. at the University of California, Berkeley. His dissertation was in the area of process migration and load distribution. Dr. Douglis was chair of WWOS 1993, and has been on the program committees of that workshop, the 14th ICDCS, and the 4th Usenix SEDMS. Dr. Douglis is the vice-chair for Mobile Computing of the IEEE Computer Society TCOS. ***************************************************************** Review by Alan Langerman Load Distribution Implementation for the Mach Microkernel by Dejan S. Milojicic As modern computer systems grow to include ever larger numbers of processors, the issue of transparent load distribution becomes increasingly important for efficient system utilization. In some cases, distributing load can mean spreading disk I/O across many nodes or drives, or directing interrupt activity across all available processors, or other low-level resource balancing. At a higher level, load distribution means parceling out user computations to various processors, so that individual applications can execute more quickly or system throughput can be increased. Unfortunately, it is all too easy to overcommit a subset of available processors while leaving others under-utilized. Thus, the issue may be viewed as a problem of matching program entities - tasks - with available processors. This matching problem is relatively straightforward on shared memory multiprocessors because external I/O resources can be shared with little or no imbalance and because processors can pick work from a shared, global run queue. Load distribution on SMP architectures, the subject of much discussion in the early and mid-1980s, has diminished as a topic for debate. Instead, the problem today is load distribution in distributed computing systems. Information about processor utilization is no longer immediately available; typically, it is never possible to obtain an instantaneous, complete view of all applications and resources in the system. Load distribution in distributed systems is hard. However, the benefits of distributing load can be large given the relative ease and low cost with which distributed systems can be built. Load distribution has been the focus of increasing attention because of its practical importance and conceptual challenge. Dejan Milojicic has undertaken to demonstrate the practicality of implementing load distribution schemes on top of a modern microkernel architecture (in this instance, Mach). This work is novel and has important implications for the future development of distributed systems based on microkernels. His book is now available from Vieweg Verlag, Wiesbaden, Germany, 149 pages. Although originally written as a Ph.D. thesis at the University of Kaiserslautern, Milojicic's book may be enjoyed by any thoughtful computer engineer or scientist. The reader should only beware that the book is quite detailed. "Load Distribution" begins by defining terms and surveying earlier efforts in the field. This survey might well serve serve as a comprehensive introduction to these topics for an advanced operating systems course. The book then moves on to a discussion of the Mach microkernel and Milojicic's proposed load distribution scheme. This scheme rests on task migration, which is analyzed in detail. Milojicic recognizes that all load distribution problems have a fundamental dependency on gathering load information across a distributed system. The book devotes an entire chapter to this problem. An argument is constructed about addressing load information in a microkernel environment: that is, phrasing load information as a function of information about microkernel-provided abstractions such as IPC and VM. A complete, Mach-based solution is provided. Load distribution also requires a distributed scheduling solution. In the presence of load information, a number of different policies may be used for selecting the targets of task migration. The choice of policy has a profound effect on the resulting system performance. Milojicic analyzes several strategies and measures the results. In sum, "Load Distribution" is an excellent overview of the development of the art of load distribution. Dejan Milojicic advances the state of that art by analyzing the impact of load distribution on distributed systems composed of microkernels. The resulting book makes for fascinating and comprehensive reading in this area. Alan Langerman __________________________________________________________________ Alan Langerman is president of Orca Systems, Inc., a Mach- and Unix-consulting company. Orca's clients have included, among others, the Open Software Foundation, Digital Equipment Corporation, the Advanced Computer Research Institute, Encore Computer Corporation and mt Xinu. Mr. Langerman has written and lectured on abstractions and implementations for efficient operating system support of symmetric shared multiprocessors. He currently leads a project implementing fully flow-controlled, high-speed distributed Mach IPC. ***************************************************************** Review by Brent Welch Load Distribution Implementation for the Mach Microkernel by Dejan Milojicic The book "Load Distribution, Implementation for the Mach Microkernel" by Milojicic provides a good survey of research in the areas of process and task migration mechanisms, load information management, and distributed scheduling policies. The book provides a description of a load distribution system based on top of the Mach microkernel. The author provides a candid assessment of both the accomplishments and difficulties of the project. Detailed performance results are presented that show the cost of task migration based on the number of threads, size of address space, and number of communication ports associated with the task. In addition to the results of his own work, the author does a good job of presenting related work and drawing comparisons to other systems. Brent Welch Xerox-PARC _________________________________________________________________ Brent Welch has been a member of the research staff at the Xerox Palo Alto Research Center (PARC) for 4 years, ever since he received his Ph.D. from the University of California at Berkeley in 1990. His fields of research include distributed computer systems and operating systems. Dr. Welch is a member of the IEEE Computer Society and the Association of Computing Machinery (ACM), and regularly reviews articles for publication in their journals. ***************************************************************** Table of contents 1 Introduction 1.1 Motivation 1.2 Load Distribution 1.3 Research Contributions 1.4 Thesis Outline 2 Background and Related Work 2.1 Introduction 2.2 Migration 2.2.1 Design 2.2.2 Issues 2.2.3 Previous Work 2.3 Load Information Management 2.3.1 Design 2.3.2 Issues 2.3.3 Previous Work 2.4 Distributed Scheduling 2.4.1 Design 2.4.2 Issues 2.4.3 Previous Work 2.5 Summary 3 Mach and Load Distribution 3.1 Introduction 3.2 Mach 3.3 Mach NORMA Version 3.4 Mach Support for Load Distribution 3.5 Load Distribution Architecture, Overview 3.6 Summary 4 Task Migration 4.1 Introduction 4.2 General Principles and Architecture 4.3 Requirements for Microkernels 4.4 Implementation 4.4.1 Task Migration Algorithm 4.4.2 Necessary Modifications to the Mach Microkernel 4.4.3 Simple Migration Server 4.4.4 Optimized Migration Server 4.4.5 In-Kernel Task Migration 4.5 Performance Measurements 4.5.1 Migration Server Measurements 4.5.2 WPI Benchmarks 4.5.3 Parallel Make and Other Applications 4.6 Related Work 4.7 Summary 5 Load Information Management 5.1 Introduction 5.2 Load Information Collection 5.2.1 Information on Processing 5.2.2 Information on Network IPC 5.2.3 Information on XMM 5.3 Information Dissemination and Negotiation 5.4 Performance Measurements 5.5 Characterization of Mach Applications 5.6 Summary 6 Distributed Scheduling 6.1 Introduction 6.2 Distributed Scheduling Algorithms 6.3 Artificial Load 6.4 Performance Measurements 6.4.1 Comparison of Various Strategies 6.4.2 Considering Information on Communication 6.4.3 Task Migration v. Initial Placement 6.5 Summary 7 Lessons Learned 7.1 Introduction 7.2 Task Migration Implementation 7.3 Task Migration is not Always Enough 7.4 Task Migration v. Initial Placement 7.5 Microkernels are the Right Level for LD 7.6 Experiences with Network IPC 7.7 Fault Tolerance is Hard to Support 7.8 Summary 8 Conclusions and Future Work 8.1 Introduction 8.2 Summary of Results 8.3 Future Work 8.4 Conclusion Bibliography Index ***************************************************************** The following is full reference for the book, followed by the list of representatives through whom you can order the book. Dejan S. Milojicic, "Load Distribution, Implementation for the Mach Microkernel", Vieweg, 1994. (ISBN 3-528-05424-7) USA: Ballen Booksellers International, Inc. Dept. Vieweg 125 Ricefield Lane Hauppauge, N.Y. 11788 tel: (516) 543 5600 fax: (516) 864 5850 France: Teknea Distribution Vieweg 203, avenue de Fronton F-31200 Toulouse tel: 61 57 29 69 fax: 61 57 47 11 Great Britain: Momenta Publishing LTD Broadway House The Broadway Wimbledon, London SW19 1RH tel/FAX 081 542 2465 Germany: Vieweg Publishing P.O. Box 58 29 D-65048 Wiesbaden tel: + 49 611 160220 fax: + 49 611 160229 Japan: Eastern Book Services, Inc. 37-3 Hongo 3-chome, Bunkyo-Ku, Tokyo 113 tel: + 81 3 3818-0861 fax: + 81 3 3818-0864 *****************************************************************