NHSE Review 1996 May Article: Cluster Management Software -- Chapter 3 -- Cluster Management Software Packages

NHSE Review^TM 1996 Volume First Issue

Cluster Management Software

| <- PREV | Index | Next -> |
NHSE Review^TM: Comments · Archive · Search

Chapter 3 -- Cluster Management Software Packages

Commercial Packages               Vendor
Codine - Computing in Distributed GENIAS GmbH, Germany
         Network Environment
Connect:Queue                     Sterling Corp., USA
CS1/JP1                           Hitachi & Cummings Group, USA
Load Balancer                     Unison Software, USA
LoadLeveler                       IBM Corp., USA
LSF - Load Sharing Facility       Platform Computing, Canada
NQE - Network Queuing Environment Craysoft Corp., USA
Task Broker                       Hewlett-Packard Corp.

Research Packages                 Institution
Batch                             UCSF, USA
CCS - Computing Centre Software   Paderborn, Germany
Condor                            Wisconsin State University,
                                  USA
DJM - Distributed Job Manager     Minnesota Supercomputing
                                  Center
DQS 3.x                           Florida State University,
                                  USA
EASY                              Argonne National Lab, USA
far                               University of Liverpool, UK
Generic NQS                       University of Sheffield, UK
MDQS                              ARL, USA
PBS - Portable Batch System       NASA Amass & LLNL, USA
PRM - Prospero Resource Manager   University of S. California
QBATCH                            Vita Services Ltd., USA

3.1.1 Introduction

The aim of this chapter is to provide a brief description of each of the CMS packages, listed in the table above. The descriptions consist of information coalesced from vendor publicity, user guides and on-line (WWW) documents.

At least two other CMS packages; Balens (VXM Technologies Inc. USA) and JP1 (Hitachi Inc.) - formerly known as the NC Toolset. Some difficulty has been found in trying to get further information about these packages, but when found it will be added to this review.

3.2 Commercial Packages

3.2.1 Codine

URL http://www.genias.de/genias/english/codine/Welcome.html

Codine [12] is a software package targeted at utilising heterogeneous networked environments, in particular large workstation clusters with integrated compute servers, like vector and parallel computers. Codine provides a batch queuing framework for a large variety of architectures via a GUI-based administration tool. Codine also provides dynamic and static load balancing, checkpointing and supports batch, interactive and parallel jobs.

Main Features of Codine

Support for batch, interactive and parallel (Express, p4 and PVM) jobs.
Support for multiple queues.
Support for checkpointing.
Static load balancing.
Dynamic load balancing by checkpointing and job migration.
NQS interface -- used for integration with existing NQS-based systems.
Accounting and utilisation statistics.
X11 Motif GUI, command-line and script interfaces for administrators and users.
POSIX compliance.
Support for DCE technology.

3.2.2 Connect:QUEUE

URL http://www.sterling.com/

This package is a commercial variation of NQS (until recently know as Sterling NQS/Exec) that is commercially marketed and supported by Sterling Software Inc. Its feature and functionality are very similar to GNQS.

The Package provides a Unix batch and device queuing facility capable of supporting wide range of Unix-based platforms. It provides three queue types:

Batch queue, providing a percentage mechanism for scheduling and running jobs;
Device queue, which provides batch access to physical devices;
Pipe queue, which transports requests to other batch, device or pipe queues at remote locations.

An intelligent batch job scheduling system provides job load balancing across the workstation clusters being managed. Load balancing is based on a scheduling algorithm which uses three statistics:

Percentage of physical memory used;
CPU utilisation;
Queue job limit.

[Editor's Note: CONNECT:Queue is no longer being actively promoted by Sterling Commerce (SC). Although SC is committed to supporting its existing CONNECT:Queue customers, SC is attempting to upgrade them, as well as sell to new prospects, a much more robust job scheduling and workload balancing software called JP1. JP1 is manufactured by Hitachi and SC is presently the sole distributor of this product in the U.S.]

3.2.3 CS1/JP1

URL http://www.zoosoft.com/jp1/sysmanhu.html

To be completed.⁽¹⁾

3.2.4 Load Balancer

URL http://www.unison.com/main-menu/products/operations/loadbalancer/LoadBalancer.html

Load Balancer attempts to optimise the use of computer resources by distributing workloads to available UNIX systems across the network. Load Balancer determines availability based on the:

Capacity and current workload of each computer.
Resources required by the submitted job.
User defined and other constraints.

Load Balancer tries to increase the overall job throughput by making use of idle computer resources. At the same time, it attempts to prevent systems becoming overloaded by distributing the load evenly across the available computers.

Major Features

No modification to applications or kernel.
Central control and administration.
Policy and constraint options.
Job/application distribution criteria include:
- CPU Speed.
- Current load average.
- Time of day.
- RAM and Swap space.
- CPU type (HP, IBM, etc.).
- Presence and absence of interactive user(s).
- Other user-defined criteria.
Robust and scalable.
Works with both batch jobs and interactive applications.
Security features are built-in to restrict unauthorised access.

3.2.5 LoadLeveler

URL http://www.rs6000.ibm.com/

LoadLeveler is a job scheduler that distributes jobs to a cluster of workstations and/or to nodes of a multi-processor machine [13]. LoadLeveler decides when and how a batch job is run based on preferences set up by the user and system administrator. Users communicate with LoadLeveler using a few simple LoadLeveler commands, or by using the LoadLeveler GUI. Jobs are submitted to LoadLeveler via a command file, which is much like a UNIX script file. Jobs are held in the queue until LoadLeveler can allocate the required resources to run the job. Once the job has completed, LoadLeveler will (optionally) notify the user. The user does not have to specify what machines to run on, as LoadLeveler chooses the appropriate machines.

Features

Distributed job scheduler.
Serial/parallel batch and interactive workload.
Workload balancing.
Heterogeneous workstation support.
Compatibility with NQS.
Central point of control.
Scalability (queues and master schedulers).

When a job is scheduled, its requirements are compared to all the resources available to LoadLeveler. Job requirements might be a combination of memory, disk space, architecture operating system and application programs. LoadLeveler's central manager collects resource information and dispatches the job as soon as it locates the suitable resources. Load Leveler accepts shell scripts written for NQS so that jobs can be run under LoadLeveler or under NQS-based systems. LoadLeveler also provides a user or system-initiated checkpoint/restart capability for certain types of Fortran or C jobs linked to the LoadLeveler libraries.

Interactive session support - LoadLeveler's interactive session support feature allows remote TCP/IP applications to connect to the least loaded cluster resource.

Individual control - Users can specify to LoadLeveler when their workstation resources are available and how they are to be used.

Central control - From a system management perspective, LoadLeveler allows a system administrator to control all the jobs running on a cluster. Job and machine status are always available, providing administrators with the information needed to make adjustments to job classes and changes to LoadLeveler controlled resources.

Scalability - As workstations are added, LoadLeveler automatically scales upward so the additional resources are transparent to the user.

LoadLeveler has a command line interface and a Motif-based GUI. Users can:

Submit and cancel jobs.
Monitor job status.
Set and change job priorities.

3.2.6 Load Sharing Facility (LSF)

URL http://www.platform.com/products/overview.html

LSF is a distributed load sharing and batch queuing software package for heterogeneous Unix environments. LSF manages job processing by providing a transparent, single view of all hardware and software resources, regardless of systems in the cluster. LSF supports batch, interactive and parallel jobs and manages these jobs on the cluster making use of idle workstations and servers.

Specifically, LSF provides the following:

Batch job queuing and scheduling.
Interactive job distribution across the network.
Load balancing across hosts.
Transparent access to heterogeneous resources across network.
Site resource sharing policy enforcement.
Network-wide load monitoring.

Other LSF Features

Comprehensive load and resource information.
Transparent remote execution. The same execution environment (user ID, umask, resource limits, environment variables, working directory, task exit status, signals, terminal parameters, etc.) are maintained when jobs are sent to execute remotely.
Batch and interactive processing.
Sequential and parallel processing.
Heterogeneous operation.
Application Programming Interface (API) - allowing software be developed for new distributed applications.
Fault tolerance - LSF is available as long as one host is up and no job is lost even if all hosts are down.
Site policy options - provides configuration options for sites to customise their own resource sharing policies.
Job accounting data and analysis tools.

3.2.7 Network Queuing Environment (NQE)

URL http://www.cray.com/PUBLIC/product-info/sw/nqe/nqe30.html

NQE [14] provides a job management environment for the most popular workstation by distributing jobs to the most appropriate Unix systems available in a heterogeneous network, allowing users to share resources. NQE is compatible with Network Queuing System (NQS) software, but has a functionality that exceeds the basic NQS capability. NQE has the following features:

Automatic load balancing across an entire network.
Parallel Virtual Machine (PVM) support.
Job-dependency capability that allows users to specify the criteria that must be met before a specific job may be initiated.
Automatic job routing to systems that match user-specified attributes.
Unattended and secure file transfer to/from any client using the File Transfer Agent (FTA).
Batch job submission from remote locations, using the NQE client software.
Compatibility with other Unix systems running public domain versions of NQS.
Security features and multiple authentication mechanisms.
Motif GUI for network system and NQE job, installation, and on-line documentation.
Automatic placement of jobs in wait queues for systems not available.
Multiple job and administration logs.
Application programming interfaces (APIs) for customising applications to take advantage of NQE's functionality.
World Wide Web interface to submit NQE jobs, view job status, and receive back output.

3.2.8 Task Broker

URL http://www.hp.com:80/wsg/ssa/task.html

Task Broker is a software tool that attempts to distribute computational tasks among heterogeneous UNIX-system-based computer systems. Task Broker performs its computational distribution without requiring any changes to the application. Task Broker will relocate a job and its data according to rules set up at initialisation. The other capabilities provided by Task Broker include:

The ability to balance computational load among a group of computers systems.
Intelligent targeting of specialised jobs so that they run on the most appropriate server for the specific task. For example, a graphics application may be most efficiently run on a machine with a graphics accelerator - the user need have no machine-specific knowledge.
DCE inter-operability.
Task Broker automates the distribution of computations by:
- Gathering machine-specific knowledge from the user.
- Analysing machine-specific information and selecting the most available server.
- Connecting to a selected server via telnet, rsh (remote shell), or crp (create remote process).
- Copying program and data files to a selected server via ftp or NFS.
- Invoking applications over the network.
- Copying the resulting data files back from the server via ftp or NFS.

Each of the above steps is done automatically by Task Broker without the user needing to be aware of, or having to deal with, the details of server selection and data movement.

Task Broker Features:

Runs without requiring application code to be recompiled.
Adds servers and clients to a network simply by hooking them to a LAN, updating the central configuration file to include the new client server, and starting the Task Broker daemon.
Provides an accounting of services used on a given server.
Allows limiting of the number of applications running on a computer at any one time. This prevents degradation of server performance.
Permits control of the use of computers as servers. For example, users can specify that their workstations be accessed only at night.
GUI provides a visual interface to most of the Task Broker command set and configuration information. In addition, task status monitoring and control are provided for the user.
An on-line, context-sensitive help system.

3.3 Research Packages

3.3.1 Batch

URL none

This software [15] is designed to manage multiple batch job queues under the control of a daemon process. This daemon controls the batch jobs through the use of the BSD job control signals, while the client programs, batch, baq, and barm provide for submission, examination and removal of jobs, respectively.

The capabilities include:

Start and suspend running jobs at specific times and/or days.
Start and suspend running jobs at specific machine loads.
Start and suspend jobs based on whether someone is using the console.
Logging to the system log at varying levels. Statistics and database information are also available via signals sent to the daemon process.
Accounting of jobs for each queue.
Remote queues. Queues may be localised to a specific machine.
Load balanced queues. Given a set of hosts supporting the same queue, the machines (batch daemons) decide amongst themselves which host will run the current job, based on load averages, currently running jobs and other details.
Job runtime limitations.
Run jobs with specific nice values.
Limited console user control over the spooled jobs to minimise the effect of batch jobs on interactive response during work hours.
Ancillary programs for job and queue inspection, removal or manipulation.
A GUI (using the forms library) is also available to the job status and removal tools.

3.3.2 Computing Center Software (CCS)

URL http://www.uni-paderborn.de/pcpc/ccs/

The Computing Center Software [16 & 17] is itself a distributed software package running on the front-end of an MPP system. The Mastershell (MS), which runs on a front-end workstation, is the only user interface to CCS. It offers a limited environment for creating Virtual Hardware Environments (VHE) and running applications in interactive or batch mode.

The Port-Manager (PM) is a daemon that connects the MS, Queue-Manager (QM) and the Machine-Managers (MM) together. The MS can be started manually by the user or automatically by the surrounding Unix system as the user's default login shell. In either case a connection to the PM is established and data identifying the user is transferred. The PM uses this information to initiate a first authorisation, on failure the user session is aborted immediately. Otherwise, the user has the whole command language of the MS at his/her disposal.

If a user requests a VHE consisting of a number of processors in a certain configuration and wants exclusive usage for one hour. The number of VHEs a user can handle simultaneously is only restricted by the limitations of the metacomputer and the restrictions set up by the administrator or given by the operating system. When a VHE is ordered from the MS side the PM checks the user's limitations first, i.e. the maximum number and kind of resources allowed for the requesting user or project. If the request validation is successful, the VHE is sent to the QM.

The QM administers several queues. Depending on priority, time, resource requirements, and the application mode (batch or interactive), a queue for the request is chosen. If the scheduler of the QM decides that a certain VHE should be created, it is sent to the PM. The PM configures the request in co-operation with appropriate MMs and supervises the time limits. In addition, the PM generates operating and accounting data.

The user is allowed to start arbitrary applications within his VHE. The upper level of the three level optimisation policy used by CCS, corresponds to an efficient hardware request scheduling (QM). The mid-level maps requests onto the metacomputer (PM) and the third level handles the system dependent configuration software, optimising request placements onto the system architectures (MMs). This three level policy leads to a high level load balancing within the metacomputer.

3.3.3 Condor

URL http://www.cs.wisc.edu/condor/

Condor [18 & 19] is a software package for executing batch type jobs on workstations which would otherwise be idle. Major features of Condor are automatic location and allocation of idle machines, checkpointing and the migration of processes. All of these features are achieved without any modification to the underlying Unix kernel. It is not necessary for users to change their source code to run with Condor, although programs must be specially linked with Condor libraries.

The Condor software monitors the activity on all the participating workstations in the local network. Those machines which are determined to be idle are placed into a resource pool. Machines are then allocated from the pool for the execution of jobs. The pool is a dynamic entity -- workstations enter when they become idle, and leave again when they get busy.

Design Features

The user only needs to relink their code to run under Condor.
The local execution environment is preserved for remotely executing processes.
The Condor software is responsible for locating and allocating idle workstations.
"Owners" of workstations have complete authority over their own machines.
Jobs are guaranteed to complete eventually.
Local diskspace is not used by remotely executing Condor job.
Condor works outside the kernel, and is compatible with BSD and other Unix derivatives.

3.3.4 Distributed Job Manager (DJM)

URL http://www.msc.edu/msc/docs/djm/

DJM is a job scheduling system designed to allow the use of massively parallel processor (MPP) systems more efficiently. DJM provides a comprehensive set of management tools to help administrators utilise MPP systems effectively and efficiently.

Main features:

Pack jobs into the MPP in order to minimise the number of idle processing elements (PEs). DJM uses a packing algorithm that attempts to select the best physical location and order that the jobs should be run on the MPP.
Ensure that jobs complete. DJM forecasts future job activity to prevent jobs from being killed because of scheduled events, such as a maintenance shutdown.
Select jobs to run based on local site policy. DJM can be configured to determine the order in which jobs are run, including the time of day, the class of user, and the size of the job.
Reserve the MPP for real-time applications, heterogeneous applications, or live demonstrations.
Manage access to the MPP by both interactive and batch users.
Runs transparently underneath NQS (Network Queuing System).

3.3.5 Distributed Queuing System (DQS 3.X)

URL http://www.scri.fsu.edu/~pasko/dqs.html

DQS 3.1 [20] is an experimental Unix based queuing system being developed at the Supercomputer Computations Research Institute (SCRI) at The Florida State University. DQS development is sponsored by the United States Department Of Energy. DQS is designed as a management tool to aid in computational resource distribution across a network. DQS provides architecture transparency for both users and administrators across a heterogeneous environment.

Some features of DQS

GUI-based interface.
GUI-based accounting.
Parallel Make Utility.
Parallel package support: PVM, TCG/MSG, P4/P5 and MPI (planned).
Interactive support.
Scheduler based on Queue-Complexes.
No single point of failure, redundant queue masters are supported.

Qmon - Qmon is a GUI to DQS based on X/Xt. The top module has menus for executing DQS commands and other utility functions. An icon window displays the current states of the queue and, finally, a text output window to record the response of DQS commands launched from qmon.

Qusage - This is a Xt-based accounting package provided by DQS. Accounting information in a variety of forms can be retrieved via Qusage, it features on-line help and Postscript output. All accounting information is stored in one place, making retrieval of accounting information quick and easy.

Dmake - Distributed Make is a generic parallel make utility designed to speed up the process of compiling large packages. Dmake is designed for use with DQS, but can also be easily used as a standalone parallel make utility (separate from DQS). Dmake was developed with simplicity in mind, no daemons or other modifications to network configurations are required.

3.3.6 Extensible Argonne Scheduler System (EASY)

URL http://info.mcs.anl.gov/Projects/sp/scheduler/scheduler.html

[Editor's Note: the next article update will include the new EASYLL, a combination of EASY and LoadLeveler.]

The goals of EASY [21], Argonne National Laboratory's job scheduler, are fairness, simplicity, and efficient use of the available resources. These goals are in conflict, but the scheduler is designed to be a compromise. Users will be able to request a set of nodes for any type of use. In order to maintain the quality of machine access, the scheduler provides a single point of access, submit. This program allows users to queue both interactive and batch access jobs. When resources are available, the user is notified by the scheduler and at that time has exclusive access to the number of nodes requested. Having exclusive access to the nodes allows the user to have optimum cache performance and use of all available memory and /tmp disk space. This type of access allows users to run benchmarks at any time and also to predict how long it will take for their job to complete. Having exclusive access is essential so that users can predict wall-clock run time for their jobs when they submit them to the scheduler.

While there are currently no limits to the number or size of jobs that can be submitted, the scheduler uses a public algorithm to determine when batch or interactive time is actually provided. Any modifications to this algorithm will be made public. Argonne has also implemented an allocation policy as a separate part of the scheduler. The intent of the policy is to ensure all users some set amount of resource time and to prevent people from using more than their share of resources.

3.3.7 Far - A Tool for Exploiting Spare Workstation Capacity

URL http://www.liv.ac.uk/HPC/farHomepage.html

This project [22] is being carried out by the Computing Services Department of the University of Liverpool and is funded by the JISC New Technologies Initiative. The far project has developed a software tool to facilitate the exploitation of the spare processing capacity of Unix workstations. The initial aims of the project are to develop a system which would:

Detect lightly loaded workstations.
Provide automatic load balancing.
Allow a flexible workstation access policy.
Allow users to request dedicated clusters of free workstations.
Support a general PVM service.

far provides an environment in which the user could rlogin to, or issue a command via the Unix commands at or rsh which would be run automatically on the most lightly-loaded workstation in the network. The implementation of the system is based on a managed database of current workstation usage which could be inspected to find a suitable workstation. far also supports the exploitation of a network for running message passing parallel programs, i.e. as a loosely-coupled distributed-memory parallel computer. Functionality, such as checkpointing and process migration, has been deliberately omitted from far. far release 1.0 has the following features:

A managed database of workstation loads at any given instant.
Ability to run remote commands on the most lightly loaded workstations.
Support for PVM and other message passing libraries.
System administrator control of remote access to workstations.
Facilities for the dynamic configuration of workstation clusters.
A set of system administrator controls.

3.3.8 Generic Network Queuing System (GNQS)

URL http://www.shef.ac.uk/uni/projects/nqs/

The Networked, Unix based queuing system, NQS [23 & 24], was developed under a US government contract by the National Aeronautics and Space Administration (NASA). NQS was designed and written with the following goals in mind:

Provide for the full support of both batch and device requests. Here a batch request is defined as a shell script containing commands not requiring the direct services of some physical device. In contrast, a device request is defined as a set of independent instructions requiring the direct services of a specific device for execution (e.g. a line printer request).
Support all of the resource quotas enforceable by the underlying Unix kernel implementation that are relevant to any particular batch request, and its corresponding batch queue.
Support the remote queuing and routing of batch and device requests throughout the network of machines running NQS.
Modularise all of the request scheduling algorithms so that the NQS request schedulers can be easily modified on an installation by installation basis, if necessary.
Support queue access restrictions whereby the right to submit a batch or device request to a particular queue can be controlled, in the form of a user and group access list for any queue.
Support networked output return, whereby the stdout and stderr files of any batch request can be returned to a possibly remote machine.
Allow for the mapping of accounts across machine boundaries.
Provide a friendly mechanism whereby the NQS configuration on any particular machine can be modified without having to resort to the editing of obscure configuration files.
Support status operations across the network so that a user on one machine can obtain information relevant to the NQS on another machine, without requiring the user to login on the target remote machine.
Provide a design for the future implementation of file staging, whereby several files or file hierarchies can be staged in or out of the machine that eventually executes a particular batch request.

NQS (modified by Monsanto) has been superseded by Generic NQS 3.4, which in turn is being further developed and supported by the University of Sheffield - URL http://www.shef.ac.uk/uni/projects/nqs/

3.3.9 Multiple Device Queuing System (MDQS)

URL ftp://ftp.arl.mil/arch/

The Multiple Device Queuing System (MDQS) [25] is designed to provide Unix with a fully functional, modular, and consistent queuing system. The MDQS system was designed with portability, expandability, robustness, and data integrity as key goals.

MDQS is designed around a central queue which is managed by a single privileged daemon. Requests, delayed or immediate, are queued by non-privileged programs. Once queued, requests can be listed, modified or deleted. When the requested device or job stream becomes available, the daemon executes an appropriate server process to handle the request. Once activated, the request can still be canceled or restarted if needed.

MDQS can serve as a delayed-execution/batch. MDQS provides the system manager with a number of tools for managing the queuing system. Queues can be created, modified, or deleted without the loss of requests. MDQS recognises and supports both multiple devices per queue and multiple queues per device by mapping input for a logical device to an appropriate physical output device. Anticipating the inevitable, MDQS also provides for crash recovery.

The MDQS system has been developed at the U.S. Army, Ballistics Research Laboratory to support the work of the laboratory and is available to other Unix sites upon request.

3.3.10 Portable Batch System (PBS)

URL http://www.nas.nasa.gov/NAS/Projects/pbs/

The Portable Batch System (PBS) project [26] was initiated to create a flexible, extensible batch processing system to meet the unique demands of heterogeneous computing networks. The purpose of PBS is to provide additional controls over initiating or scheduling execution of batch jobs, and to allow routing of those jobs between different hosts.

PBS's independent scheduling module allows the system administrator to define what types of resources, and how much of each resource, can be used by each job. The scheduling module has full knowledge of the available queued jobs, running jobs, and system resource usage. Using one of several procedural languages, the scheduling policies can easily be modified to suit the computing requirements and goals of any site. The batch system allows a site to define and implement policy as to what types of resources and how much of each resource can be used by different jobs. PBS also provides a mechanism which allows users to specify unique resources required for a job to complete.

A Forerunner of PBS was Cosmic NQS, which was also developed by the NAS program. It became the early standard for batch system under Unix. However Cosmic NQS had several limitations and it was difficult to maintain and enhance.

PBS Supported Provides:

User commands - for manipulating jobs, i.e. submit, deletion, status, etc.
Operator commands - for setting up, starting and stopping PBS.
Administration commands - for managing PBS, i.e. setting up, start/stop queues.
Applications Programming Interface - design applications to run with PBS.
Resource Management - for manipulating PBS to best use the resources available.
Job routing - the process of moving jobs from one destination to another.
Job initiation - the process of selecting and running certain jobs.
File staging - process of managing disk space and moving files to and from the system executing the job.
Job tracking - monitoring the status of jobs submitted remotely.
History File - log maintained by PBS of job resource usage statistics.
Error detection and recovery.
Client software can be installed on all cluster workstations.

3.3.11 The Prospero Resource Manager (PRM)

URL http://nii-server.isi.edu/gost-group/products/prm/

The Prospero Resource Manager (PRM) [27] supports the allocation of processing resources in large distributed systems, enabling users to run sequential and parallel applications on processors connected by local or wide-area networks. PRM has been developed as part of the Distributed Virtual Systems Project at the Information Sciences Institute of the University of Southern California.

PRM enables users to run sequential or parallel jobs on a network of workstations. Sequential jobs may be off loaded to lightly loaded workstations while parallel jobs can make use of a collection of workstations. PRM supports the message passing libraries CMMD and a PVM interface (V3.3.5). PRM also supports terminal and file I/O activity by its tasks, such as keyboard input, printing to a terminal or access to files that may be on a filesystem not mounted by the host on which the task is running. Further more, the components of an application may span multiple administrative domains and hardware platforms, without imposing the responsibility of mapping individual components to nodes on the user.

PRM selects the processors on which the jobs will run, starts the job, supports communication between the tasks that make up the job, and directs input and output to and from the terminal and files on the user's workstation. At the job level, location transparency is achieved through a dynamic address translation mechanisms that translate task identifiers to physical workstation addresses.

PRM's resource allocation functions are distributed across three entities: the system manager, the job manager, and the node manager. The system manager controls access to a collection of processing resources and allocates them to jobs as requested by job managers. Large systems may employ multiple system managers, each managing a subset of resources. The job manager is the principal entity through which a job acquires processing resources to execute its tasks. The job manager acquires resources from one or more system managers and initiates tasks on these workstations through the node manager. A node manager runs on each workstation in the PRM environment. It initiates and monitors tasks on the workstation on which it is running.

3.3.12 Qbatch

URL http://gatekeeper.dec.com/pub/usenet/comp.sources.misc/volume25/QBATCH/

QBATCH is a queued batch processing system for Unix. Each queue consists of a file containing information about the queue itself, and about all jobs currently present in the queue. When the program qp is run for a given queue, it will fork a child process for each job in the queue in turn, and wait for it to complete. If there are no jobs present in the queue, qp will wait for a signal from one of the support programs, which will 'tell' it that another job has joined the queue, or that it should terminate.

Features:

There can be as many queues as the system can support. Queues are named, and run asynchronously. The processing of jobs running in one queue are totally independent of those in any other (subject of course to the independence of potentially shared resources such as data files and devices).

Queues can be started and stopped independently.
The priority of all jobs running are defined when the queue is created.
Jobs running in a queue consist of shell scripts. The jobs run with the uid and gid of the submittor, and run in the environment ruling when the job was submitted.
stdout and stderr of running jobs is redirected to a 'monitor'. This will normally be the queue monitor, but may be specified for individual jobs.
The status and contents of queues can be listed at any time.
If qbatch is configured in the system init script, all queues which were running when the system went down will be restarted. Any jobs present in the queues will also be restarted.
Facilities:
- Inform user when job is complete.
- Cancel jobs in queues, or kill running jobs.
- Change the running order of jobs in queues.
- Suspend and resume processing of jobs.
- Start and stop the queue process engine at any time.
- Sizable submissions to a queue, and re-enable them.
- Stop a queue, and wait until the current job has finished.
- List the status and contents of a queue.
- Repeat (or kill and repeat) the currently running job in a queue.
- Examine or edit jobs in a queue, and examine the monitors.
- Monitor (and perhaps account) the times taken by jobs in the queue.
- Prevent CPU hogging jobs from swamping the system.
- Separate queue for large jobs that is started and stopped by cron out of working hours.
- Protect sensitive applications and data with file and directory permissions.

| <- PREV | Index | Next -> |
NHSE Review^TM: Comments · Archive · Search
NHSE: Software Catalog · Roadmap

Lowell W Lutz (lwlutz@rice.edu) NHSE Review^TM WWWeb Editor