NHSE Review™ 1996 Volume First Issue

Cluster Management Software



Chapter 5 -- Conclusions

5.1 General Comments

The information collected and presented in this review is just a `snap-shot' of the CMS packages available and found, via the Internet, during the Summer of 1995. Without physically setting up, installing and testing these software packages on a number of different platforms, it is impossible to determine objectively which is the best amongst them. Even if there were time to do so, a number of issues, such as ease of use, configurability and user support, would also need to be assessed, making the whole exercise rather subjective and thus difficult to quantify.

Software for managing clusters, and environments that utilise clusters, has become very popular in the last decade, as this review can verify. Anyone taking a brief look through this review will quickly establish that a large percentage of the packages described are interesting research projects that will probably go nowhere. Work on these projects will, however, feed into other projects which will further our knowledge and understanding of the problems and needs associated with cluster computing.

The importance of CMS packages for the commercial world can be readily seen from the fact that most of the major computer vendors are now involved in, or support, one or more of the packages available. It is also clear that the limited finances of many institutions have forced them to reassess how their present computing facilities are being used and are making them look at ways of utilising their resources more efficiently and effectively. There is clear evidence that both commercial and research/academic communities are becoming increasingly interested in CMS.

It is not clear that CMS is actually being used to take advantage of spare CPU cycles, but it is evident that large efforts are being expended in trying to load balance the work that needs to be done on networks of workstations, thus increasing the overall throughput of the cluster.

An aim of this report has been to highlight the issues to be considered when choosing a CMS. This knowledge can then be cross-checked against the features provided by each CMS package. Choosing the best CMS package for a particular site is not the easiest task. A pragmatic look at a site's needs must be taken, and a profile of the typical applications that run there must be made. The administrator of a site can then use these to best judge which package would be the most appropriate for his/her site.

It is clear that, without testing, the fact that a package supports a particular feature does not necessarily mean that its functionality will meet the demands of the administrator of a site. For example, a package may support parallel jobs, but it may be impossible to configure the queues to run the jobs sensibly - due to the need to wait for sufficient resources or for numerous other reasons.

In section 4.3 the nineteen CMS packages were assessed by comparing their functionality against the authors' highly desirable criteria list. This narrowed the list down to six: two public domain and four commercial packages. It was recommended that, if finances permit, it would be wise to choose one of the commercial packages. The reason for this choice was that this path should minimise the amount of time and effort that is expended by staff on a site to understand, set up, install and run the package. The onus will be on the vendors to ensure a site has a smooth passage with their software. It was also stated that the final decision as to which package to purchase would be an economic one, as each vendor is trying to sell software that provides the same cluster services and therefore price/performance would be the deciding factor.

5.2 Omissions in the CMS Packages

The CMS looked at in this review exhibited a number of rather important omissions, namely:

5.3 A Step-by-Step Guide to Choosing a CMS package

  1. Assess the needs of your site - class them as mandatory, desirable and useful.
  2. Produce a list of commonly used packages and applications that would be typical jobs on your designated cluster.
  3. Cross match the chosen criteria (see 1.) against the ones shown for the CMS packages being assessed.
  4. Contact the authors or vendors of the packages you have come up with and ask them for further details and contacts at reference sites.
  5. Read and digest details.
  6. Compose a list of questions to ask the reference site and contact the reference site - maybe visit for a demonstration and to ask further questions.
  7. Compose a list of questions for the authors/vendors and contact them.
  8. When satisfied with a particular package, negotiate a demonstration software license for 30-60 days.
  9. Install the software on a designated test cluster - make full use of the user support provided to get an idea of its efficiency and effectiveness at solving your problems.
  10. Test the CMS package rigorously by running your typical applications (see 2).
  11. Test its fault tolerance and configurability.
  12. Assess the features, functionality and usability of the CMS and decide whether it fits your needs.
  13. Try to test alternative packages if unhappy.
  14. Purchase the package that fulfils your site's needs (obviously).
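
Steps 1 and 3 above can be sketched in a few lines of Python. This is only an illustrative sketch, not a method taken from the review: the package names, feature names and scoring weights are all hypothetical, and the rule used (discard any package that lacks a mandatory feature, then rank the remainder by how many desirable and useful features they offer) is just one reasonable reading of "cross match".

```python
from dataclasses import dataclass, field

# Hypothetical weights: mandatory needs act as a filter, so only the
# other two classes from step 1 contribute to the score.
WEIGHTS = {"desirable": 2, "useful": 1}


@dataclass
class Package:
    name: str
    features: set = field(default_factory=set)


def shortlist(requirements, packages):
    """Return (name, score) pairs for packages that meet every
    mandatory requirement, highest score first."""
    mandatory = {f for f, cls in requirements.items() if cls == "mandatory"}
    ranked = []
    for pkg in packages:
        if not mandatory <= pkg.features:
            continue  # lacks a mandatory feature: drop it (step 3)
        score = sum(
            WEIGHTS.get(cls, 0)
            for feature, cls in requirements.items()
            if feature in pkg.features
        )
        ranked.append((pkg.name, score))
    # Highest score first; ties broken alphabetically by name.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Step 1: the site's needs, classed as mandatory, desirable or useful.
# These entries are invented for illustration only.
requirements = {
    "parallel jobs": "mandatory",
    "load balancing": "mandatory",
    "checkpointing": "desirable",
    "GUI": "useful",
}

# Hypothetical packages; real feature lists would come from the
# vendors and reference sites (steps 4 to 7).
packages = [
    Package("Package A", {"parallel jobs", "load balancing", "GUI"}),
    Package("Package B", {"parallel jobs", "checkpointing"}),
    Package("Package C", {"parallel jobs", "load balancing",
                          "checkpointing", "GUI"}),
]

for name, score in shortlist(requirements, packages):
    print(name, score)
```

A ranking of this kind only narrows the field. As noted in section 5.1, a feature that is listed as supported may still fail to meet a site's demands, so the hands-on testing in steps 8 to 12 remains essential.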

5.4 Some Personal Views

DQS and Codine are probably the most comprehensive, functional and commonly used CMS packages available. Both are available on most major platforms and support sequential and parallel jobs. Codine is a well supported commercial package, whereas DQS is still a research project and the level of support that may be needed cannot be guaranteed.

Many of the commercial CMS packages will provide better user support for one particular vendor's platform than for the other platforms that they are supposed to support equally.

Generic NQS, now supported by the University of Sheffield, is a very mature package which has been widely used by numerous large sites - CERN in particular. NQS protocols are also well established - packages that support these protocols exhibit additional functionality and a degree of compatibility.

Many interesting projects are emerging. In particular the resources and effort being put into NOW can only be admired. The NOW project [3] is addressing many of the key areas that are seen as problem areas with CMS. The WANE project [3], which is integrating a number of packages into one environment, has high ambitions and should be worth watching.

There does not seem to be enough emphasis on utilising commodity PCs. It is obvious that these are the most common machines available and that they are a huge untapped source of additional CPU power. Projects involving multi-tasking PC operating systems such as Windows-NT and Linux should certainly be encouraged.

5.5 The Future?

The popularity of the WWW along with the increasingly functional and maturing tools [29] indicates that future successful cluster management systems will be based on this technology.

Presently no integrated WWW-based cluster management system exists, but the fundamental infrastructure needed to produce such a system is already in place. It should be a relatively simple exercise to use the WWW as a uniform interface to administer and run applications on a heterogeneous computing environment.

Finally, it seems likely that the experience gained from the packages reviewed in this document will be used to produce a WWW-based system in the very near future.




Copyright © 1996 NHSE Review™ All Rights Reserved.
Lowell W Lutz (lwlutz@rice.edu) NHSE Review™ WWWeb Editor