| <- PREV | Index | Next -> |
NHSE ReviewTM: Comments
· Archive
· Search
Software for managing clusters and environments that utilise clusters has become very popular in the last decade, as this review can verify. Anyone taking a brief look through this review will quickly establish that a large percentage of the packages described are interesting research projects that will probably go nowhere. However, work on these projects will feed into other projects, which will further our knowledge and understanding of the problems and needs associated with cluster computing.
The importance of CMS packages for the commercial world can be readily seen from the fact that most of the major computer vendors are now involved in, or support, one or more of the packages available. It is also clear that the limited finances of many institutions have forced them to reassess how their present computing facilities are being used and to look at ways of utilising their resources more efficiently and effectively. There is clear evidence that both the commercial and the research/academic communities are becoming increasingly interested in CMS.
It is not clear that CMS is actually being used to take advantage of spare CPU cycles, but it is evident that considerable effort is being expended on load balancing the work that needs to be done on networks of workstations, thereby increasing the overall throughput of the cluster.
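As a rough illustration of what load balancing work across a network of workstations can mean, the sketch below places each job on whichever workstation currently has the least accumulated work. It is not taken from any package in this review: the host names, job names and cost figures are hypothetical, and the longest-job-first greedy rule is simply one common heuristic.

```python
import heapq

def assign_jobs(job_costs, hosts):
    """Greedily place each job on the host with the least accumulated work.

    job_costs: dict of job name -> estimated run time (hypothetical units)
    hosts:     list of workstation names (hypothetical)
    Returns (placement, makespan), where makespan is the load on the
    busiest host after all jobs have been placed.
    """
    heap = [(0.0, host) for host in hosts]
    heapq.heapify(heap)
    placement = {host: [] for host in hosts}
    # Consider the longest jobs first, so short jobs can fill the gaps.
    for job, cost in sorted(job_costs.items(), key=lambda kv: -kv[1]):
        load, host = heapq.heappop(heap)      # least-loaded host so far
        placement[host].append(job)
        heapq.heappush(heap, (load + cost, host))
    makespan = max(load for load, _ in heap)
    return placement, makespan

jobs = {"a": 4, "b": 3, "c": 2, "d": 1}
placement, makespan = assign_jobs(jobs, ["ws1", "ws2"])
```

With these made-up figures the two workstations each end up with five units of work, whereas running everything on one machine would take ten.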
An aim of this report has been to highlight the issues to be considered when choosing a CMS. This knowledge can then be cross-checked against the features provided by each CMS package. Choosing the best CMS package for a particular site is not an easy task. A pragmatic assessment of the site's needs, and a profile of the applications typically run there, must be made. The site administrator can then use these to judge which package would be the most appropriate for his/her site.
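The cross-check described above can be sketched as a simple set comparison between a site's required criteria and the features each package claims. The package names, feature labels and required criteria below are placeholders chosen for illustration; they are not the criteria or the assessments used in this review.

```python
# Hypothetical site requirements (not the review's criteria list).
REQUIRED = {"batch", "parallel", "checkpointing"}

# Hypothetical packages and the features each one claims to support.
PACKAGES = {
    "pkg_a": {"batch", "parallel", "checkpointing", "gui"},
    "pkg_b": {"batch", "interactive"},
    "pkg_c": {"batch", "parallel", "checkpointing"},
}

def shortlist(packages, required):
    """Names of packages whose claimed features cover every requirement."""
    return sorted(name for name, feats in packages.items()
                  if required <= feats)

def missing(packages, required):
    """For each rejected package, the requirements it does not cover."""
    return {name: sorted(required - feats)
            for name, feats in packages.items()
            if not required <= feats}

candidates = shortlist(PACKAGES, REQUIRED)
gaps = missing(PACKAGES, REQUIRED)
```

A screen of this kind only records what each package claims to support, so the packages that survive it still need to be tried against the site's real workload.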
Without testing, the fact that a package supports a particular feature does not necessarily mean that its functionality will meet the demands of a site's administrator. For example, a package may support parallel jobs, but it may be impossible to configure the queues to run the jobs sensibly - because of the need to wait for sufficient resources, or for numerous other reasons.
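The parallel-job difficulty can be illustrated with a toy model of two queue policies. It does not reproduce the behaviour of any package in this review; the job names, node counts and both policies are assumptions made for the example. Three nodes are free, and a parallel job needing four nodes sits at the head of the queue.

```python
def start_strict_fifo(free_nodes, queue):
    """Start jobs in queue order, stopping at the first that does not fit."""
    started = []
    for name, nodes in queue:
        if nodes > free_nodes:
            break                      # head of queue blocks everything behind it
        started.append(name)
        free_nodes -= nodes
    return started, free_nodes

def start_first_fit(free_nodes, queue):
    """Start every queued job that fits, skipping those that do not."""
    started = []
    for name, nodes in queue:
        if nodes <= free_nodes:
            started.append(name)
            free_nodes -= nodes
    return started, free_nodes

# Hypothetical queue: (job name, nodes required).
queue = [("par", 4), ("seq1", 1), ("seq2", 1)]

fifo_result = start_strict_fifo(3, queue)
first_fit_result = start_first_fit(3, queue)
```

Under strict ordering nothing starts and three nodes sit idle; under first fit the sequential jobs start, but the parallel job may wait indefinitely if small jobs keep arriving. Neither outcome is sensible, which is why the behaviour needs to be tested rather than inferred from a feature list.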
In section 4.3 the nineteen CMS packages were assessed by comparing their functionality against the authors' list of highly desirable criteria. This narrowed the list down to six: two public domain and four commercial packages. It was recommended that, if finances permit, it would be wise to choose one of the commercial packages. The reason for this choice was that this path should minimise the amount of time and effort expended by staff at a site to understand, set up, install and run the package. The onus will be on the vendors to ensure that a site has a smooth passage with their software. It was also stated that the final decision as to which package to purchase would be an economic one: each vendor is trying to sell software that provides the same cluster services, and therefore price/performance would be the deciding factor.
The CMS looked at in this review exhibited a number of rather important omissions, namely:
DQS and Codine are probably the most comprehensive, functional and commonly used CMS packages available. Both are available on most major platforms and support sequential and parallel jobs. Codine is a well-supported commercial package, whereas DQS is still a research project, so the level of support that may be needed cannot be guaranteed.
Many of the commercial CMS packages provide better user support for one particular vendor's platform than for the other platforms they are supposed to support equally.
Generic NQS, now supported by the University of Sheffield, is a very mature package which has been widely used by numerous large sites - CERN in particular. NQS protocols are also well established - packages that support these protocols exhibit additional functionality and a degree of compatibility.
Many interesting projects are emerging. In particular the resources and effort being put into NOW can only be admired. The NOW project [3] is addressing many of the key areas that are seen as problem areas with CMS. The WANE project [3], which is integrating a number of packages into one environment, has high ambitions and should be worth watching.
There does not seem to be enough emphasis on utilising commodity PCs. It is obvious that these are the most common machines available and that they are a huge untapped source of additional CPU power. Projects involving multi-tasking PC operating systems such as Windows-NT and Linux should certainly be encouraged.
The popularity of the WWW along with the increasingly functional and maturing tools [29] indicates that future successful cluster management systems will be based on this technology.
Presently no integrated WWW-based cluster management system exists, but the fundamental infrastructure needed to produce such a system is already in place. It should be a relatively simple exercise to use the WWW as a uniform interface to administer and run applications on a heterogeneous computing environment.