NHSE ReviewTM: Comments · Archive · Search
Perhaps more than any other user community, HPC programmers have
been faced with a recurring paradox:
One result is that, with few exceptions, the software suites provided on recent parallel and clustered computer systems have included at least an interactive debugger, a tool measuring program timing characteristics, and a tool measuring one or more other types of program performance (e.g., message passing, instruction counts, memory use, I/O). Additional performance analysis tools are widely available as shareware. Evidence shows, however, that HPC application developers simply aren't using the current generation of debugging and tuning tools. [19]
There are a number of reasons for this. At the Second Pasadena Workshop on System Software and Tools for Parallel Computing Systems, a working group was convened to address the issue of "Usability of HPC System Software and Tools". The group included users from research institutions and from third-party applications developers, as well as software tool developers from the HPC industry.
The users were asked to identify reasons why they don't use current software tools. Seven reasons dominated:
In turn, tool developers were asked why they don't appear to respond to user criticisms of system software and tools. They responded:
It is important to appreciate that the nature of HPC applications has changed over the last decade. As competition for HPC resources has increased, applications have come to revolve around the concept of "portability." This is a somewhat over-generalized term that actually encompasses three requirements. First, parallel programmers are concerned with the need to migrate existing codes successfully to new and better systems as they become available. The reality of today's parallel computing marketplace is that hardware and systems software change almost constantly. By the time an application is ready for production-level use, the platform for which it was developed will have been superseded. Alternatively, the best performance may be achievable only if the application can make use of multiple platforms (e.g., data filtering on a distributed-memory system, followed by intensive computation on an SMP, followed by visualization on a specialized workstation), in which case individual portions of the application may be migrated to different targets.
The skyrocketing popularity of network-based (heterogeneous or clustered) parallelism imposes a second requirement for programming support. In some situations, such environments offer a mechanism for consuming so-called "wasted cycles" when machines are idle or under-utilized. At other sites, network-based systems provide alternate environments for executing parallel applications when the primary target machine is unavailable due to competing demands. Programmers are now demanding the ability to transport codes (i.e., port repeatedly without sacrificing performance) across a spectrum of computers or system configurations.
The third requirement is support for distributed development; that is, coding, compiling, and even debugging applications on systems other than the target parallel machine. This is crucial for the future of parallelism, since it decreases the competition for costly HPC resources by off-loading non-HPC tasks. Using portable languages and tools, users can develop and test applications on a serial workstation or a smaller, less expensive HPC system, later moving them to the final target platform.
The clear implication is that HPC tools need to be machine-independent, or at least available on multiple platforms. Like it or not, most HPC programmers will end up working on a number of very different machine platforms over time. The investment in learning to use a tool (or other piece of system software) probably will not be warranted unless the tool is supported on more than one platform and behaves consistently across platforms. Such cross-platform and cross-vendor consistency can only be achieved through formal or informal standardization.