NHSE Review™ 1996 Volume, Second Issue

Comparison of 3 HPF Compilers



Chapter 2 -- Evaluation and Evaluation Criteria

2.1 Introduction

HPF compilers are, in general, fairly "young" compared with compilers for other standardized languages such as Fortran or C. As a result, there is not yet a widely accepted body of practice that serves as a standard yardstick for comparing HPF compilers. This chapter first makes the obvious comparison: which defined HPF language features each reviewed compiler supports. It then tabulates each compiler's capabilities in handling FORTRAN 77 and Fortran 90 features, particularly with respect to parallelism and parallelization. Each compiler also has associated performance analysis tools and some extensions beyond standard Fortran or HPF; some of these are tabulated as well. Finally, some subjective perceptions of the relative strengths of each compiler are given, derived from application codes that have been implemented in or converted to HPF at the CTC, and from inspection of and interaction with CTC users' codes.

Later chapters show in more detail the code each compiler generates for a few small kernels of HPF code. These are discussed to give a better feel for the compilation behavior of each compiler as perceived by a user of the compiler.

A complication in forming evaluation criteria for these compilers is the current distinction between HPF and Subset HPF. Feature comparisons based on the full and Subset languages are further confused because a number of detailed additions and subtractions were made to the contents of the Subset during the interpretations and corrections that led from "HPF 1.0" of May 1993 to the current "HPF 1.1" of November 1994. (Note that the HPF Forum has reconsidered the status of the "Subset HPF" definition: the soon-to-be-adopted proposals for "HPF 2.0" do not use that form of two-level designation.)

It would be desirable in a comparative review paper such as this to present carefully measured data on, for instance, compilation time, size of generated executable files, and run-time execution time for a well-studied collection of benchmark kernels and whole applications. Such measurements have not been made for this paper.

2.2 HPF Language Level -- Subset versus Full

None of these three compilers is completely compliant with, or effective for, the full facilities of Fortran 90 combined with the full HPF language. Two compilers (xHPF and XL HPF) are designated as "Extended Subset HPF" compilers by their vendors; each has significant extensions into facilities of the full language, as well as other departures from the Subset language. It is thus better to describe all these compilers by listing the HPF features they will or will not process than to attempt to categorize them as "full" or "Subset".

2.3 HPF Language Features

The following table lists some major aspects of HPF and briefly notes the presence or absence of each feature in each of the compilers, or indicates restricted, related, or augmented facilities corresponding to the HPF feature.

| Feature | xHPF [2] | pghpf [3], [4], [5] | XL HPF [6], [7] |
|---|---|---|---|
| TEMPLATE, DISTRIBUTE, ALIGN | yes | yes | yes |
| BLOCK or CYCLIC in N dimensions | yes (with default assumptions and keyword extensions) | yes | yes (default preference to N=2 if no PROCESSORS used with DISTRIBUTION) |
| BLOCK(k), CYCLIC(k) | no | yes | partial: CYCLIC(k) is generally treated as CYCLIC(1) |
| PROCESSORS | no (but checked for syntax) | yes | yes |
| DYNAMIC, REDISTRIBUTE, REALIGN | no (but proprietary use of DYNAMIC) | yes | no |
| SEQUENCE, NOSEQUENCE | yes | yes | yes |
| FORALL | statement only | statement and construct | statement and construct |
| INDEPENDENT | yes (omits NEW clause, but is extended in semantics; see also many CAPR$ directives to control parallelization) | yes (extended with ON HOME and REDUCTION clauses) | yes |
| Default Data Mapping for Arrays not named in Directives | Replicated (but note derivation of mappings via "... -Auto ..." parallelization and generated mappings) | Replicated | Replicated |
| HPF Procedure Arguments and Data Mapping facilities | Prescriptive mapping available | Prescriptive and Transcriptive (inherited) mappings available | Prescriptive and Descriptive mappings available |
| HPF Procedure Dummy Argument Data Mapping for Arrays not named in Directives | Procedure dummy argument automatically inherits mapping of actual argument | Local replication re-mapping of dummy arguments not explicitly mapped | Local replication re-mapping of dummy arguments not explicitly mapped |
| HPF intrinsics and HPF_LIBRARY routines | no | yes | only NUMBER_OF_PROCESSORS, PROCESSORS_SHAPE, HPF_ALIGNMENT, HPF_DISTRIBUTION, HPF_TEMPLATE (some with omissions from defined interfaces) |
| EXTRINSIC | (a) omitting routine from translation generates EXTRINSIC-like call; (b) proprietary directive available in caller | EXTRINSIC(HPF_LOCAL), EXTRINSIC(F77_LOCAL) | EXTRINSIC(HPF), EXTRINSIC(HPF_LOCAL), EXTRINSIC(HPF_SERIAL) |
| HPF_LOCAL_LIBRARY routines | no (but proprietary support available for equivalents) | yes (except for LOCAL_TO_GLOBAL) | yes (but with omissions from defined interfaces) |
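The BLOCK, CYCLIC, and CYCLIC(k) entries in the table can be made concrete with a small ownership calculation. The following is a minimal sketch in Python, used here only for illustration. It follows the HPF definitions for a one-dimensional array with 1-based element indices and processors numbered from 1. The function names (`block_owner`, `cyclic_owner`, `layout`) are hypothetical and do not belong to any of the reviewed compilers.

```python
def block_owner(j, n, p, k=None):
    """Owner (1-based) of element j under BLOCK or BLOCK(k) over p processors."""
    if k is None:
        k = -(-n // p)          # default block size: ceiling of n / p
    if k * p < n:
        raise ValueError("BLOCK(k) requires k * p >= n")
    return (j - 1) // k + 1


def cyclic_owner(j, p, k=1):
    """Owner (1-based) of element j under CYCLIC(k); CYCLIC is CYCLIC(1)."""
    return ((j - 1) // k) % p + 1


def layout(owner, n):
    """Owners of elements 1..n as a list."""
    return [owner(j) for j in range(1, n + 1)]


n, p = 8, 2
block = layout(lambda j: block_owner(j, n, p), n)
cyclic = layout(lambda j: cyclic_owner(j, p), n)
cyclic2 = layout(lambda j: cyclic_owner(j, p, k=2), n)

print("BLOCK    ", block)    # [1, 1, 1, 1, 2, 2, 2, 2]
print("CYCLIC   ", cyclic)   # [1, 2, 1, 2, 1, 2, 1, 2]
print("CYCLIC(2)", cyclic2)  # [1, 1, 2, 2, 1, 1, 2, 2]
```

A compiler that treats CYCLIC(k) as CYCLIC(1) would, in this example, place elements according to the second layout where the third was requested. Program results are unaffected, but communication patterns and load balance may differ from what the programmer intended.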


2.4 Other Capabilities

Beyond the details of which HPF features are handled, each compiler has its own manner of handling FORTRAN 77 and Fortran 90 features that may or may not contribute to the parallel execution of a program. The following table briefly notes some of these capabilities.

| Feature | xHPF | pghpf | XL HPF |
|---|---|---|---|
| Generation of Parallel Code for Run-time Determined Number of Processors | yes | yes | yes |
| Automatic Parallelization of DO Loops | yes | yes | yes |
| Analysis and Parallelization over the entire Program Tree (Inter-procedural Analysis) | yes | no | no |
| Automatic Derivation of Data Mappings (for User-declared Arrays) | yes | no | no |
| Fortran 90 Feature Coverage | Closely aligned to only that Fortran 90 required for Subset HPF: significant omissions from F90 are free-form syntax, MODULEs, derived types, pointers, data type parameters (KIND) | Full Fortran 90, but there are limitations on recursion, character arrays, visibility control and use of some objects defined in modules, and the parallelization of programs using arrays of derived type or using pointers | Full Fortran 90, but codes that use pointers, ENTRY statements, sub-objects as internal files, and many intrinsics that have a DIM argument if the argument is non-constant cannot be compiled as HPF codes (however they may be compiled as HPF_LOCAL or HPF_SERIAL codes) |
| Fortran I/O Handling | Performed by "Processor 0" | Performed by "Processor 0" in both HPF and HPF_LOCAL code; uses PGI-supplied data conversion support | Performed by "Processor 0" in HPF, by all processors in HPF_LOCAL |
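Automatic parallelization of a DO loop depends on whether its iterations can run in any order without changing the result. As a rough illustration only, the Python sketch below executes a loop body in forward and in reversed iteration order and compares the outcomes. The reviewed compilers rely on static dependence analysis rather than on running the loop, so this is not how any of them works. The helper `order_independent` and the two example loop bodies are hypothetical.

```python
def run(body, data, order):
    """Apply body(a, i) for each i in the given order to a copy of data."""
    a = list(data)
    for i in order:
        body(a, i)
    return a


def order_independent(body, data, indices):
    """True if forward and reversed iteration orders give the same result.

    This is only a dynamic spot check on one input and one reordering;
    it cannot prove that a loop is free of loop-carried dependences.
    """
    idx = list(indices)
    return run(body, data, idx) == run(body, data, reversed(idx))


def scale(a, i):
    # a(i) = 2 * a(i): each iteration touches only its own element.
    a[i] = 2 * a[i]


def recurrence(a, i):
    # a(i) = a(i-1) + a(i): iteration i reads what iteration i-1 wrote.
    a[i] = a[i - 1] + a[i]


data = [1, 2, 3, 4, 5]
print(order_independent(scale, data, range(1, 5)))       # True
print(order_independent(recurrence, data, range(1, 5)))  # False
```

The second loop carries a dependence from each iteration to the next, so a compiler cannot distribute its iterations across processors without restructuring it.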


2.5 Tools and Extensions

Our experience at CTC with HPF indicates that performance tuning (including discovering for which parts of a code the compiler has actually generated effective parallelism) is more laborious than correctness debugging, regardless of the HPF product. Thus, even the nature of a compiler report indicating parallel execution (or not) is an important "tool". The following table indicates some of the facilities, tools, and extensions that bear on those issues (and a few more minor ones).

| Tool or Extension | xHPF | pghpf | XL HPF |
|---|---|---|---|
| Parallel Compilation Report | Three levels available in listing file: summary report, annotated call and loop tree, annotated HPF source listing; also summary available to standard output | Summary report of parallel execution compilation (indicated as "FORALL") available to standard output; also commented insertions indicating "FORALL" compilation available in preserved intermediate Fortran file | Two levels available in listing file: messages related to original source line numbers indicating absence of parallel execution, pseudo-Fortran source with the same messages inserted |
| Instrumented Run Time for Performance Analysis; Display Tool | "-otpf" inserts time-recording calls at each procedure entry/exit and at head and tail of each loop: execution generates time information; postprocessor polytime generates annotated call-tree after run with times attributed to user computation, latency-plus-bandwidth-attributed communication time, communication wait time, and RTP overhead; an Xwindow GUI (FORGExplorer/DMP) is available to sort report rows per any column and to display the relevant source code lines | "-Mprof" inserts time-recording for either routine entry/exits or for each source-line; execution generates time information; postprocessor and Xwindow GUI visualizer pgprof displays information on time taken by each procedure executed or each line executed, numbers or sizes of messages sent or received at each line, and allows various sortings of the reports; "-Mstats" in compilation command and "-stat" in the execution command line reports CPU time, memory statistics, message-passing statistics at the end of the run. | "-pg" links prof/gprof-instrumented runtime (asynchronous sampled timing); postprocessor and Xwindow GUI xprofiler permits examination of CPU utilization by each procedure; also the parallel environment and message-passing support has trace-file collection that can be visualized (using IBM PE VT tool) with a coordinated source-file browser in post-execution animation |
| Fortran Language Extensions | Limited Connection Machine Fortran (cmf) handling: LAYOUT as DISTRIBUTE, non-F90 argument order in CSHIFT/EOSHIFT/RESHAPE, ARRAY attribute as DIMENSION, DATA attribute keyword, spelling of some intrinsic names, RANK intrinsic | STRUCTURE/END STRUCTURE data type declaration, RECORD data instance declaration (uses STRUCTURE type), UNION and MAP keywords for STRUCTURE declarations; "Cray Pointer" syntax; nnn'O and nnn'X constants; limited Connection Machine Fortran (cmf) handling: non-F90 argument order in CSHIFT/EOSHIFT/RESHAPE, ARRAY attribute as DIMENSION, spelling of some intrinsic names, square brackets syntax in array constructor | XL HPF has extensive extensions beyond Fortran 90 that range from the lexical (e.g., $ allowed in identifiers) to an "Integer POINTER type" plus an associated "LOC(...)" intrinsic generating addresses plus an "integer pointer assignment" statement: the details are not highlighted in the XL HPF Language Reference but can be ascertained from the current IBM XL Fortran Language Reference manual [8] |
| Vendor-Specific Directives Extensions | Additional keywords in DISTRIBUTE: FULLBLOCK, SHRUNKBLOCK, FULLCYCLIC, SHRUNKCYCLIC; CHPF$ EXTRINSIC SAFE ...; "CAPR$"-form directives for more detailed control of automatic parallelization, communication scheduling, and decomposition: DO PAR [ON ...], DO NOPAR, IGNORE ALL INHIBITORS, IGNORE ... COM [ON ...], PARTITION ..., PARTITION_NOMOVE ... | Additional keyword clauses for INDEPENDENT (anticipates HPF 2): ON HOME ..., REDUCTION ... | @PROCESS allows "command-line" compiler options to have effect within source files; EJECT; ! ... SOURCEFORM ... |
| Integration with Other Tools | Interactive parallelization with FORGExplorer/DMP and xHPF can share identical program-analysis databases (see Performance Analysis report handling, above); all IBM SP PE tools work with generated FORTRAN 77 source (parallel debugger, trace analysis/visualization) | Run time launch of per-node debugger (e.g., dbx); all IBM SP PE tools work with generated FORTRAN 77 source (parallel debugger, trace analysis/visualization) | IBM SP PE tools work with HPF source: parallel debuggers see source statements but not yet distributed data, trace visualization can animate the HPF source during playback |


2.6 Subjective Evaluations

The experience of the author and colleagues at the CTC with the three compilers has led to a set of perceptions as to their relative strengths and weaknesses. This is clearly a subjective measure, and is colored strongly by the collection of codes that have been processed to date with these compilers. The following table summarizes these perceptions. It has been adapted from a Case Study presentation prepared at CTC and used extensively in training -- in particular, see the commentary in the Part 8 Discussion Layer, section 4.

| Feature | xHPF | pghpf | XL HPF |
|---|---|---|---|
| Style of Fortran Code Best Handled | FORTRAN 77 DO Loops with scalar array references and modest use of reductions and with conditionals only in support of reductions; deep program trees with parallelizable loop at any higher level | Fortran 90 data-parallel constructs; parallelism visible in single compilation units; extensive use of both HPF_LIBRARY intrinsics and Fortran 90 array elemental, manipulation, and reduction intrinsics | Fortran 90 data-parallel constructs; parallelism visible in single compilation units; extensive use of Fortran 90 array elemental, manipulation, and reduction intrinsics |
| Ability to Generate Parallel Code from DO Loops | Excellent where dependence analysis (performed on the entire program tree) can determine absence of loop-carried dependences | Excellent where an anchor is visible for data-parallel execution as indicated by a distributed array as the left-hand-side of an assignment | Excellent where an anchor is visible for data-parallel execution as indicated by a distributed array as the left-hand-side of an assignment |
| Ability to Recognize Privatizable Scalars and Reductions in Loops | Excellent, including cases with conditionals that implement reductions such as MAXVAL | Excellent, particularly with "auto-parallelization" command-line option and indicated distributed arrays in loops; also has extension of REDUCTION clause (anticipates HPF 2.0) for INDEPENDENT directive to handle "+", "*", ".or.", ".and.", ".neqv.", iand, ior, ieor, min, and max scalar reductions in loops | Excellent |
| Ability to Process Connection Machine Fortran (CMF) | Some automatic conversion and handling of LAYOUT directives and non-standard keywords and intrinsic spellings and argument order; severe limitation due to missing HPF_LIBRARY routines | Only CSHIFT, EOSHIFT, and RESHAPE with non-standard argument order, and the ARRAY keyword and non-standard array constructor syntax are handled; LAYOUT directives are not handled; "CMF-inspired" HPF_LIBRARY routines are available | No automatic handling of CMF; note absence of "CMF-inspired" HPF_LIBRARY computational routines |
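One row of the table refers to loops in which a conditional implements a reduction such as MAXVAL. The Python sketch below, offered only as an illustration, shows why such a loop can be parallelized once it is recognized: the running maximum is a scalar that each processor can keep privately over its own block of the array, and the partial results are combined afterwards. The contiguous block partitioning and the function names are assumptions made for the example and do not describe any compiler's generated code.

```python
def serial_max(a):
    """DO loop with a conditional that implements a MAXVAL-style reduction."""
    m = a[0]
    for x in a[1:]:
        if x > m:        # the conditional exists only to support the reduction
            m = x
    return m


def blocks(a, p):
    """Split a into at most p contiguous blocks of size ceil(len(a) / p)."""
    size = -(-len(a) // p)
    return [a[i:i + size] for i in range(0, len(a), size)]


def blocked_max(a, p):
    """Same reduction done as per-block partial maxima, then combined.

    Each block's running maximum plays the role of a privatized scalar.
    """
    partial = [serial_max(b) for b in blocks(a, p)]
    return serial_max(partial)


a = [3, 9, -2, 7, 12, 0, 5, 11, 4]
print(serial_max(a))        # 12
print(blocked_max(a, 4))    # 12
```

The two results agree because taking a maximum is associative and commutative, which is the property a compiler needs to establish before it can split such a loop across processors.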

Copyright © 1996




presberg@tc.cornell.edu
Last modified: Fri Jan 31, 1997