IOSTATS(1)    

NAME
       IOstats  -  Produce  report  from I/O detail trace records
       about I/O operations by request type

SYNOPSIS
       IOstats  [-processor  number]  [-fileID  id]   [-paramFile
       params]
            [-browserFile browseOut] tracefile

DESCRIPTION
       IOstats  generates  a  report  of application I/O activity
       summarized by I/O request type  from  detailed  I/O  trace
       records  in  the  input  SDDF  trace file.   The necessary
       trace event records are produced by the I/O  extension  to
       the Pablo trace library by default whenever the I/O exten-
       sion has been initialized and individual  I/O  calls  have
       been instrumented.

       As IOstats is running, it periodically displays the number
       of input trace packets  (records)  processed  to  standard
        error.  The report output is directed to standard output.
       Several paragraphs of text describing the report  contents
       are  included  after  the actual I/O activity information.
       An expanded version of the descriptive text is included in
       the section "THE REPORT" below.
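
        For example (the file names here are only illustrative),  a
        typical invocation saves the report in a file  while  the
        progress comments still appear on the terminal:

               IOstats myapp.trace > myapp.report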

       If  the  input  file does not include any I/O detail trace
       records, the report will be generated  without  error  but
       will  contain  all  zeros.   For  the  report  to  work as
       intended, there must be I/O detail trace  records  in  the
       input  SDDF file.  If the programmer elects not to collect
       I/O detail trace records for the entire application execu-
       tion, the report must be interpreted with that in mind.

OPTIONS
       -processor number
              Only  I/O  activity occurring on the specified pro-
              cessor is included in the report.  A  comment  con-
              taining the processor number is written to standard
              error when IOstats begins execution. The  processor
              number  is also included in the report itself after
              the heading "Reported Processor:".

              The default behavior is  to  include  I/O  activity
              from all processors.
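
               For example, to report only the I/O activity traced
               on processor 3 (the file names are illustrative):

                      IOstats -processor 3 myapp.trace > report.p3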

       -fileID id
              Only  I/O  activity  associated  with the specified
              file, as identified  by  id,  is  included  in  the
              report.   A comment containing the file ID is writ-
              ten to standard error when  IOstats  begins  execu-
              tion.  The  file ID and file name are also included
               in the report itself after the  headings  "Reported
               File ID:" and "Reported File Name:".   If  a  file
               was opened more than once by the application, more
               than one ID may be associated with it in the trace
               records; running the SyncIOfileIDs(1)  program  on
               the trace file (see "BEFORE RUNNING  THE  PROGRAM"
               below) will synchronize the IDs  and  avoid  this
               file ID to file name mismatch problem.

              The default behavior is to include I/O activity for
              all files in the report.

       -paramFile params
              A  parameter file, params,  may be used to override
              the default configurations for the  minimum,  maxi-
              mum,  and bin sizes found in the report histograms.
              A comment containing the name of the parameter file
              is  written  to  standard error when IOstats begins
              execution.

              This option allows the user to tune  the  histogram
              parameters  to  suit  the  characteristics of their
              input trace data.  See the section "USING A PARAME-
              TER  FILE" below for more details on the format and
              contents of the parameter file.
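
               For example, with an illustrative  parameter  file
               name of myIO.params:

                      IOstats -paramFile myIO.params myapp.trace > report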

       -browserFile browserOut
              When specified, this option causes IOstats to  pro-
              duce  an SDDF file, browserOut, containing a subset
              of the reported information in a format that can be
              loaded  and  viewed with the Pablo Browser program.
              See "THE BROWSER DATA" section below for details on
              the information written to the SDDF file.
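
               For example, to write the browser data to  a  file
               named (illustratively) myapp.browser while sending
               the report to standard output as usual:

                      IOstats -browserFile myapp.browser myapp.trace > report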

               Note that the Pablo Browser program is currently not
              distributed outside the Pablo research group due to
              limited portability across platforms.

BEFORE RUNNING THE PROGRAM
       The  application  to  be studied must be instrumented with
       the I/O extension to the Pablo trace  library  and  should
       NOT call the function disableIOdetail().  The instrumented
       application is run and one or more trace files are  gener-
       ated  - the number of trace files depends on the number of
       processors used to run the application.

       If there are multiple trace files, they should  be  merged
       with  the  MergePabloTraces(1) program to produce a single
       trace file for the execution  including  information  from
       all processors.  It is possible to run IOstats on unmerged
       trace files, but the output will only contain  information
       for the single processor reported in the input trace file.

       If you intend to use the -fileID  option  to  report  only
       accesses to individual files, the program SyncIOfileIDs(1)
       should be run on the input trace file to  synchronize  IDs
       for  files that were opened more than once by the applica-
       tion.  The  file  generated  by  SyncIOfileIDs  ending  in
       ".syncFiles"  can  be  used as input to IOstats.  The file
       ending in ".syncFiles.map" can be  used  to  identify  the
       file  ID  of  the particular file whose I/O you are inter-
       ested in viewing.
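
        For example, assume the original trace file is myapp.trace
        and SyncIOfileIDs produced the files myapp.trace.syncFiles
        and myapp.trace.syncFiles.map (these names are only illus-
        trative).  If the .map file shows that the file of  inter-
        est has ID 5, the per-file report could be generated with

               IOstats -fileID 5 myapp.trace.syncFiles > report.file5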

       If I/O detail trace records were not produced for all  I/O
       operations  in  the application, either because some calls
       were not instrumented or because detail tracing  was  dis-
       abled  for  part of the execution, the report will include
       only those operations for which detail records  were  pro-
       duced and should be interpreted with that in mind.

THE REPORT
       The  report summarizes information from detailed I/O event
       trace records.  Every I/O operation for  the  processor(s)
       and  file(s)  in  the input tracefile that is not filtered
       out by a command line  -fileID  or  -processor  switch  is
       treated  separately  -  there  is  no attempt to correlate
       operations that  take  place  concurrently  on  individual
       nodes  of  a multiprocessor system.  Said another way, the
       processor and file  ID  information  in  the  trace  event
       records is used for filtering purposes, but is not consid-
       ered beyond that as the statistics are generated for  this
       report.

       The  First  and Last I/O Operation timestamps are based on
       I/O requests from every processor included in the  report.
       These  values  give a general feel for the wall-clock time
       over which the I/O reporting takes place.

       The report is divided into  sections  by  operation  type:
       Open,  Close,  Read, Seek, Write, Flush, and Miscellaneous
       I/O.  In addition, for traces gathered  on  Intel  Paragon
       systems,  there  are  sections  for  the operations Global
       Open, Asynchronous Read, Asynchronous Write, IO  Wait  and
       IO  Done,  IO  Mode,  and  Lsize.  The last section of the
       report gives totals for All I/O Operations reported  indi-
       vidually in the earlier sections.

       The  information  reported  for  the  individual operation
       types in their respective sections is  very  similar.   An
       operation  count  and total time is given in each section,
       along with the mean, standard deviation,  variance,  mini-
       mum,  maximum,  and  histogram for the operation durations
       (in seconds) and the time between operations of  the  same
       type  (also  in  seconds).  For Read and Write operations,
       additional  information  about  bytes  transferred,  bytes
       transferred over total time, and statistical summaries for
       Bytes Transferred and  Bytes/Second  based  on  individual
       requests  are  provided.   The Seek operation section con-
       tains a statistical summary  of  bytes  traversed.    When
       asynchronous  operations  are  used, the bytes transferred
       are reported in the  Asynchronous  Read  and  Asynchronous
       Write  sections  of  the output.  The following paragraphs
       explain further the meaning of the reported values.

       The operation count reports the number of times an  opera-
       tion  occurred  on  any  processor included in the report.
       For example, say the source program issued a  single  read
       request  which  is  executed  on  16  nodes  of a parallel
       machine.  The report will show a count of 16  read  opera-
       tions  when  run  on a file with trace data collected from
       all 16 nodes.  That is, each I/O request on each processor
       is  treated  separately  even  though a single call in the
       program source caused the "multiple" requests.

        Following this line of reasoning, sections  for  operations
       which  involve  byte transfer (reads, writes, asynchronous
       reads, asynchronous writes) report  the  total  number  of
       bytes  involved  on  any processor included in the report.
        Say our previous read operation requested 4 bytes of data;
        since it executed on 16 nodes, ( 4 * 16 ) = 64 bytes  were
        transferred by that read request.

       When operation durations are reported, each  operation  on
       each  processor  is  again treated separately, even though
       they may overlap in wall-clock time.  Again using our read
       example,  let's say all processors start reading at time n
       seconds and finish at time n+2 seconds.   The  total  time
       for  the  read operations would be ( 16 * (n+2 - n) ) = 32
       seconds.  The mean read duration would  be  2,  and  there
       would  be  no deviation from the mean.  If our program did
       very little except this read, it's quite possible that  in
       wall-clock  time the total program execution may take less
       than 32 seconds.

       The "Bytes Xferred/Total Time" for this example is ( 64  /
       32  ) = 2 bytes/second.  Note that this is not the machine
       throughput which would be ( 64 / 2 )  =  32  bytes/second.
       (This throughput value is not included in the report.)

       The report also gives the bytes/second based on individual
       read requests.  For our read example,  every  request  was
       for 4 bytes and took 2 seconds so the mean is 2 bytes/sec-
       ond and there is no deviation from that.

       Let's change our example a bit and  say  each  read  still
       requested  4  bytes  of  data,  but  8 nodes finished in 2
       seconds and 8 finished in 4 seconds.  The total time would
       be  (  8*2  +  8*4 ) = 48 seconds and the "Bytes Xferred /
       Total Time" is ( 64 / 48 )  =  1.333  bytes/second.   Here
       machine  throughput might be calculated as ( 64 / 4 ) = 16
        bytes/second.  For the modified example, the  bytes/second
        based  on  individual read requests would be ( 4 / 2 ) = 2
        bytes/second for 8 nodes and ( 4 / 4 ) = 1 byte/second for
        8 nodes, yielding a mean of 1.5, a standard  deviation  of
        .516, and a variance of .267.  (These are  sample  statis-
        tics over the 16 individual requests: each request differs
        from the 1.5 mean by .5, so the sample variance is ( 16  *
        .25 ) / 15 = .267 and the standard deviation  is  its
        square root, .516.)

       Now let's change the example  further  and  say  that  the
       reads  do  not  begin  at  the  same time on each node but
       instead there is a 3 second delay  between  the  start  of
        each read.  The cryptic ASCII diagram below  attempts  to
       show this scenario... nodes are labeled 0  through  f  and
       appear  under  the timeline where they would be performing
       their read operations.

                        1         2         3         4
        Time  0123456789012345678901234567890123456789012345678
        Nodes 00 11 22 33 44 55 66 77 8888  aaaa  cccc  eeee
                                         9999  bbbb  dddd  ffff

        Here we see that 48 seconds elapse  on  the  wall-clock
        while the reads take place.  But no  reads  are  taking
        place during eight of those seconds (2, 5, 8, ...),  and
       the  machine  throughput is open to interpretation.  Do we
       say the throughput is ( 64 / 48 ) = 1.333 bytes/second, or
       should  the idle seconds be subtracted out yielding ( 64 /
       (48 - 8 ) ) = 1.6 bytes/second, or should we somehow  par-
       tition  the  requests where multiple nodes are doing reads
       separately?  We really don't know why  the  read  requests
       are  offset by 3 seconds - should that offset be "charged"
       to the I/O system?  Difficulties such as  these,  combined
       with  the  inability  to reconstruct from a trace file the
       program source that generated read operations across  mul-
       tiple  nodes  highlight  why no machine throughput numbers
       appear on the report.

       Another statistic given for each reported I/O operation is
       the  time  between  XXX  (where XXX is some I/O operation)
       operations  in  seconds.   When  multiple  processors  are
       included  in  the  report,  this value represents the time
       between any XXX operation on any node that is included  in
        the report.  In the first  and  second  examples
       above where all reads began at the same time on all nodes,
       the  mean  would  be  0.  In the third example, where each
       read began 3 seconds after the read on another  node,  the
       mean  would  be  3  seconds.  None of the examples deviate
       from the mean.

       Finally, for systems which include Asynch Read  and  Write
       information,  the  durations  reported  are  for the asyn-
       chronous read or  asynchronous  write  calls  and  do  not
       include the actual time required to transfer the requested
       bytes.

USING A PARAMETER FILE
       The IOstats program has default  configuration  parameters
       controlling  the minimums, maximums, and bin sizes used in
       the histograms included in the report.  The user may over-
       ride  these  defaults for some or all of the histograms by
       using the -paramFile command line option.

       The parameter file is an ASCII  SDDF  file  whose  records
       provide  histogram  configuration  information  which will
       override the minimum, maximum, and bin size defaults  com-
       piled  into  the  IOstats  executable.   All configuration
       records have the name "initialize IO  histogram"  and  the
       value  of  the first field "statistic" controls which his-
       togram to configure.  A sample parameter file follows:

       SDDFA
       #1:
       // "IOstats" "Histogram initialization"
       "initialize IO histogram" {
         // "Recognized Statistics"       "Reported as"
         // "time between IO"             "Time between I/O Operations"
         // "open durations"              "Open Operation Durations"
         // "time between opens"          "Time between Open Operations"
         // "close durations"             "Close Operation Durations"
         // "time between closes"         "Time between Close Operations"
         // "read durations"              "Read Operation Durations"
         // "time between reads"          "Time between Read Operations"
         // "read sizes"                  "Read Operation Bytes Transferred"
         // "read throughput"             "Bytes/Second based on individual Read
s"
         // "seek durations"              "Seek Operation Durations"
         // "time between seeks"          "Time between Seek Operations"
         // "seek sizes"                  "Seek Operation Bytes Traversed"
         // "write durations"             "Write Operation Durations"
         // "time between writes"         "Time between Write Operations"
         // "write sizes"                 "Write Operation Bytes Transferred"
         // "write throughput"            "Bytes/Secnd based on individual Write
s"
         // "flush durations"             "Flush Operation Durations"
         // "time between flushes"        "Time between Flush Operations"
         // "misc i/o durations"          "Misc I/O Operation Durations"
         // "time between misc i/o"       "Time between Misc I/O Operations"
         // "global open durations"       "Global Open Operation Durations"
         // "time between global opens"   "Time between Global Open Operations"
         // "asynch read durations"       "Asynch Read Operation Durations"
         // "time between asynch reads"   "Time between Asynch Read Operations"
         // "asynch read sizes"           "Asynch Read Operation Bytes Transf...
"
         // "asynch write durations"      "Asynch Write Operation Durations"
         // "time between asynch writes"  "Time between Asynch Write Operations"
         // "asynch write sizes"          "Asynch Write Operation Bytes Trans...
"
         // "iowait durations"            "IO Wait and IO Done Operation Dura...
"
         // "time between iowaits"        "Time between IO Wait and IO Done O...
"
         // "iomode durations"            "IO Mode Operation Durations"
         // "time between iomodes"        "Time between IO Mode Operations"
         // "lsize durations"             "Lsize Operation Durations"
         // "time between lsizes"         "Time between Lsize Operations"
         char    "statistic"[];
         // "Minimum" "Minimum bin value"
         double  "minimum";
         // "Maximum" "Maximum bin value"
         double  "maximum";
         // "Bin Size" "Size of bin"
         double  "bin size";
       };;

       "initialize IO histogram" {
         [30] { "read sizes" },
         0, 1024, 256
         };;

       "initialize IO histogram" {
         [30] { "write sizes" },
         1024, 10240, 1024
         };;

       This parameter file contains two data records which  over-
       ride the default values for two of the histograms produced
       by the IOstats program.  First, a minimum value  of  0,  a
       maximum  value of 1024, and a bin size of 256 will be used
       in the  histogram  "Read  Operation  Bytes  Transferred:".
       Second, a minimum value of 1024, a maximum value of 10240,
       and a bin size of 1024  will  be  used  in  the  histogram
       "Write Operation Bytes Transferred:".

       The  bin  size  should divide evenly into (maximum - mini-
       mum), and ( (maximum - minimum) / bin size  )  yields  the
       number  of "bounded" bins appearing in the histogram.   In
       addition to the "bounded" bins, there  are  underflow  and
       overflow bins to report values that fall outside the mini-
       mum/maximum range selected.
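
        For example, the "read sizes" record in the sample parame-
        ter file above yields ( (1024 - 0) / 256 ) = 4  "bounded"
        bins, plus the underflow and overflow bins.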

       The comments for the field "statistic" in  the  descriptor
       for  the  "initialize  IO  histogram"  record  contain the
       strings that should be used in the data records  to  over-
       ride all possible histogram default configurations.
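
        For example, a data record of the same form could be added
        to override the "Read Operation Durations" histogram;  the
        minimum, maximum, and bin size values below are chosen
        only for illustration:

        "initialize IO histogram" {
          [30] { "read durations" },
          0, 2, 0.5
          };;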

THE BROWSER DATA
       The  browser  SDDF  file  contains histogram data for each
       type of I/O operation ( IO_OP_NAME =  Open,  Close,  Read,
       Seek,  Write,  Flush,  Misc I/O, Global Open, Asynch Read,
       Asynch Write, IO Wait and IO Done,  IO  Mode,  Lsize)  and
       summary information across all operation types.
              o IO_OP_NAME Operation Durations (seconds)
              o Time between IO_OP_NAME Operations (seconds)
              o Time between I/O Operations (seconds)
              o Read Operation Bytes Transferred:
              o Seek Operation Bytes Traversed:
              o Write Operation Bytes Transferred:
              o Bytes/Second based on individual Read Requests
              o Bytes/Second based on individual Write Requests
              o Count of I/O Operations
              o Total Time of I/O Operations
              o Mean Duration of I/O Operations

KNOWN PROBLEMS
       For  files  accessed in a global mode, the "Seek Operation
       Bytes Traversed:" will often be incorrect.  In particular,
       on  Intel  Paragon systems, the reported seek bytes should
       not be trusted for files with an iomode of M_LOG,  M_SYNC,
       M_RECORD,  or  M_GLOBAL.   The  I/O extension to the Pablo
       instrumentation library attempts to minimize the  overhead
       incurred  in  gathering file pointer information, and does
       not track file  pointer  positioning  correctly  when  the
       activity  on  one processor affects the file pointer posi-
       tion on another processor.  An attempt  will  be  made  to
       address this problem in the next release.

FILES
       /Templates/IOstats.parameters
              A  sample parameter file to use with the -paramFile
              option.

SEE ALSO
       FileRegionIOstats(1),  IOstatsTable(1),   IOtotalsByPE(1),
       MergePabloTraces(1), LifetimeIOstats(1), SyncIOfileIDs(1),
       TimeWindowIOstats(1)
       Ruth A. Aydt, A User's Guide to Pablo I/O Instrumentation
       Ruth A. Aydt, The Pablo Self-Defining Data Format

COPYRIGHT
       Copyright 1994-1996, The University of Illinois  Board  of
       Trustees.

AUTHOR
       Ruth A. Aydt, University of Illinois

Pablo Environment          Oct 16, 1996