Cluster Performance

Performance measurements were obtained with the High-Performance Linpack (HPL) benchmark.

Performance

Rack #   # of Nodes     # of Cores     DP FP factor     Clock Speed     R_peak*
0                30  x           2  x             2  x     1.80 GHz  =    216.0
1                15  x           4  x             2  x     2.40 GHz  =    288.0
2                32  x           8  x             4  x     2.33 GHz  =   2385.9
Total            77            376                                       2889.9

* R_peak and R_max are values in GFLOPS. R_max has been scaled from tests that did not use the full cluster and is not representative. R_peak is the theoretical peak: the number of nodes multiplied by the cores per node, the double-precision floating-point operations per cycle (DP FP factor), and the clock speed.
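The R_peak arithmetic in the table can be checked with a short script. This is a minimal sketch: the rack figures are transcribed from the table above, while the names `RACKS`, `r_peak_gflops`, and `summarize` are made up for illustration.

```python
# Rack data transcribed from the table above:
# (rack, nodes, cores per node, DP FP factor, clock speed in GHz)
RACKS = [
    (0, 30, 2, 2, 1.80),
    (1, 15, 4, 2, 2.40),
    (2, 32, 8, 4, 2.33),
]


def r_peak_gflops(nodes, cores_per_node, dp_fp_factor, clock_ghz):
    """Theoretical peak of one rack in GFLOPS."""
    return nodes * cores_per_node * dp_fp_factor * clock_ghz


def summarize(racks):
    """Return (total nodes, total cores, total R_peak in GFLOPS)."""
    total_nodes = sum(nodes for _, nodes, _, _, _ in racks)
    total_cores = sum(nodes * cores for _, nodes, cores, _, _ in racks)
    total_peak = sum(r_peak_gflops(*rack[1:]) for rack in racks)
    return total_nodes, total_cores, total_peak


if __name__ == "__main__":
    for rack in RACKS:
        print(f"rack {rack[0]}: {r_peak_gflops(*rack[1:]):.1f} GFLOPS")
    nodes, cores, peak = summarize(RACKS)
    print(f"total: {nodes} nodes, {cores} cores, {peak:.1f} GFLOPS")
```

The totals row of the table corresponds to the three values returned by `summarize`.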

Settings

320 cores, 1 GB memory per core
Optimal performance at 80% memory usage, with 8 bytes per double-precision matrix entry:
#entries per side = sqrt(#cores * mem * 0.8 / 8) = 185 kentries (with 1 GB taken as 2^30 bytes)

NB = 320
N = 320 * x = y
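The sizing rule above can be sketched in a few lines of Python. The function names are hypothetical, and rounding N down to a multiple of NB is an assumption (a common convention), since the values of x and y are not given here.

```python
import math

BYTES_PER_DOUBLE = 8  # one double-precision matrix entry


def hpl_problem_size(cores, mem_per_core_bytes, fraction=0.8):
    """Largest N for which an N x N matrix of doubles fits in
    `fraction` of the total memory."""
    usable_bytes = cores * mem_per_core_bytes * fraction
    return math.isqrt(int(usable_bytes // BYTES_PER_DOUBLE))


def round_down_to_multiple(n, nb):
    """Round n down to the nearest multiple of the block size nb
    (assumed convention, not stated in the text)."""
    return (n // nb) * nb


if __name__ == "__main__":
    # 320 cores with 1 GB (taken as 2**30 bytes) per core, 80% usage
    n = hpl_problem_size(320, 2**30)
    print("N upper bound:", n)
    print("N rounded to a multiple of NB=320:", round_down_to_multiple(n, 320))
```

Keeping the memory fraction as a parameter makes it easy to try smaller problem sizes if the nodes start swapping.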


HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
1            # of problems sizes (N)
40000        Ns
3            # of NBs
44 88 132    NBs
1            # of process grids (P x Q)
2            Ps
8            Qs
16.0         threshold
3            # of panel fact
0 1 2        PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
8            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
2            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
80           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
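To see what the input file above asks HPL to do, the sketch below enumerates the parameter combinations it describes. It assumes that HPL runs one test per combination of the listed N, NB, process grid, PFACT, NBMIN, NDIV, RFACT, BCAST, and DEPTH values. The dictionary `PARAMS` and the function `enumerate_runs` are illustrative names, not part of HPL.

```python
from itertools import product

# Values transcribed from the input file above.
PARAMS = {
    "N": [40000],
    "NB": [44, 88, 132],
    "grid": [(2, 8)],  # (P, Q)
    "PFACT": [0, 1, 2],
    "NBMIN": [8],
    "NDIV": [2],
    "RFACT": [2],
    "BCAST": [1],
    "DEPTH": [1],
}


def enumerate_runs(params):
    """Return one dict per combination of the listed parameter values."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]


if __name__ == "__main__":
    runs = enumerate_runs(PARAMS)
    p, q = PARAMS["grid"][0]
    print("number of test runs:", len(runs))
    print("MPI processes needed (P x Q):", p * q)
    for run in runs:
        print(run["NB"], run["PFACT"])
```

Under that assumption, three block sizes times three panel factorizations give nine runs on a 2 x 8 grid, which is a convenient way to compare NB and PFACT choices before committing to a full-size run.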
