Parallel benchmark on multi-core CPUs

From Gerris

Revision as of 21:42, 13 October 2011

This benchmark uses the parallel Bénard–von Kármán Vortex Street example. The problem size is small, which makes good parallel performance difficult to reach.

popinet-new: Intel(R) Core(TM)2 Quad CPU Q9400 @2.66GHz, 64-bits

  • Ubuntu 10.04 LTS 64-bits
  • Linux popinet-new 2.6.32-34-generic #77-Ubuntu SMP Tue Sep 13 19:39:17 UTC 2011 x86_64 GNU/Linux
  • Gerris2D version 2011-10-13

MPI versions:

  • Open MPI 1.4.1-2

Open MPI

Image:balance-openmpi-new.png

#CPUs Relative speedup
1 1
2 (load-balanced) 2.3
4 (load-balanced) 3.27

fitzroy: IBM Power 575 4.7 GHz

Image:balance-fitzroy.png

#CPUs Relative speedup
1 1
2 (load-balanced) 1.92
4 (load-balanced) 2.32
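
The speedups in the two tables above can be turned into parallel efficiencies (speedup divided by number of CPUs), which makes the two machines easier to compare. The following is a minimal sketch: the dictionary layout and the function name `parallel_efficiency` are illustrative assumptions, and only the speedup values come from this page.

```python
# Relative speedups copied from the two tables above.
# Keys are CPU counts; the 2- and 4-CPU runs are the load-balanced ones.
speedups = {
    "popinet-new (Open MPI)": {1: 1.0, 2: 2.3, 4: 3.27},
    "fitzroy": {1: 1.0, 2: 1.92, 4: 2.32},
}


def parallel_efficiency(speedup, ncpus):
    """Return speedup divided by the CPU count (1.0 means ideal scaling)."""
    return speedup / ncpus


# Hypothetical helper structure: efficiency per machine and CPU count.
efficiencies = {
    machine: {n: parallel_efficiency(s, n) for n, s in table.items()}
    for machine, table in speedups.items()
}

for machine, table in efficiencies.items():
    for n, eff in table.items():
        print(f"{machine}: {n} CPUs, efficiency {eff:.2f}")
```

An efficiency above 1, as for two CPUs on popinet-new, corresponds to a superlinear speedup; the page does not state the cause.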

Parameter file

Large outputs and movie generation were turned off. The single-CPU parameter file is:

8 7 GfsSimulation GfsBox GfsGEdge {} {
  Time { end = 15 }
  Solid (x*x + y*y - 0.0625*0.0625)
  RefineSolid 6
  VariableTracer {} T
  Init {} { U = 1 }
  AdaptVorticity { istep = 1 } { maxlevel = 6 cmax = 1e-2 }
  AdaptGradient { istep = 1 } { maxlevel = 6 cmax = 1e-2 } T
  SourceViscosity 0.00078125
  EventBalance { istep = 1 } 0.1
  OutputTime { istep = 10 } stderr
  OutputTime { istep = 1 } balance
  OutputBalance { istep = 1 } balance
  OutputProjectionStats { istep = 10 } stderr
  OutputTiming { start = end } stderr
  OutputSimulation { start = end } end.gfs
}
GfsBox {
  left = Boundary {
    BcDirichlet U 1
    BcDirichlet T { return y < 0. ? 1. : 0.; }
  }
}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox { right = BoundaryOutflow }
1 2 right
2 3 right
3 4 right
4 5 right
5 6 right
6 7 right
7 8 right
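
As a rough check on how small the problem is, a few derived quantities can be computed from the parameter file above. This sketch assumes that each GfsBox has unit side length, that the Reynolds number is based on the cylinder diameter and the inflow velocity, and that the cell count is an upper bound reached only if every box were refined uniformly to `maxlevel`. The variable names are illustrative and not part of Gerris.

```python
# Values read from the parameter file above.
radius = 0.0625          # Solid (x*x + y*y - 0.0625*0.0625)
inflow_velocity = 1.0    # Init {} { U = 1 } and BcDirichlet U 1
viscosity = 0.00078125   # SourceViscosity 0.00078125
max_level = 6            # maxlevel = 6 in AdaptVorticity / AdaptGradient
n_boxes = 8              # first number on the GfsSimulation line

# Assumption: Reynolds number based on cylinder diameter.
diameter = 2 * radius
reynolds = inflow_velocity * diameter / viscosity

# Assumption: unit-size boxes, so a box refined to level L has 2**L cells
# per side. With adaptive refinement the actual count is lower.
cells_per_box_side = 2 ** max_level
max_cells = n_boxes * cells_per_box_side ** 2

print(f"Reynolds number: about {reynolds:.0f}")
print(f"Upper bound on cell count: {max_cells}")
```

Under these assumptions the Reynolds number is about 160 and the mesh has at most 32768 cells, so on four CPUs each process handles only a few thousand cells.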
