Parallel benchmark on multi-core CPUs
From Gerris
This benchmark uses the parallel Bénard–von Kármán Vortex Street example. The problem size is small, which makes good parallel performance difficult to reach.
popinet-new: Intel(R) Core(TM)2 Quad CPU Q9400 @2.66GHz, 64-bits
- Ubuntu 10.04 LTS 64-bits
- Linux popinet-new 2.6.32-34-generic #77-Ubuntu SMP Tue Sep 13 19:39:17 UTC 2011 x86_64 GNU/Linux
- Gerris2D version 2011-10-13
MPI versions:
- Open MPI 1.4.1-2
Open MPI
| #CPUs | Relative speedup |
|---|---|
| 1 | 1 |
| 2 (load-balanced) | 2.3 |
| 4 (load-balanced) | 3.27 |
fitzroy: IBM Power 575 4.7 GHz
| #CPUs | Relative speedup |
|---|---|
| 1 | 1 |
| 2 (load-balanced) | 1.92 |
| 4 (load-balanced) | 2.32 |
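The relative speedups in the tables above can be converted to parallel efficiency (speedup divided by the number of CPUs), which makes the two machines easier to compare. The sketch below does this in Python. The function names and the `RESULTS` dictionary are illustrative, not part of Gerris. The Karp-Flatt serial fraction is an extra diagnostic that the original benchmark does not report.

```python
# Minimal sketch: derive parallel efficiency from the speedups in the tables.
# All names here are hypothetical helpers, not part of Gerris.

def parallel_efficiency(speedup, n_cpus):
    """Efficiency = speedup / number of CPUs (1.0 means ideal scaling)."""
    return speedup / n_cpus


def karp_flatt(speedup, n_cpus):
    """Experimentally determined serial fraction (Karp-Flatt metric).

    e = (1/S - 1/N) / (1 - 1/N); only defined for N >= 2.
    A negative value indicates superlinear speedup.
    """
    if n_cpus < 2:
        raise ValueError("Karp-Flatt metric needs at least 2 CPUs")
    return (1.0 / speedup - 1.0 / n_cpus) / (1.0 - 1.0 / n_cpus)


# Speedups copied from the two tables (#CPUs -> relative speedup).
RESULTS = {
    "popinet-new": {2: 2.3, 4: 3.27},
    "fitzroy": {2: 1.92, 4: 2.32},
}


def summarize(results):
    """Return (machine, n_cpus, speedup, efficiency, serial_fraction) rows."""
    rows = []
    for machine, runs in results.items():
        for n_cpus, speedup in sorted(runs.items()):
            rows.append((
                machine,
                n_cpus,
                speedup,
                parallel_efficiency(speedup, n_cpus),
                karp_flatt(speedup, n_cpus),
            ))
    return rows


if __name__ == "__main__":
    for machine, n_cpus, speedup, eff, frac in summarize(RESULTS):
        print(f"{machine:12s} N={n_cpus} S={speedup:.2f} "
              f"efficiency={eff:.3f} serial_fraction={frac:.3f}")
```

On four CPUs this gives an efficiency of about 0.82 for popinet-new and 0.58 for fitzroy. The two-CPU popinet-new result has an efficiency above 1, that is, the reported speedup is superlinear. The source does not explain this.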
Parameter file
Large outputs and movie generation were turned off. The single-CPU parameter file is:
8 7 GfsSimulation GfsBox GfsGEdge {} {
  Time { end = 15 }
  Solid (x*x + y*y - 0.0625*0.0625)
  RefineSolid 6
  VariableTracer {} T
  Init {} { U = 1 }
  AdaptVorticity { istep = 1 } { maxlevel = 6 cmax = 1e-2 }
  AdaptGradient { istep = 1 } { maxlevel = 6 cmax = 1e-2 } T
  SourceViscosity 0.00078125
  EventBalance { istep = 1 } 0.1
  OutputTime { istep = 10 } stderr
  OutputTime { istep = 1 } balance
  OutputBalance { istep = 1 } balance
  OutputProjectionStats { istep = 10 } stderr
  OutputTiming { start = end } stderr
  OutputSimulation { start = end } end.gfs
}
GfsBox {
  left = Boundary {
    BcDirichlet U 1
    BcDirichlet T { return y < 0. ? 1. : 0.; }
  }
}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox { right = BoundaryOutflow }
1 2 right
2 3 right
3 4 right
4 5 right
5 6 right
6 7 right
7 8 right
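A few derived quantities help put the parameter file above in context, in particular why the problem counts as small. The Python sketch below computes the Reynolds number, the finest cell size, and an upper bound on the cell count. It rests on assumptions not stated in the source: each `GfsBox` has unit side length, the density is 1 (so the `SourceViscosity` value acts as a kinematic viscosity), and the inflow velocity is the `U = 1` set by `Init` and `BcDirichlet`. All Python names are illustrative.

```python
# Minimal sketch: quantities derived from the parameter file above.
# Constants are copied from the file; the helper names are hypothetical.

CYLINDER_RADIUS = 0.0625   # Solid (x*x + y*y - 0.0625*0.0625)
INFLOW_VELOCITY = 1.0      # Init {} { U = 1 } and BcDirichlet U 1
VISCOSITY = 0.00078125     # SourceViscosity 0.00078125 (unit density assumed)
MAX_LEVEL = 6              # RefineSolid 6 and maxlevel = 6
N_BOXES = 8                # first number of the "8 7 GfsSimulation ..." header

# Assumption (not stated in the source): every GfsBox has side length 1.
BOX_SIZE = 1.0


def reynolds_number(velocity, diameter, viscosity):
    """Re = U * D / nu, based on the cylinder diameter."""
    return velocity * diameter / viscosity


def finest_cell_size(box_size, max_level):
    """Each refinement level halves the cell size, starting from one box."""
    return box_size / 2 ** max_level


def derived_quantities():
    """Collect the derived quantities in a dictionary."""
    diameter = 2.0 * CYLINDER_RADIUS
    cell = finest_cell_size(BOX_SIZE, MAX_LEVEL)
    return {
        "reynolds": reynolds_number(INFLOW_VELOCITY, diameter, VISCOSITY),
        "cell_size": cell,
        "cells_per_diameter": diameter / cell,
        "domain_length": N_BOXES * BOX_SIZE,
        # Upper bound in 2D: every box uniformly refined to MAX_LEVEL.
        # Adaptive refinement keeps the actual count below this.
        "max_cells": N_BOXES * 4 ** MAX_LEVEL,
    }


if __name__ == "__main__":
    for name, value in derived_quantities().items():
        print(f"{name}: {value}")
```

Under these assumptions the Reynolds number is about 160, the cylinder diameter spans 8 of the finest cells, and the mesh has at most 32768 cells. With so few cells, each of four CPUs would hold only a few thousand.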



