Parallel benchmark on multi-core CPUs

From Gerris

(Difference between revisions)
Revision as of 22:58, 4 February 2010
Popinet (Talk | contribs)

Revision as of 21:42, 13 October 2011
Popinet (Talk | contribs)
(Updated with more recent openmpi and gerris versions)
Line 1: Line 1:
-This benchmark uses the the [http://gfs.sourceforge.net/examples/examples/cylinder.html#htoc5 parallel Bénard–von Kármán Vortex Street] example. Various implementations of MPI were tested, with and without load-balancing, on the following system:
+This benchmark uses the the [http://gfs.sourceforge.net/examples/examples/cylinder.html#htoc5 parallel Bénard–von Kármán Vortex Street] example. The problem size is small which makes good parallel performance difficult to reach.
-* Intel(R) Core(TM)2 Quad CPU Q9400 @2.66GHz, 64-bits
+= popinet-new: Intel(R) Core(TM)2 Quad CPU Q9400 @2.66GHz, 64-bits =
-* Ubuntu 9.10 64-bits
+
-* Linux popinet 2.6.31-17-generic #54-Ubuntu SMP Thu Dec 10 17:01:44 UTC 2009 x86_64 GNU/Linux
+* Ubuntu 10.04 LTS 64-bits
-* Gerris2D version 2010-01-29
+* Linux popinet-new 2.6.32-34-generic #77-Ubuntu SMP Tue Sep 13 19:39:17 UTC 2011 x86_64 GNU/Linux
+* Gerris2D version 2011-10-13
 MPI versions:
-; MPICH1 : 1.2.7-9.1ubuntu1 (packages mpich-shmem-bin, libmpich-shmem1.0-dev),
+; Open MPI : 1.4.1-2
-; MPICH2 : 1.2-1ubuntu1.1 (packages mpich2, libmpich2-dev, libmpich2-1.2),
+
-; OpenMPI :
+
-== MPICH1 ==
+== Open MPI ==
-[[Image:balance-mpich1.png]]
+[[Image:balance-openmpi-new.png]]
-== MPICH2 ==
+{| border="1"
+|-
+! #CPUs
+! Relative speedup
+|-
+| 1
+| 1
+|-
+| 2 (load-balanced)
+| 2.3
+|-
+| 4 (load-balanced)
+| 3.27
+|}
-[[Image:balance-mpich2.png]]
+= fitzroy: IBM Power 575 4.7 GHz =
-== Parameter file ==
+[[Image:balance-fitzroy.png]]
+
+{| border="1"
+|-
+! #CPUs
+! Relative speedup
+|-
+| 1
+| 1
+|-
+| 2 (load-balanced)
+| 1.92
+|-
+| 4 (load-balanced)
+| 2.32
+|}
+
+= Parameter file =
 Large outputs and movie generation were turned off, the single-CPU parameter file is:
Line 63: Line 92:
 7 8 right
 </pre>
+
+= See also =
+
+* [[Parallel benchmark]]
+* [[Parallel benchmark on other systems]]

Revision as of 21:42, 13 October 2011

This benchmark uses the parallel Bénard–von Kármán Vortex Street example. The problem size is small, which makes good parallel performance difficult to reach.

popinet-new: Intel(R) Core(TM)2 Quad CPU Q9400 @2.66GHz, 64-bits

  • Ubuntu 10.04 LTS 64-bits
  • Linux popinet-new 2.6.32-34-generic #77-Ubuntu SMP Tue Sep 13 19:39:17 UTC 2011 x86_64 GNU/Linux
  • Gerris2D version 2011-10-13

MPI versions:

Open MPI: 1.4.1-2

Open MPI

Image:balance-openmpi-new.png

#CPUs               Relative speedup
1                   1
2 (load-balanced)   2.3
4 (load-balanced)   3.27
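
To make the table easier to read, the relative speedups can be converted to parallel efficiency (speedup divided by the number of CPUs). The following is a minimal sketch; the names `SPEEDUPS` and `parallel_efficiency` are illustrative and not part of Gerris, and the values are simply copied from the table above.

```python
# Relative speedups copied from the Open MPI table for popinet-new.
SPEEDUPS = {1: 1.0, 2: 2.3, 4: 3.27}


def parallel_efficiency(speedup, n_cpus):
    """Return speedup / n_cpus; 1.0 corresponds to ideal linear scaling."""
    if n_cpus < 1:
        raise ValueError("n_cpus must be at least 1")
    return speedup / n_cpus


if __name__ == "__main__":
    for n_cpus, speedup in sorted(SPEEDUPS.items()):
        eff = parallel_efficiency(speedup, n_cpus)
        print(f"{n_cpus} CPUs: speedup {speedup:.2f}, efficiency {eff:.0%}")
```

With these numbers the efficiency is above 1 on 2 CPUs (2.3 / 2 = 1.15) and about 0.82 on 4 CPUs (3.27 / 4). The page does not say what causes the superlinear 2-CPU value.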

fitzroy: IBM Power 575 4.7 GHz

Image:balance-fitzroy.png

#CPUs               Relative speedup
1                   1
2 (load-balanced)   1.92
4 (load-balanced)   2.32
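
One possible way to summarise how quickly scaling degrades on fitzroy is the Karp-Flatt metric, which estimates an effective serial fraction from a measured speedup. This analysis is not part of the original benchmark; it is a sketch that only reuses the speedups from the table above, and the names `FITZROY_SPEEDUPS` and `karp_flatt` are hypothetical.

```python
# Relative speedups copied from the fitzroy table (1-CPU baseline omitted).
FITZROY_SPEEDUPS = {2: 1.92, 4: 2.32}


def karp_flatt(speedup, n_cpus):
    """Experimentally determined serial fraction e = (1/S - 1/p) / (1 - 1/p)."""
    if n_cpus < 2:
        raise ValueError("the metric is only defined for n_cpus >= 2")
    return (1.0 / speedup - 1.0 / n_cpus) / (1.0 - 1.0 / n_cpus)


if __name__ == "__main__":
    for n_cpus, speedup in sorted(FITZROY_SPEEDUPS.items()):
        e = karp_flatt(speedup, n_cpus)
        print(f"{n_cpus} CPUs: estimated serial fraction {e:.3f}")
```

For these values the estimate rises from roughly 0.04 on 2 CPUs to roughly 0.24 on 4 CPUs. A rising value suggests growing parallel overhead, which would be consistent with the remark that the problem size is small, although the page itself does not make this analysis.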

Parameter file

Large outputs and movie generation were turned off. The single-CPU parameter file is:

8 7 GfsSimulation GfsBox GfsGEdge {} {
  Time { end = 15 }
  Solid (x*x + y*y - 0.0625*0.0625)
  RefineSolid 6
  VariableTracer {} T
  Init {} { U = 1 }
  AdaptVorticity { istep = 1 } { maxlevel = 6 cmax = 1e-2 }
  AdaptGradient { istep = 1 } { maxlevel = 6 cmax = 1e-2 } T
  SourceViscosity 0.00078125
  EventBalance { istep = 1 } 0.1
  OutputTime { istep = 10 } stderr
  OutputTime { istep = 1 } balance
  OutputBalance { istep = 1 } balance
  OutputProjectionStats { istep = 10 } stderr
  OutputTiming { start = end } stderr
  OutputSimulation { start = end } end.gfs
}
GfsBox {
  left = Boundary {
    BcDirichlet U 1
    BcDirichlet T { return y < 0. ? 1. : 0.; }
  }
}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox {}
GfsBox { right = BoundaryOutflow }
1 2 right
2 3 right
3 4 right
4 5 right
5 6 right
6 7 right
7 8 right
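
As a sanity check on the physical setup, the Reynolds number implied by the parameter file can be worked out from three of its values. This is a sketch under stated assumptions: the cylinder diameter (twice the 0.0625 radius in the `Solid` line) is taken as the characteristic length, and the `SourceViscosity` value is treated as a kinematic viscosity with unit density. The function and constant names are illustrative only.

```python
# Values read from the parameter file above.
INFLOW_VELOCITY = 1.0             # Init {} { U = 1 } and BcDirichlet U 1
CYLINDER_RADIUS = 0.0625          # Solid (x*x + y*y - 0.0625*0.0625)
KINEMATIC_VISCOSITY = 0.00078125  # SourceViscosity 0.00078125 (assumed kinematic)


def reynolds_number(velocity, diameter, viscosity):
    """Re = U * D / nu, with the cylinder diameter as length scale (assumption)."""
    if viscosity <= 0:
        raise ValueError("viscosity must be positive")
    return velocity * diameter / viscosity


if __name__ == "__main__":
    re = reynolds_number(INFLOW_VELOCITY, 2 * CYLINDER_RADIUS, KINEMATIC_VISCOSITY)
    print(f"Re = {re:.1f}")
```

Under these assumptions Re = 1 × 0.125 / 0.00078125, which is approximately 160.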

See also

  • Parallel benchmark
  • Parallel benchmark on other systems
