Brick Library 0.1
Performance-portable stencil datalayout & codegen
Test cases are categorized by folder:
Single node (\single), for different compute models:
  single-cpu
  single-cuda
  single-opencl
  single-sycl
  single-mpi (requires MPI for layout benchmark)
Weak scaling (\weak):
  weak-cpu
  weak-cuda
Strong scaling (\strong):
  strong-cpu
  strong-cuda
The following describes each case.
All single-node cases compile into <build_dir>/single/* as executables without the single- prefix; for example, single-cpu is built as <build_dir>/single/cpu.
Each single-node experiment accepts no command-line arguments. To change the stencil being computed, refer to the corresponding file in /stencils. To change the domain size N, where the total domain is $N^3$, modify the macro #define N 64 in /stencils/stencils.h.
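As a rough illustration of how the macro relates to the problem size, the sketch below computes the number of grid points in the cubic domain. Only `#define N 64` comes from the text; the helper `total_points` is hypothetical and is not part of the library.

```cpp
#include <cassert>
#include <cstddef>

// Value quoted in the text for /stencils/stencils.h.
#define N 64

// Hypothetical helper (not part of the library): number of grid points
// in a cubic n x n x n domain.
std::size_t total_points(std::size_t n) {
    assert(n > 0);
    return n * n * n;
}
```

With N set to 64 this gives 262144 grid points; doubling N to 128 multiplies the total by eight.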
Weak scaling supports either a fixed domain per node or a fixed global domain decomposed into one subdomain per node. The cases compile into <build_dir>/weak/* as executables without the weak- prefix; for example, weak-cpu is built as <build_dir>/weak/cpu.
Each weak scaling experiment supports the following command-line arguments:
-d Int,Int,Int    set the global domain size
-s Int,Int,Int    set the per-rank subdomain size
-I Int            number of iterations to average over
-b                reduce the number of ranks to a power of 2
For example, mpirun -np 4 <build_dir>/weak/cpu -d 512,512,512 will distribute a domain of $512^3$ across 4 MPI ranks.
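The two less obvious pieces of the argument handling can be sketched as follows. Both helpers are hypothetical and are not taken from the library: `parse_triple` reads an `Int,Int,Int` value such as the one passed to `-d` or `-s`, and `floor_pow2` shows one plausible reading of `-b`, assuming it means "use the largest power of two that does not exceed the number of launched ranks".

```cpp
#include <array>
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: parse an "Int,Int,Int" argument such as "512,512,512".
std::array<int, 3> parse_triple(const std::string &arg) {
    std::array<int, 3> dims{};
    std::istringstream in(arg);
    std::string field;
    for (int i = 0; i < 3; ++i) {
        bool ok = static_cast<bool>(std::getline(in, field, ','));
        assert(ok && "expected three comma-separated integers");
        (void)ok;  // silence unused-variable warnings when NDEBUG is set
        dims[i] = std::stoi(field);
    }
    return dims;
}

// Hypothetical helper: largest power of two that does not exceed `ranks`.
// This is an assumed interpretation of the -b flag.
int floor_pow2(int ranks) {
    assert(ranks > 0);
    int p = 1;
    while (p <= ranks / 2) {  // compare against ranks / 2 to avoid overflow
        p *= 2;
    }
    return p;
}
```

Under this reading, a job launched with 6 ranks and `-b` would use 4 of them, while a job launched with 4 ranks would be unchanged.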
Strong scaling supports a 2-level decomposition in which the global domain is decomposed into fixed-size subdomains indexed in Z-Morton order. These subdomains are then distributed to MPI ranks based on their index, so each rank may hold more than one subdomain. Strong scaling experiments support the following command-line arguments:
-d Int    set the domain size to $Int^3$
-s Int    set the subdomain size to $Int^3$
-I Int    number of iterations to average over
-v        enable validation for strong-cpu
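The 2-level decomposition described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names are hypothetical, the bit order (x in the lowest bit) is an assumption, and the policy of giving each rank a contiguous run of Morton indices is assumed because the text only says subdomains are distributed "based on index".

```cpp
#include <cassert>
#include <cstdint>

// Interleave the low 10 bits of x, y, z into a 30-bit Z-Morton index.
// Bit i of x lands at bit 3*i, of y at 3*i + 1, of z at 3*i + 2
// (assumed axis order).
std::uint32_t morton3(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
    std::uint32_t code = 0;
    for (std::uint32_t bit = 0; bit < 10; ++bit) {
        code |= ((x >> bit) & 1u) << (3 * bit);
        code |= ((y >> bit) & 1u) << (3 * bit + 1);
        code |= ((z >> bit) & 1u) << (3 * bit + 2);
    }
    return code;
}

// Assumed distribution policy: each rank owns a contiguous run of
// Morton indices, so neighbouring subdomains tend to share a rank.
int owner_rank(std::uint32_t index, std::uint32_t num_subdomains, int num_ranks) {
    assert(num_ranks > 0 && index < num_subdomains);
    return static_cast<int>(static_cast<std::uint64_t>(index) * num_ranks / num_subdomains);
}
```

For instance, `-d 512 -s 128` yields 4 subdomains per side, or 64 in total; under the assumed policy with 4 ranks, indices 0 to 15 go to rank 0, 16 to 31 to rank 1, and so on, so each rank holds 16 subdomains.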