Brick Library 0.1
Performance-portable stencil datalayout & codegen
Test cases are organized by folder:
single: single-node tests for different compute models
  single-cpu
  single-cuda
  single-opencl
  single-sycl
  single-mpi (requires MPI; layout benchmark)
weak: weak scaling tests
  weak-cpu
  weak-cuda
strong: strong scaling tests
  strong-cpu
  strong-cuda
Each case is described below.
All single-node tests compile into <build_dir>/single/* as executables named without the single- prefix; for example, single-cpu is built as <build_dir>/single/cpu.
Single-node experiments accept no command-line arguments. To change the stencil being computed, edit the corresponding file in /stencils. To change the domain size N, where the total domain is $N^3$, modify the macro #define N 64 in /stencils/stencils.h.
Weak scaling supports either a fixed domain per node or a fixed global domain decomposed into one subdomain per node.
The weak scaling tests compile into <build_dir>/weak/* as executables named without the weak- prefix; for example, weak-cpu is built as <build_dir>/weak/cpu.
Each weak scaling experiment supports the following command-line arguments:
  -d Int,Int,Int   set the global domain size
  -s Int,Int,Int   set the per-rank subdomain size
  -I Int           number of iterations to average over
  -b               reduce the number of ranks to a power of 2
For example, mpirun -np 4 <build_dir>/weak/cpu -d 512,512,512 distributes a domain of $512^3$ across 4 MPI ranks.
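To make the relationship between the global domain, the rank count, and the per-rank subdomain concrete, here is a minimal sketch. It is not taken from the library: `floor_pow2` and `split_domain` are hypothetical helpers, and the rule of repeatedly halving the longest axis is an assumption chosen for illustration. The actual decomposition used by the weak scaling tests may differ.

```cpp
#include <array>
#include <cassert>

using Extent = std::array<unsigned, 3>;  // domain size along each axis

// Largest power of two that is <= n. Illustrates what an option such as
// -b could compute; hypothetical helper, not the library's code.
inline unsigned floor_pow2(unsigned n) {
  assert(n >= 1);
  unsigned p = 1;
  while (p <= n / 2) p *= 2;
  return p;
}

// Assumed decomposition rule: for a power-of-two rank count, halve the
// longest axis once per doubling of ranks (the first axis wins ties).
inline Extent split_domain(Extent global, unsigned ranks) {
  assert(ranks >= 1 && floor_pow2(ranks) == ranks);
  Extent sub = global;
  for (unsigned r = ranks; r > 1; r /= 2) {
    unsigned axis = 0;
    for (unsigned a = 1; a < 3; ++a)
      if (sub[a] > sub[axis]) axis = a;
    sub[axis] /= 2;
  }
  return sub;
}
```

Under these assumptions, `split_domain(Extent{512, 512, 512}, 4)` yields a per-rank subdomain of 256 x 256 x 512, and `floor_pow2(6)` yields 4, so a launch with 6 ranks would use only 4 of them.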
Strong scaling supports a 2-level decomposition: the global domain is decomposed into fixed-size subdomains indexed in Z-order (Morton order), and these subdomains are then distributed to MPI ranks based on their index. Each rank may therefore hold more than one subdomain. The strong scaling tests support the following command-line arguments:
  -d Int   set the domain size to $Int^3$
  -s Int   set the subdomain size to $Int^3$
  -I Int   number of iterations to average over
  -v       enable validation for strong-cpu
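The 2-level decomposition described above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: `morton3` and `owner_rank` are hypothetical names, the bit order (x in the lowest bit) is an arbitrary choice, and assigning contiguous index ranges to ranks is only one plausible reading of "distributed based on index".

```cpp
#include <cstdint>

// Interleave the low 10 bits of x, y, z into a 30-bit Z-order (Morton)
// index. The bit order (x lowest, then y, then z) is an assumption.
inline std::uint32_t morton3(std::uint32_t x, std::uint32_t y, std::uint32_t z) {
  std::uint32_t code = 0;
  for (unsigned bit = 0; bit < 10; ++bit) {
    code |= ((x >> bit) & 1u) << (3 * bit);
    code |= ((y >> bit) & 1u) << (3 * bit + 1);
    code |= ((z >> bit) & 1u) << (3 * bit + 2);
  }
  return code;
}

// Assumed distribution: split the index range [0, total) into `ranks`
// contiguous chunks, so neighbouring Morton indices tend to share a rank.
inline unsigned owner_rank(std::uint32_t index, std::uint32_t total, unsigned ranks) {
  return static_cast<unsigned>((static_cast<std::uint64_t>(index) * ranks) / total);
}
```

For instance, with -d 256 -s 64 there are 4 subdomains per axis and 64 in total. On 4 ranks, this sketch assigns Morton indices 0 to 15 to rank 0, 16 to 31 to rank 1, and so on, so each rank holds 16 subdomains.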