Brick Library 0.1
Performance-portable stencil data layout & codegen
Tuowen Zhao, Samuel Williams, Mary Hall, and Hans Johansen. Delivering performance-portable stencil computations on CPUs and GPUs using bricks. In 2018 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), pages 59–70, 2018.
Tuowen Zhao, Protonu Basu, Samuel Williams, Mary Hall, and Hans Johansen. Exploiting reuse and vectorization in blocked stencil computations on CPUs and GPUs. In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, SC '19, New York, NY, USA, 2019. Association for Computing Machinery.
Tuowen Zhao, Mary Hall, Hans Johansen, and Samuel Williams. Improving communication by optimizing on-node data movement with data layout. In Proceedings of the 26th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, PPoPP '21, pages 304–317, New York, NY, USA, 2021. Association for Computing Machinery.