7.6. OpenCL kernel caching

OpenCL kernels are cached on disk at multiple levels during a MIRGE-Com execution. This reduces kernel compilation time when the same driver is run multiple times.

The following sections discuss MIRGE-Com-related packages that use caching.
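For experiments where every run should compile its kernels from scratch, the per-package switches described in the sections below can be set together. The following is a minimal sketch: the function name disable_kernel_caches is hypothetical and not part of MIRGE-Com, while the four environment variables are the ones named in this section.

```shell
# Hypothetical helper (not part of MIRGE-Com): turn off every disk cache
# described in this section for the current shell session.
disable_kernel_caches() {
    export LOOPY_NO_CACHE=1       # loopy
    export PYOPENCL_NO_CACHE=1    # PyOpenCL
    export POCL_KERNEL_CACHE=0    # PoCL
    export CUDA_CACHE_DISABLE=1   # CUDA JIT cache
}

disable_kernel_caches
```

Setting these variables only affects processes started afterwards from the same shell; it does not delete cache directories that already exist.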


The following bash command removes all disk caches used by MIRGE-Com on Linux and macOS:

$ rm -rf $XDG_CACHE_HOME/pytools/pdict* ~/.cache/pytools/pdict* ~/Library/Caches/pytools/pdict* $XDG_CACHE_HOME/pyopencl ~/.cache/pyopencl ~/Library/Caches/pyopencl $POCL_CACHE_DIR $XDG_CACHE_HOME/pocl ~/.cache/pocl ~/.nv/ComputeCache $CUDA_CACHE_PATH
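Before deleting anything, it can help to see which of these locations actually exist. The sketch below is a read-only preview of the rm command above, written for bash. The function name list_kernel_caches is hypothetical. Skipping a variable-based location when its variable is unset is a choice made here for safety, not something the original command does.

```shell
# Hypothetical helper: print the cache locations that currently exist,
# without deleting anything. The paths mirror the rm command above.
list_kernel_caches() {
    local path
    set --
    # Only add variable-based locations when the variable is set, so an
    # unset variable never expands to a path directly under "/".
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        set -- "$@" "$XDG_CACHE_HOME"/pytools/pdict* \
            "$XDG_CACHE_HOME/pyopencl" "$XDG_CACHE_HOME/pocl"
    fi
    if [ -n "${POCL_CACHE_DIR:-}" ]; then
        set -- "$@" "$POCL_CACHE_DIR"
    fi
    if [ -n "${CUDA_CACHE_PATH:-}" ]; then
        set -- "$@" "$CUDA_CACHE_PATH"
    fi
    set -- "$@" "$HOME"/.cache/pytools/pdict* \
        "$HOME"/Library/Caches/pytools/pdict* \
        "$HOME/.cache/pyopencl" "$HOME/Library/Caches/pyopencl" \
        "$HOME/.cache/pocl" "$HOME/.nv/ComputeCache"
    for path in "$@"; do
        # Unmatched globs stay literal in bash; the -e test filters them out.
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done
}

list_kernel_caches
```

If the printed list looks right, the rm command can then be run with more confidence about what it will remove.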

7.6.1. Loopy

loopy stores the source of generated PyOpenCL kernels and their invokers in $XDG_CACHE_HOME/pytools/pdict-*-loopy by default. You can export LOOPY_NO_CACHE=1 to disable caching. See here for details.


loopy uses pytools.persistent_dict.PersistentDict for caching. PersistentDict also keeps an in-memory cache.


When $XDG_CACHE_HOME is not set, the cache directory defaults to ~/.cache on Linux and ~/Library/Caches on macOS.
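The lookup described above can be sketched in bash as follows. This only restates the rule in the text; it is not the code loopy or pytools run. The function name kernel_cache_root is hypothetical, and two further assumptions are made here: an empty $XDG_CACHE_HOME is treated like an unset one, and any platform other than macOS is treated like Linux.

```shell
# Sketch of the default cache root described above (hypothetical helper).
kernel_cache_root() {
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME"
    elif [ "$(uname -s)" = "Darwin" ]; then
        # macOS default
        printf '%s\n' "$HOME/Library/Caches"
    else
        # Linux default (assumed here for all other platforms as well)
        printf '%s\n' "$HOME/.cache"
    fi
}

# List loopy's on-disk caches, if any exist yet.
ls -d "$(kernel_cache_root)"/pytools/pdict-*-loopy 2>/dev/null || true
```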

7.6.2. PyOpenCL

pyopencl caches in $XDG_CACHE_HOME/pyopencl (kernel source code and binaries returned by the OpenCL runtime) and $XDG_CACHE_HOME/pytools/pdict-*-pyopencl (invokers, generated source code) by default. You can export PYOPENCL_NO_CACHE=1 to disable caching. See here for details.


PyOpenCL does not cache kernel binaries in memory by default. To keep the compiled version of a kernel in memory, simply retain the pyopencl.Program or pyopencl.Kernel objects. Loopy’s loopy.LoopKernel already holds handles to compiled pyopencl.Kernel objects.


PyOpenCL uses clCreateProgramWithSource on the first compilation and caches the OpenCL binary it retrieves. The second time the same source is compiled, it uses clCreateProgramWithBinary to hand the binary to the CL runtime (such as PoCL). This can lead to different caching behaviors on the first three compilations depending on how the CL runtime itself performs caching.
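One way to observe this effect is to clear the caches and then time several consecutive runs of the same driver. The sketch below is a hypothetical helper, not part of MIRGE-Com or PyOpenCL. It measures wall-clock time in whole seconds only, so it is useful for compilation-dominated runs rather than fine-grained measurement.

```shell
# Hypothetical helper: run a command N times and print the wall-clock time
# of each run in whole seconds.
time_runs() {
    local count=$1 run=1 start end
    shift
    while [ "$run" -le "$count" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $run: $((end - start)) s"
        run=$((run + 1))
    done
}

# Stand-in command so the sketch runs anywhere; replace it with a real
# driver, e.g. `time_runs 3 python my_driver.py` (my_driver.py is a
# placeholder name).
time_runs 3 sleep 0
```

Three runs are used because the text above notes that behavior may differ across the first three compilations.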

7.6.3. PoCL

PoCL stores compilation results (LLVM bitcode and shared libraries) in $POCL_CACHE_DIR or $XDG_CACHE_HOME/pocl by default. You can export POCL_KERNEL_CACHE=0 to disable caching. See here for details.


When $POCL_CACHE_DIR and $XDG_CACHE_HOME are not set, PoCL’s cache directory defaults to ~/.cache/pocl on Linux and macOS.
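The resolution order described above can be sketched in bash. This restates the text and is not PoCL's own code. The function name pocl_cache_dir is hypothetical, and it is assumed here that $POCL_CACHE_DIR takes precedence over $XDG_CACHE_HOME when both are set.

```shell
# Sketch of where PoCL's cache is expected to live (hypothetical helper).
pocl_cache_dir() {
    if [ -n "${POCL_CACHE_DIR:-}" ]; then
        # Explicit override (assumed to win over XDG_CACHE_HOME)
        printf '%s\n' "$POCL_CACHE_DIR"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME/pocl"
    else
        # Default on Linux and macOS
        printf '%s\n' "$HOME/.cache/pocl"
    fi
}

echo "PoCL cache directory: $(pocl_cache_dir)"
```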

7.6.4. CUDA

CUDA stores binary kernels in ~/.nv/ComputeCache by default (Linux only; we do not support CUDA devices on macOS). You can export CUDA_CACHE_DISABLE=1 to disable caching, and select a different cache directory with CUDA_CACHE_PATH. See here for details.


The CUDA JIT cache is disabled by default on Lassen, i.e., CUDA_CACHE_DISABLE=1 is set by default. Source: email by J. Gyllenhaal on 03/12/2020.
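Because the default differs between systems, it can be useful to check the environment before a run. The sketch below is a hypothetical helper that only inspects the two environment variables named above; it does not query the CUDA driver. It assumes that only the value 1 disables the cache.

```shell
# Hypothetical helper: report the CUDA JIT cache settings implied by the
# current environment.
cuda_cache_status() {
    if [ "${CUDA_CACHE_DISABLE:-0}" = "1" ]; then
        echo "CUDA JIT cache: disabled"
    else
        echo "CUDA JIT cache: enabled, directory ${CUDA_CACHE_PATH:-$HOME/.nv/ComputeCache}"
    fi
}

cuda_cache_status
```

On a system configured like Lassen, this would be expected to report the cache as disabled unless the variable has been overridden.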