7.6. OpenCL kernel caching
OpenCL kernels are cached in memory and on disk at multiple levels during a MIRGE-Com execution. This reduces kernel compilation time when the same driver is run multiple times.
The following sections discuss MIRGE-Com-related packages that use caching, with a focus on configuring the disk-based caching.
Note
The following bash code can be used to remove all disk caches used by MIRGE-Com on Linux and macOS:
$ rm -rf $XDG_CACHE_HOME/pytools/pdict* ~/.cache/pytools/pdict* \
    ~/Library/Caches/pytools/pdict* $XDG_CACHE_HOME/pyopencl \
    ~/.cache/pyopencl ~/Library/Caches/pyopencl $POCL_CACHE_DIR \
    $XDG_CACHE_HOME/pocl ~/.cache/pocl ~/.nv/ComputeCache $CUDA_CACHE_PATH
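Before deleting anything, it can help to see which of these locations actually exist. The following is a minimal sketch, assuming a POSIX-style shell such as bash or sh. The function name list_caches is hypothetical and not part of MIRGE-Com.

```shell
# Print every cache location named in the rm command above that exists.
# An unset XDG_CACHE_HOME falls back to ~/.cache, so an empty variable
# never produces a path such as "/pocl".
list_caches() {
    xdg="${XDG_CACHE_HOME:-$HOME/.cache}"
    for path in \
        "$xdg"/pytools/pdict* \
        "$HOME"/.cache/pytools/pdict* \
        "$HOME"/Library/Caches/pytools/pdict* \
        "$xdg/pyopencl" \
        "$HOME/.cache/pyopencl" \
        "$HOME/Library/Caches/pyopencl" \
        "${POCL_CACHE_DIR:-$xdg/pocl}" \
        "$xdg/pocl" \
        "$HOME/.cache/pocl" \
        "$HOME/.nv/ComputeCache" \
        "${CUDA_CACHE_PATH:-$HOME/.nv/ComputeCache}"
    do
        # A glob that matches nothing stays literal, so check existence.
        if [ -e "$path" ]; then
            printf '%s\n' "$path"
        fi
    done | sort -u   # the same directory can be reached via several entries
}
```

Running list_caches prints one existing path per line. Piping that output into du -sh (for example with xargs) would show how much disk space each cache occupies.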
Note
The following bash code can be used to disable all disk caches:
$ export LOOPY_NO_CACHE=1
$ export PYOPENCL_NO_CACHE=1
$ export POCL_KERNEL_CACHE=0
$ export CUDA_CACHE_DISABLE=1
Note
Disabling disk caching for a specific package only affects
that particular package. For example, disabling disk caching for loopy
does not affect the caching behavior of pyopencl or PoCL.
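Because each switch is independent, it can be useful to check which ones are active in the current shell. The sketch below only recognizes the exact values shown above. How each package treats other values is not covered here. The function names report_switch and cache_status are hypothetical.

```shell
# usage: report_switch LABEL CURRENT_VALUE DISABLING_VALUE
# Prints whether CURRENT_VALUE equals the documented disabling value.
report_switch() {
    if [ "$2" = "$3" ]; then
        printf '%s: disk cache disabled\n' "$1"
    else
        printf '%s: disk cache not disabled\n' "$1"
    fi
}

# Check all four documented switches, one line per package.
cache_status() {
    report_switch loopy    "${LOOPY_NO_CACHE:-}"     1
    report_switch pyopencl "${PYOPENCL_NO_CACHE:-}"  1
    report_switch pocl     "${POCL_KERNEL_CACHE:-}"  0
    report_switch cuda     "${CUDA_CACHE_DISABLE:-}" 1
}
```

For example, after export LOOPY_NO_CACHE=1 alone, cache_status reports loopy as disabled and the other three as not disabled.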
7.6.1. Loopy
loopy stores the source of generated PyOpenCL kernels and their
invokers in $XDG_CACHE_HOME/pytools/pdict-*-loopy by default. You can export
LOOPY_NO_CACHE=1 to disable caching. See here
for details.
Note
loopy uses pytools.persistent_dict.PersistentDict
for caching. PersistentDict also keeps an
in-memory cache.
Note
When $XDG_CACHE_HOME is not set, the cache directory defaults to
~/.cache on Linux and ~/Library/Caches/ on macOS.
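The fallback described in this note can be sketched in shell as follows. This is an illustration only: the function names cache_root and loopy_cache_dirs are hypothetical, an empty $XDG_CACHE_HOME is treated like an unset one, and every platform other than macOS is assumed to behave like Linux.

```shell
# Print the cache root: $XDG_CACHE_HOME if set, otherwise the platform
# default named in the note above.
cache_root() {
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME"
    elif [ "$(uname -s)" = "Darwin" ]; then
        printf '%s\n' "$HOME/Library/Caches"
    else
        printf '%s\n' "$HOME/.cache"
    fi
}

# List loopy's persistent-dict directories under that root, if any exist.
# The trailing * is an assumption, in case directory names carry a suffix.
loopy_cache_dirs() {
    for dir in "$(cache_root)"/pytools/pdict-*-loopy*; do
        if [ -d "$dir" ]; then
            printf '%s\n' "$dir"
        fi
    done
}
```

Calling loopy_cache_dirs prints nothing when loopy has not yet written a disk cache.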
7.6.2. PyOpenCL
By default, pyopencl caches kernel source code and binaries returned by
the OpenCL runtime in $XDG_CACHE_HOME/pyopencl, and invokers and
generated source code in $XDG_CACHE_HOME/pytools/pdict-*-pyopencl.
You can export PYOPENCL_NO_CACHE=1 to disable caching. See here
for details.
Note
PyOpenCL does not cache kernel binaries in memory by default. To keep the
compiled version of a kernel in memory, simply retain the
pyopencl.Program or pyopencl.Kernel objects. Loopy’s
loopy.LoopKernel already holds handles to compiled
pyopencl.Kernel objects.
Note
PyOpenCL uses clCreateProgramWithSource on the first compilation and
caches the OpenCL binary it retrieves. The second time the same source
is compiled, it uses clCreateProgramWithBinary to hand the binary
to the CL runtime (such as PoCL). This can lead to different caching
behaviors on the first three compilations depending on how the CL runtime
itself performs caching.
7.6.3. PoCL
PoCL stores compilation results (LLVM bitcode and shared
libraries) in $POCL_CACHE_DIR if it is set, and in $XDG_CACHE_HOME/pocl
otherwise. You can export POCL_KERNEL_CACHE=0 to disable caching. See here for details.
Note
When $POCL_CACHE_DIR and $XDG_CACHE_HOME are not set, PoCL’s cache
directory defaults to ~/.cache/pocl on Linux and macOS.
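The lookup order for PoCL's cache directory can be sketched as a small shell function. This assumes that $POCL_CACHE_DIR takes precedence over $XDG_CACHE_HOME and that empty variables count as unset. The function name pocl_cache_dir is hypothetical.

```shell
# Print the directory PoCL is expected to use for its kernel cache.
# Assumed precedence: POCL_CACHE_DIR, then XDG_CACHE_HOME/pocl,
# then ~/.cache/pocl (the same default on Linux and macOS).
pocl_cache_dir() {
    if [ -n "${POCL_CACHE_DIR:-}" ]; then
        printf '%s\n' "$POCL_CACHE_DIR"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME/pocl"
    else
        printf '%s\n' "$HOME/.cache/pocl"
    fi
}
```

For instance, du -sh "$(pocl_cache_dir)" would report the size of the cache if the directory exists.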
7.6.4. CUDA
CUDA stores binary kernels in ~/.nv/ComputeCache by default (on Linux
only; we do not support CUDA devices on macOS). You can
export CUDA_CACHE_DISABLE=1 to disable caching, and select a different
cache directory with CUDA_CACHE_PATH. See here
for details.
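As an illustration of selecting a different cache directory, the sketch below wraps the two settings in shell functions. The function names cuda_cache_dir and use_cuda_cache_dir are hypothetical, and creating the directory up front is a precaution rather than a documented requirement.

```shell
# Print the directory the CUDA JIT cache is expected to use:
# $CUDA_CACHE_PATH if set, otherwise the default named above.
cuda_cache_dir() {
    printf '%s\n' "${CUDA_CACHE_PATH:-$HOME/.nv/ComputeCache}"
}

# Point the CUDA JIT cache at the given directory for this shell session.
# The variable is only exported if the directory could be created.
use_cuda_cache_dir() {
    mkdir -p "$1" && export CUDA_CACHE_PATH="$1"
}
```

A call such as use_cuda_cache_dir "$HOME/cuda-cache-test" (an arbitrary example path) affects only processes started from the current shell afterwards.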
Warning
The CUDA JIT cache is disabled by default on Lassen, i.e.,
CUDA_CACHE_DISABLE=1 is set by default. Source: email by
J. Gyllenhaal on 03/12/2020.