7.6. OpenCL kernel caching
OpenCL kernels are cached on disk at several levels during a MIRGE-Com execution. This reduces kernel compilation time when the same driver is run multiple times.
The following sections discuss MIRGE-Com-related packages that use caching.
Note
The following bash command removes all disk caches used by MIRGE-Com on Linux and macOS:
$ rm -rf $XDG_CACHE_HOME/pytools/pdict* ~/.cache/pytools/pdict* \
    ~/Library/Caches/pytools/pdict* $XDG_CACHE_HOME/pyopencl \
    ~/.cache/pyopencl ~/Library/Caches/pyopencl $POCL_CACHE_DIR \
    $XDG_CACHE_HOME/pocl ~/.cache/pocl ~/.nv/ComputeCache $CUDA_CACHE_PATH
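Before deleting anything, it can help to see which of these locations actually exist. The following is a minimal sketch and not part of MIRGE-Com: the function name mirge_list_caches is made up for illustration. It skips $XDG_CACHE_HOME, $POCL_CACHE_DIR, and $CUDA_CACHE_PATH when they are unset, so that no path collapses to something like /pyopencl.

```shell
# Hypothetical helper (not part of MIRGE-Com): print every cache location
# from the command above that currently exists, one path per line.
mirge_list_caches() {
    # Base directories that may hold the pytools and pyopencl caches.
    set -- "$HOME/.cache" "$HOME/Library/Caches"
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        set -- "$XDG_CACHE_HOME" "$@"
    fi
    for base in "$@"; do
        for p in "$base"/pytools/pdict* "$base/pyopencl"; do
            if [ -e "$p" ]; then printf '%s\n' "$p"; fi
        done
    done
    # PoCL and CUDA locations; entries that expand to nothing are skipped.
    for p in "${POCL_CACHE_DIR:-}" \
             "${XDG_CACHE_HOME:+$XDG_CACHE_HOME/pocl}" \
             "$HOME/.cache/pocl" \
             "${CUDA_CACHE_PATH:-}" \
             "$HOME/.nv/ComputeCache"; do
        if [ -n "$p" ] && [ -e "$p" ]; then printf '%s\n' "$p"; fi
    done
}

mirge_list_caches
```

A path can appear twice when $XDG_CACHE_HOME points at ~/.cache; piping the output through sort -u removes such duplicates.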
7.6.1. Loopy
loopy stores the source of generated PyOpenCL kernels and their invokers in $XDG_CACHE_HOME/pytools/pdict-*-loopy by default. You can export LOOPY_NO_CACHE=1 to disable caching. See the loopy documentation for details.
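As a small illustration of the two ways to set this variable, the sketch below disables the cache first for a single command and then for the whole shell session. The sh -c call is only a stand-in for a real driver invocation.

```shell
# Disable loopy's cache for one command only; the calling shell is unchanged.
# `sh -c` stands in for the driver here.
env LOOPY_NO_CACHE=1 sh -c 'echo "child sees LOOPY_NO_CACHE=$LOOPY_NO_CACHE"'

# Disable it for every command started from this shell afterwards.
export LOOPY_NO_CACHE=1

# Remove the variable again to return to the default (caching enabled).
unset LOOPY_NO_CACHE
```

The per-command form is convenient for a one-off cold-cache timing, since nothing has to be undone afterwards.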
Note
loopy uses pytools.persistent_dict.PersistentDict for caching. PersistentDict also keeps an in-memory cache.
Note
When $XDG_CACHE_HOME is not set, the cache directory defaults to ~/.cache on Linux and ~/Library/Caches/ on macOS.
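The fallback rule in this note can be sketched as a small shell function. The name cache_base_dir is hypothetical, an empty $XDG_CACHE_HOME is treated like an unset one, and any platform other than macOS is assumed to behave like Linux.

```shell
# Hypothetical helper: print the base directory under which the
# pytools/pdict-* caches live, following the fallback rule above.
cache_base_dir() {
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        printf '%s\n' "$XDG_CACHE_HOME"
    elif [ "$(uname -s)" = "Darwin" ]; then
        # macOS default
        printf '%s\n' "$HOME/Library/Caches"
    else
        # Linux default (assumed for all other platforms as well)
        printf '%s\n' "$HOME/.cache"
    fi
}

# List loopy's persistent dictionaries, if there are any.
ls -d "$(cache_base_dir)"/pytools/pdict-*-loopy 2>/dev/null \
    || echo "no loopy cache found"
```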
7.6.2. PyOpenCL
By default, pyopencl caches kernel source code and the binaries returned by the OpenCL runtime in $XDG_CACHE_HOME/pyopencl, and caches invokers and generated source code in $XDG_CACHE_HOME/pytools/pdict-*-pyopencl. You can export PYOPENCL_NO_CACHE=1 to disable caching. See the PyOpenCL documentation for details.
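Because both the loopy and the PyOpenCL caches live under $XDG_CACHE_HOME by default, one way to get a cold cache without deleting the existing one is to point $XDG_CACHE_HOME at an empty directory for a single run. This is a sketch under that assumption; with_cold_cache is a hypothetical wrapper name, and sh -c again stands in for the driver.

```shell
# Throwaway cache root; loopy and PyOpenCL would create their pytools/ and
# pyopencl/ subdirectories beneath it.
scratch_cache=$(mktemp -d)

# Hypothetical wrapper: run a command with the scratch directory as
# XDG_CACHE_HOME, leaving the calling shell's environment untouched.
with_cold_cache() {
    env XDG_CACHE_HOME="$scratch_cache" "$@"
}

# Stand-in for the driver: show the cache root the child process sees.
with_cold_cache sh -c 'echo "XDG_CACHE_HOME=$XDG_CACHE_HOME"'
```

Other programs started through the wrapper also see the redirected $XDG_CACHE_HOME, so this is best limited to the driver itself. Remove the scratch directory when it is no longer needed.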
Note
PyOpenCL does not cache kernel binaries in memory by default. To keep the compiled version of a kernel in memory, retain the pyopencl.Program or pyopencl.Kernel objects. Loopy’s loopy.LoopKernel already holds handles to compiled pyopencl.Kernel objects.
Note
PyOpenCL uses clCreateProgramWithSource on the first compilation and caches the OpenCL binary it retrieves. The second time the same source is compiled, it uses clCreateProgramWithBinary to hand the binary to the CL runtime (such as PoCL). This can lead to different caching behaviors on the first three compilations depending on how the CL runtime itself performs caching.
7.6.3. PoCL
PoCL stores compilation results (LLVM bitcode and shared libraries) in $POCL_CACHE_DIR if that variable is set, and in $XDG_CACHE_HOME/pocl otherwise. You can export POCL_KERNEL_CACHE=0 to disable caching. See the PoCL documentation for details.
Note
When neither $POCL_CACHE_DIR nor $XDG_CACHE_HOME is set, PoCL’s cache directory defaults to ~/.cache/pocl on both Linux and macOS.
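The lookup order described in this section fits into a single nested parameter expansion. The sketch below only mirrors the rule as stated here and does not query PoCL itself; pocl_cache_dir is a hypothetical name.

```shell
# Hypothetical helper: print the directory PoCL is expected to use, trying
# $POCL_CACHE_DIR first, then $XDG_CACHE_HOME/pocl, then ~/.cache/pocl.
pocl_cache_dir() {
    printf '%s\n' "${POCL_CACHE_DIR:-${XDG_CACHE_HOME:-$HOME/.cache}/pocl}"
}

# Report the size of the cache if the directory exists.
if [ -d "$(pocl_cache_dir)" ]; then
    du -sh "$(pocl_cache_dir)"
fi
```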
7.6.4. CUDA
By default, CUDA stores binary kernels in ~/.nv/ComputeCache (on Linux only; we do not support CUDA devices on macOS). You can export CUDA_CACHE_DISABLE=1 to disable caching, and select a different cache directory with CUDA_CACHE_PATH. See the CUDA documentation for details.
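As an example of selecting a different cache directory, the sketch below moves the CUDA cache out of the home directory. The directory name mirge-cuda-cache is an arbitrary choice for illustration, not a MIRGE-Com convention, and creating the directory up front is a precaution rather than a documented requirement.

```shell
# Use a scratch location instead of ~/.nv/ComputeCache.
# The directory name is an arbitrary example.
export CUDA_CACHE_PATH="${TMPDIR:-/tmp}/mirge-cuda-cache"

# Create the directory in advance so that it certainly exists.
mkdir -p "$CUDA_CACHE_PATH"

echo "CUDA kernels will be cached in $CUDA_CACHE_PATH"
```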
Warning
The CUDA JIT cache is disabled by default on Lassen, i.e., CUDA_CACHE_DISABLE=1 is set by default. Source: email by J. Gyllenhaal on 03/12/2020.
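Since a site may preset such variables, as Lassen does with CUDA_CACHE_DISABLE, it can be useful to check what a driver would inherit before interpreting compile times. The following sketch prints the exported value of each cache switch named in this section; report_cache_switches is a hypothetical name, and the function only reports values without judging whether caching is on or off.

```shell
# Hypothetical helper: show the exported value of each cache switch
# mentioned in this section, or "(unset)" when it is absent.
report_cache_switches() {
    for v in LOOPY_NO_CACHE PYOPENCL_NO_CACHE POCL_KERNEL_CACHE CUDA_CACHE_DISABLE; do
        # printenv only sees exported variables, i.e. those a driver inherits.
        val=$(printenv "$v") || val="(unset)"
        printf '%s=%s\n' "$v" "$val"
    done
}

report_cache_switches
```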