7.6. OpenCL kernel caching
During a MIRGE-Com run, OpenCL kernels are cached at several levels, both in memory and on disk. This reduces kernel compilation time when the same driver is run multiple times.
The following sections discuss MIRGE-Com-related packages that use caching, with a focus on configuring the disk-based caching.
Note
The following bash code removes all disk caches used by MIRGE-Com on Linux and macOS:
$ rm -rf $XDG_CACHE_HOME/pytools/pdict* ~/.cache/pytools/pdict* \
    ~/Library/Caches/pytools/pdict* $XDG_CACHE_HOME/pyopencl \
    ~/.cache/pyopencl ~/Library/Caches/pyopencl $POCL_CACHE_DIR \
    $XDG_CACHE_HOME/pocl ~/.cache/pocl ~/.nv/ComputeCache $CUDA_CACHE_PATH
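Before deleting anything, it can help to see which of these locations actually exist. The following is a minimal sketch, not part of MIRGE-Com; the function name list_existing_caches is hypothetical, and the paths are the ones named in the removal command above. It only prints paths and deletes nothing.

```shell
# Hypothetical helper, not part of MIRGE-Com: print the cache locations
# from the removal command above that currently exist. Nothing is deleted.
# The parentheses run the body in a subshell, so no variables leak out.
list_existing_caches() (
    # Cache roots used by pytools (loopy) and pyopencl.
    for root in "${XDG_CACHE_HOME:-}" "$HOME/.cache" "$HOME/Library/Caches"; do
        [ -n "$root" ] || continue   # skip an unset or empty XDG_CACHE_HOME
        for path in "$root"/pytools/pdict* "$root/pyopencl"; do
            if [ -e "$path" ]; then echo "$path"; fi
        done
    done
    # PoCL and CUDA locations; unset or empty variables are skipped.
    for path in "${POCL_CACHE_DIR:-}" "${XDG_CACHE_HOME:+$XDG_CACHE_HOME/pocl}" \
                "$HOME/.cache/pocl" "$HOME/.nv/ComputeCache" "${CUDA_CACHE_PATH:-}"; do
        if [ -n "$path" ] && [ -e "$path" ]; then echo "$path"; fi
    done
)

list_existing_caches
```

If XDG_CACHE_HOME points at ~/.cache, the same directory may be printed twice; that is harmless for a listing.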
Note
The following bash code can be used to disable all disk caches:
$ export LOOPY_NO_CACHE=1
$ export PYOPENCL_NO_CACHE=1
$ export POCL_KERNEL_CACHE=0
$ export CUDA_CACHE_DISABLE=1
Note
Disabling disk caching for a specific package only affects that particular package. For example, disabling disk caching for loopy does not affect the caching behavior of pyopencl or PoCL.
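To disable the disk caches for a single run rather than for the whole shell session, the four variables can be set inside a subshell. This is a sketch under that assumption; run_without_disk_caches is a hypothetical name and driver.py stands in for your own driver.

```shell
# Hypothetical wrapper, not part of MIRGE-Com: run one command with all
# four disk caches disabled. The parentheses make the function body a
# subshell, so the calling shell's environment is left unchanged.
run_without_disk_caches() (
    export LOOPY_NO_CACHE=1
    export PYOPENCL_NO_CACHE=1
    export POCL_KERNEL_CACHE=0
    export CUDA_CACHE_DISABLE=1
    "$@"
)

# Example (driver.py is a placeholder for your own driver):
#   run_without_disk_caches python driver.py
#
# To disable only one package's cache for one run, prefix the command
# with that variable alone, for example:
#   LOOPY_NO_CACHE=1 python driver.py
```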
7.6.1. Loopy
loopy stores the source of generated PyOpenCL kernels and their invokers in $XDG_CACHE_HOME/pytools/pdict-*-loopy by default. You can export LOOPY_NO_CACHE=1 to disable caching. See the loopy documentation for details.
Note
loopy uses pytools.persistent_dict.PersistentDict for caching. PersistentDict also keeps an in-memory cache.
Note
When $XDG_CACHE_HOME is not set, the cache directory defaults to ~/.cache on Linux and ~/Library/Caches/ on macOS.
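The fallback rule in the note above can be sketched in shell as follows. The function name default_cache_root is hypothetical, and the code simply mirrors the rule as stated here (it treats an empty XDG_CACHE_HOME the same as an unset one); it is not taken from pytools itself.

```shell
# Hypothetical helper: print the cache root following the rule in the
# note above. An empty XDG_CACHE_HOME is treated like an unset one.
default_cache_root() (
    if [ -n "${XDG_CACHE_HOME:-}" ]; then
        echo "$XDG_CACHE_HOME"
    elif [ "$(uname -s)" = "Darwin" ]; then
        echo "$HOME/Library/Caches"      # macOS
    else
        echo "$HOME/.cache"              # Linux and others
    fi
)

# List loopy's persistent dictionaries, if any exist yet.
ls -d "$(default_cache_root)"/pytools/pdict-*-loopy 2>/dev/null || true
```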
7.6.2. PyOpenCL
pyopencl stores its disk caches in $XDG_CACHE_HOME/pyopencl (kernel source code and binaries returned by the OpenCL runtime) and $XDG_CACHE_HOME/pytools/pdict-*-pyopencl (invokers, generated source code) by default. You can export PYOPENCL_NO_CACHE=1 to disable caching. See the PyOpenCL documentation for details.
Note
PyOpenCL does not cache kernel binaries in memory by default. To keep the compiled version of a kernel in memory, simply retain the pyopencl.Program or pyopencl.Kernel objects. Loopy’s loopy.LoopKernel already holds handles to compiled pyopencl.Kernel objects.
Note
PyOpenCL uses clCreateProgramWithSource on the first compilation and caches the OpenCL binary it retrieves. The second time the same source is compiled, it uses clCreateProgramWithBinary to hand the binary to the CL runtime (such as PoCL). This can lead to different caching behaviors on the first three compilations depending on how the CL runtime itself performs caching.
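One way to observe this effect is to start from empty disk caches and time the first three runs of the same driver. The helper below is a rough sketch: time_runs is a hypothetical name, driver.py is a placeholder, and the timing has only whole-second resolution. Whether the three runs actually differ depends on the CL runtime in use.

```shell
# Hypothetical helper: run a command N times and print the wall-clock
# seconds each run took (whole-second resolution from `date +%s`).
# The command's own output is discarded.
time_runs() (
    n=$1
    shift
    i=1
    while [ "$i" -le "$n" ]; do
        start=$(date +%s)
        "$@" >/dev/null 2>&1
        end=$(date +%s)
        echo "run $i: $((end - start)) s"
        i=$((i + 1))
    done
)

# Example (driver.py is a placeholder for your own driver):
#   time_runs 3 python driver.py
```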
7.6.3. PoCL
PoCL stores compilation results (LLVM bitcode and shared libraries) in $POCL_CACHE_DIR or $XDG_CACHE_HOME/pocl by default. You can export POCL_KERNEL_CACHE=0 to disable caching. See the PoCL documentation for details.
Note
When $POCL_CACHE_DIR and $XDG_CACHE_HOME are not set, PoCL’s cache directory defaults to ~/.cache/pocl on Linux and macOS.
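The lookup described above can be sketched as a small shell function. The name pocl_cache_dir is hypothetical, and the order of precedence (POCL_CACHE_DIR first, then XDG_CACHE_HOME, then ~/.cache/pocl) is an assumption read from the text above rather than taken from PoCL's source.

```shell
# Hypothetical helper: print the directory PoCL is expected to use for
# its kernel cache. Assumed precedence: POCL_CACHE_DIR, then
# XDG_CACHE_HOME/pocl, then ~/.cache/pocl.
pocl_cache_dir() (
    if [ -n "${POCL_CACHE_DIR:-}" ]; then
        echo "$POCL_CACHE_DIR"
    elif [ -n "${XDG_CACHE_HOME:-}" ]; then
        echo "$XDG_CACHE_HOME/pocl"
    else
        echo "$HOME/.cache/pocl"
    fi
)

# Report how much disk space the cache currently uses, if it exists.
dir=$(pocl_cache_dir)
if [ -d "$dir" ]; then
    du -sh "$dir"
else
    echo "no PoCL cache at $dir"
fi
```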
7.6.4. CUDA
CUDA stores binary kernels in ~/.nv/ComputeCache by default (on Linux only; we do not support CUDA devices on macOS). You can export CUDA_CACHE_DISABLE=1 to disable caching, and select a different cache directory with CUDA_CACHE_PATH. See NVIDIA's CUDA documentation for details.
Warning
The CUDA JIT cache is disabled by default on Lassen, i.e., CUDA_CACHE_DISABLE=1 is set by default. Source: email by J. Gyllenhaal on 03/12/2020.
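To check how the CUDA cache is configured in the current environment (for example, on a machine that sets CUDA_CACHE_DISABLE for you), a small status function can help. This is a sketch; cuda_cache_status is a hypothetical name, and it only inspects the two environment variables discussed in this section.

```shell
# Hypothetical helper: report whether the CUDA JIT disk cache is
# disabled via the environment and, if not, where it is expected to be.
cuda_cache_status() (
    if [ "${CUDA_CACHE_DISABLE:-0}" = "1" ]; then
        echo "disabled"
    else
        echo "enabled: ${CUDA_CACHE_PATH:-$HOME/.nv/ComputeCache}"
    fi
)

cuda_cache_status

# To opt back in for the current shell, assuming your site permits it
# (NVIDIA documents 0 as the enabling value):
#   export CUDA_CACHE_DISABLE=0
```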