============================================================
NEP 31 — Context-local and global overrides of the NumPy API
============================================================

:Author: Hameer Abbasi <habbasi@quansight.com>
:Author: Ralf Gommers <rgommers@quansight.com>
:Author: Peter Bell <pbell@quansight.com>
:Status: Draft
:Type: Standards Track
:Created: 2019-08-22

Abstract
--------

This NEP proposes to make all of NumPy's public API overridable via an
extensible backend mechanism.

Acceptance of this NEP means NumPy would provide global and context-local
overrides in a separate namespace, as well as a dispatch mechanism similar
to NEP-18 [2]_. First experiences with ``__array_function__`` show that it
is necessary to be able to override NumPy functions that *do not take an
array-like argument*, and hence aren't overridable via
``__array_function__``. The most pressing need is array creation and coercion
functions, such as ``numpy.zeros`` or ``numpy.asarray``; see e.g. NEP-30 [9]_.

This NEP proposes to allow, in an opt-in fashion, overriding any part of the
NumPy API. It is intended as a comprehensive resolution to NEP-22 [3]_, and
obviates the need to add an ever-growing list of new protocols for each new
type of function or object that needs to become overridable.

Motivation and Scope
--------------------

The primary end-goal of this NEP is to make the following possible:

.. code:: python

    # On the library side
    import numpy.overridable as unp

    def library_function(array):
        array = unp.asarray(array)
        # Code using unumpy as usual
        return array

    # On the user side:
    import numpy.overridable as unp
    import uarray as ua
    import dask.array as da

    ua.register_backend(da)  # Can be done within Dask itself

    library_function(dask_array)  # works and returns dask_array

    with unp.set_backend(da):
        library_function([1, 2, 3, 4])  # actually returns a Dask array.

Here, the backend (``da`` in the example above) can be any compatible object
defined either by NumPy or an external library, such as Dask or CuPy. Ideally,
it should be the module ``dask.array`` or ``cupy`` itself.

These kinds of overrides are useful for both end-users and library
authors. End-users may have written, or wish to write, code that they later
speed up or move to a different implementation, say PyData/Sparse. They can do
this simply by setting a backend. Library authors may also wish to write code
that is portable across array implementations; for example, ``sklearn`` may
wish to implement a machine learning algorithm once and have it work with
several array implementations, even where it uses array creation functions.

This NEP takes a holistic approach: It assumes that there are parts of
the API that need to be overridable, and that these will grow over time. It
provides a general framework and a mechanism to avoid designing a new
protocol each time this is required. This was the goal of ``uarray``: to
allow for overrides in an API without needing the design of a new protocol.

This NEP proposes the following: that ``unumpy`` [8]_ becomes the
recommended override mechanism for the parts of the NumPy API not yet covered
by ``__array_function__`` or ``__array_ufunc__``, and that ``uarray`` is
vendored into a new namespace within NumPy to give users and downstream
dependencies access to these overrides. This vendoring mechanism is similar
to what SciPy decided to do for making ``scipy.fft`` overridable (see [10]_).

The motivation behind ``uarray`` is manifold: First, there have been several
attempts to allow dispatch of parts of the NumPy API, including (most
prominently) the ``__array_ufunc__`` protocol in NEP-13 [4]_ and the
``__array_function__`` protocol in NEP-18 [2]_, but this has shown the need
for further protocols to be developed, including a protocol for coercion (see
[5]_, [9]_). The reasons these overrides are needed have been extensively
discussed in the references, and this NEP will not attempt to go into the
details of why these are needed; but in short: It is necessary for library
authors to be able to coerce arbitrary objects into arrays of their own types,
such as CuPy needing to coerce to a CuPy array, for example, instead of
a NumPy array. In simpler words, one needs things like ``np.asarray(...)`` or
an alternative to "just work" and return duck-arrays.

Usage and Impact
----------------

This NEP allows for global and context-local overrides, as well as
automatic overrides à la ``__array_function__``.

Here are some use-cases this NEP would enable, besides the
first one stated in the motivation section:

The first is allowing alternate dtypes to return their
respective arrays.

.. code:: python

    # Returns an XND array
    x = unp.ones((5, 5), dtype=xnd_dtype)  # Or torch dtype

The second is allowing overrides for parts of the API.
This is to allow alternate and/or optimised implementations
for ``np.fft``, ``np.linalg``, BLAS, and ``np.random``.

.. code:: python

    import numpy as np
    import pyfftw  # Or mkl_fft

    # Makes pyfftw the default for FFT
    np.set_global_backend(pyfftw)

    # Uses pyfftw without monkeypatching
    np.fft.fft(numpy_array)

    with np.set_backend(pyfftw):  # Or mkl_fft, or numpy
        # Uses the backend you specified
        np.fft.fft(numpy_array)

This will allow an official way for overrides to work with NumPy without
monkeypatching or distributing a modified version of NumPy.

Here are a few other use-cases, implied but not already
stated:

.. code:: python

    data = da.from_zarr('myfile.zarr')
    # result should still be dask, all things being equal
    result = library_function(data)
    result.to_zarr('output.zarr')

This second example would work if ``magic_library`` was built
on top of ``unumpy``.

.. code:: python

    from dask import array as da
    from magic_library import pytorch_predict

    data = da.from_zarr('myfile.zarr')
    # normally here one would use e.g. data.map_overlap
    result = pytorch_predict(data)
    result.to_zarr('output.zarr')

There are some backends which may depend on other backends, for example xarray
depending on ``numpy.fft`` and transforming a time axis into a frequency axis,
or Dask/xarray holding an array other than a NumPy array inside it. This would
be handled in the following manner inside code::

    with ua.set_backend(cupy), ua.set_backend(dask.array):
        # Code that has distributed GPU arrays here
        ...

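To make the phrase "context-local" concrete, here is a minimal sketch of how a
``set_backend``-style context manager could be built on the standard library's
``contextvars`` module. It is not ``uarray``'s implementation:
``set_backend``, ``active_backends`` and the string backend names are
hypothetical stand-ins, and the ordering (most recently set backend first) is
an assumption of this sketch.

.. code:: python

    import contextlib
    import contextvars

    # Holds the active backends for the current context (thread or task).
    _backends = contextvars.ContextVar("backends", default=())

    @contextlib.contextmanager
    def set_backend(backend):
        """Activate ``backend`` for the duration of the ``with`` block only."""
        token = _backends.set((backend,) + _backends.get())
        try:
            yield
        finally:
            # Restore whatever was active before entering the block.
            _backends.reset(token)

    def active_backends():
        """Return the active backends, most recently set first."""
        return _backends.get()

    with set_backend("cupy"), set_backend("dask.array"):
        inside = active_backends()
    outside = active_backends()
    print(inside, outside)

Because the state lives in a ``ContextVar``, a backend set in one thread or
asynchronous task does not leak into another, which is the property a
context-local override relies on.
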
Backward compatibility
----------------------

There are no backward incompatible changes proposed in this NEP.

Detailed description
--------------------

Proposals
~~~~~~~~~

The only change this NEP proposes at its acceptance is to make ``unumpy`` the
officially recommended way to override NumPy, along with making some submodules
overridable by default via ``uarray``. ``unumpy`` will remain a separate
repository/package (which we propose to vendor to avoid a hard dependency, and
use the separate ``unumpy`` package only if it is installed, rather than depend
on it for the time being). In concrete terms, ``numpy.overridable`` becomes an
alias for ``unumpy`` if available, with a fallback to a vendored version if
not. ``uarray`` and ``unumpy`` will be developed primarily with the input
of duck-array authors and secondarily, custom dtype authors, via the usual
GitHub workflow. There are a few reasons for this:

* Faster iteration in the case of bugs or issues.
* Faster design changes, in the case of needed functionality.
* ``unumpy`` will work with older versions of NumPy as well.
* The user and library author opt-in to the override process,
  rather than breakages happening when it is least expected.
  In simple terms, bugs in ``unumpy`` mean that ``numpy`` remains
  unaffected.
* For ``numpy.fft``, ``numpy.linalg`` and ``numpy.random``, the functions in
  the main namespace will mirror those in the ``numpy.overridable`` namespace.
  The reason for this is that there may exist functions in these
  submodules that need backends, even for ``numpy.ndarray`` inputs.

Advantages of ``unumpy`` over other solutions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

``unumpy`` offers a number of advantages over the approach of defining a new
protocol for every problem encountered: Whenever there is something requiring
an override, ``unumpy`` will be able to offer a unified API with very minor
changes. For example:

* ``ufunc`` objects can be overridden via their ``__call__``, ``reduce`` and
  other methods.
* Other functions can be overridden in a similar fashion.
* ``np.asduckarray`` goes away, and becomes ``np.overridable.asarray`` with a
  backend set.
* The same holds for array creation functions such as ``np.zeros``,
  ``np.empty`` and so on.

This also holds for the future: Making something overridable would require only
minor changes to ``unumpy``.

Another promise ``unumpy`` holds is one of default implementations. Default
implementations can be provided for any multimethod, in terms of others. This
allows one to override a large part of the NumPy API by defining only a small
part of it. This is to ease the creation of new duck-arrays, by providing
default implementations of many functions that can be easily expressed in
terms of others, as well as a repository of utility functions that most
duck-array implementations would need. This would
allow us to avoid designing entire protocols, e.g., a protocol for stacking
and concatenating would be replaced by simply implementing ``stack`` and/or
``concatenate`` and then providing default implementations for everything else
in that class. The same applies for transposing, and many other functions for
which protocols haven't been proposed, such as ``isin`` in terms of ``in1d``,
``setdiff1d`` in terms of ``unique``, and so on.

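The idea of default implementations can be illustrated with a small,
self-contained sketch. It does not use ``uarray``; the ``Multimethod`` class,
the ``ListBackend`` backend and the list-of-lists "arrays" are hypothetical
stand-ins, chosen only to show how a backend that implements a single function
(``full``) could obtain ``zeros`` and ``ones`` through defaults written in
terms of it.

.. code:: python

    class Multimethod:
        """Toy multimethod: use the backend's implementation, else a default."""

        def __init__(self, name, default=None):
            self.name = name
            self.default = default

        def __call__(self, backend, *args, **kwargs):
            impl = getattr(backend, self.name, None)
            if impl is not None:
                return impl(*args, **kwargs)
            if self.default is not None:
                # The default is expressed via other multimethods and receives
                # the backend so that those calls are dispatched again.
                return self.default(backend, *args, **kwargs)
            raise NotImplementedError(self.name)

    full = Multimethod("full")
    zeros = Multimethod("zeros", default=lambda b, shape: full(b, shape, 0))
    ones = Multimethod("ones", default=lambda b, shape: full(b, shape, 1))

    class ListBackend:
        """Hypothetical backend that only implements ``full`` for 2-D shapes."""

        @staticmethod
        def full(shape, fill_value):
            rows, cols = shape
            return [[fill_value] * cols for _ in range(rows)]

    print(ones(ListBackend, (2, 3)))
    print(zeros(ListBackend, (1, 2)))

The backend author writes one function, and the two derived functions become
available without any additional protocol.
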
It also allows one to override functions in a manner which
``__array_function__`` simply cannot, such as overriding ``np.einsum`` with the
version from the ``opt_einsum`` package, or Intel MKL overriding FFT, BLAS
or ``ufunc`` objects. They would define a backend with the appropriate
multimethods, and the user would select them via a ``with`` statement, or
by registering them as a backend.

The last benefit is a clear way to coerce to a given backend (via the
``coerce`` keyword in ``ua.set_backend``), and a protocol
for coercing not only arrays, but also ``dtype`` objects and ``ufunc`` objects
with similar ones from other libraries. This is due to the existence of actual
third-party dtype packages, and their desire to blend into the NumPy ecosystem
(see [6]_). This is a separate issue from the C-level dtype redesign
proposed in [7]_; it is about allowing third-party dtype implementations to
work with NumPy, much like third-party array implementations. These can provide
features such as units, jagged arrays or other such features that
are outside the scope of NumPy.

Mixing NumPy and ``unumpy`` in the same file
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Normally, one would want to import only one of ``unumpy`` or ``numpy``, and
would import it as ``np`` for familiarity. However, there may be situations
where one wishes to mix NumPy and the overrides, and there are a few ways to do
this, depending on the user's style::

    from numpy import overridable as unp
    import numpy as np

or::

    import numpy as np

    # Use unumpy via np.overridable

Duck-array coercion
~~~~~~~~~~~~~~~~~~~

There are inherent problems with returning objects that are not NumPy arrays
from ``numpy.array`` or ``numpy.asarray``, particularly in the context of C/C++
or Cython code that may get an object with a different memory layout than the
one it expects. However, we believe this problem may apply not only to these
two functions but to all functions that return NumPy arrays. For this reason,
overrides are opt-in for the user, by using the submodule ``numpy.overridable``
rather than ``numpy``. NumPy will continue to work unaffected by anything in
``numpy.overridable``.

If the user wishes to obtain a NumPy array, there are two ways of doing it:

1. Use ``numpy.asarray`` (the non-overridable version).
2. Use ``numpy.overridable.asarray`` with the NumPy backend set and coercion
   enabled.

Aliases outside of the ``numpy.overridable`` namespace
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

All functionality in ``numpy.random``, ``numpy.linalg`` and ``numpy.fft``
will be aliased to their respective overridable versions inside
``numpy.overridable``. The reason for this is that there are alternative
implementations of RNGs (``mkl-random``), linear algebra routines (``eigen``,
``blis``) and FFT routines (``mkl-fft``, ``pyFFTW``) that need to operate on
``numpy.ndarray`` inputs, but still need the ability to switch behaviour.

This is different from monkeypatching in a few different ways:

* The caller-facing signature of the function is always the same,
  so there is at least the loose sense of an API contract. Monkeypatching
  does not provide this ability.
* There is the ability of locally switching the backend.
* It has been `suggested <http://numpy-discussion.10968.n7.nabble.com/NEP-31-Context-local-and-global-overrides-of-the-NumPy-API-tp47452p47472.html>`_
  that the reason that 1.17 hasn't landed in the Anaconda defaults channel is
  due to the incompatibility between monkeypatching and ``__array_function__``,
  as monkeypatching would bypass the protocol completely.
* Statements of the form ``from numpy import x; x`` and ``np.x`` would have
  different results depending on whether the import was made before or
  after monkeypatching happened.

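The import-order problem in the last point can be reproduced without NumPy.
The sketch below builds a throwaway module object (``fakelib`` and its ``fft``
attribute are hypothetical names, not a real package) and shows that a name
bound before the patch keeps pointing at the original function, while
attribute access through the module sees the patched one.

.. code:: python

    import types

    # A stand-in for a library module; not a real package.
    fakelib = types.ModuleType("fakelib")
    fakelib.fft = lambda x: "original"

    # Equivalent to ``from fakelib import fft`` executed before patching.
    fft_imported_early = fakelib.fft

    # A third party monkeypatches the module attribute.
    fakelib.fft = lambda x: "patched"

    early_result = fft_imported_early([1, 2, 3])  # bound before the patch
    late_result = fakelib.fft([1, 2, 3])          # looked up after the patch
    print(early_result, late_result)

With a backend mechanism both spellings refer to the same multimethod object,
so the choice of implementation is made at call time rather than at import
time.
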
All this isn't possible at all with ``__array_function__`` or
``__array_ufunc__``.

It has been formally recognised (at least in part) that a backend system is
needed for this, in the `NumPy roadmap <https://numpy.org/neps/roadmap.html#other-functionality>`_.

For ``numpy.random``, it's still necessary to make the C-API fit the one
proposed in `NEP-19 <https://numpy.org/neps/nep-0019-rng-policy.html>`_.
This is impossible for ``mkl-random``, because then it would need to be
rewritten to fit that framework. The guarantees on stream
compatibility will be the same as before, but if a backend that affects
``numpy.random`` is set, we make no guarantees about stream compatibility, and
it is up to the backend author to provide their own guarantees.

Providing a way for implicit dispatch
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

It has been suggested that it should be possible to dispatch methods which do
not take a dispatchable, inferring the backend from another dispatchable.

As a concrete example, consider the following:

.. code:: python

    with unumpy.determine_backend(array_like, np.ndarray):
        unumpy.arange(len(array_like))

While this does not exist yet in ``uarray``, it is trivial to add it. The need for
this kind of code exists because one might want to have an alternative for the
proposed ``*_like`` functions, or the ``like=`` keyword argument. The need for these
exists because there are functions in the NumPy API that do not take a dispatchable
argument, but there is still the need to select a backend based on a different
dispatchable.

The need for an opt-in module
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

An opt-in module is needed for a few reasons:

* There are parts of the API (like ``numpy.asarray``) that simply cannot be
  overridden due to incompatibility concerns with C/Cython extensions; however,
  one may want to coerce to a duck-array using ``asarray`` with a backend set.
* There are possible issues around an implicit option and monkeypatching, such
  as those mentioned above.

NEP 18 notes that this may require maintenance of two separate APIs. However,
this burden may be lessened by, for example, parametrizing all tests over
``numpy.overridable`` separately via a fixture. This also has the side-effect
of thoroughly testing it, unlike ``__array_function__``. We also feel that it
provides an opportunity to separate the NumPy API contract properly from the
implementation.

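The test-parametrization idea can be sketched with the standard library alone.
Since ``numpy.overridable`` is only proposed here, the sketch uses ``math`` and
``cmath`` as stand-ins for two namespaces that are expected to honour the same
API contract; ``check_sqrt_contract`` and ``run_over_namespaces`` are
hypothetical helper names.

.. code:: python

    import cmath
    import math

    # Stand-ins for ``numpy`` and ``numpy.overridable``.
    NAMESPACES = [math, cmath]

    def check_sqrt_contract(xp):
        """A single test body, written once against the shared API."""
        assert xp.sqrt(4) == 2
        assert xp.isclose(xp.sqrt(2) ** 2, 2)

    def run_over_namespaces(check, namespaces=NAMESPACES):
        """Run ``check`` against every namespace; a failure raises."""
        return {ns.__name__: check(ns) is None for ns in namespaces}

    results = run_over_namespaces(check_sqrt_contract)
    print(results)

With pytest, the list of namespaces would typically become the ``params`` of a
fixture, so that every test using that fixture runs once per namespace.
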
Benefits to end-users and mixing backends
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Mixing backends is easy in ``uarray``; one only has to do:

.. code:: python

    # Explicitly say which backends you want to mix
    ua.register_backend(backend1)
    ua.register_backend(backend2)
    ua.register_backend(backend3)

    # Freely use code that mixes backends here.

The benefits to end-users extend beyond just writing new code. Old code
(usually in the form of scripts) can be easily ported to different backends
by a simple import switch and a line adding the preferred backend. This way,
users may find it easier to port existing code to GPU or distributed computing.

Related Work
------------

Other override mechanisms
~~~~~~~~~~~~~~~~~~~~~~~~~

* NEP-18, the ``__array_function__`` protocol. [2]_
* NEP-13, the ``__array_ufunc__`` protocol. [4]_
* NEP-30, the ``__duck_array__`` protocol. [9]_

Existing NumPy-like array implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* CuPy: https://cupy.chainer.org/
* PyData/Sparse: https://sparse.pydata.org/
* Xnd: https://xnd.readthedocs.io/
* Astropy's Quantity: https://docs.astropy.org/en/stable/units/

Existing and potential consumers of alternative arrays
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Dask: https://dask.org/
* scikit-learn: https://scikit-learn.org/
* xarray: https://xarray.pydata.org/
* TensorLy: http://tensorly.org/

Existing alternate dtype implementations
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``ndtypes``: https://ndtypes.readthedocs.io/en/latest/
* Datashape: https://datashape.readthedocs.io
* Plum: https://plum-py.readthedocs.io/

Alternate implementations of parts of the NumPy API
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* ``mkl_random``: https://github.com/IntelPython/mkl_random
* ``mkl_fft``: https://github.com/IntelPython/mkl_fft
* ``bottleneck``: https://github.com/pydata/bottleneck
* ``opt_einsum``: https://github.com/dgasmith/opt_einsum

Implementation
--------------

The implementation of this NEP will require the following steps:

* Implementation of ``uarray`` multimethods corresponding to the
  NumPy API in the ``unumpy`` repository, including classes for overriding
  ``dtype``, ``ufunc`` and ``array`` objects. Such multimethods are usually
  very easy to create.
* Moving backends from ``unumpy`` into the respective array libraries.

Maintenance can be eased by testing over ``{numpy, unumpy}`` via parameterized
tests. If a new argument is added to a method, the corresponding argument
extractor and replacer will need to be updated within ``unumpy``.

A lot of argument extractors can be re-used from the existing implementation
of the ``__array_function__`` protocol, and the replacers can usually be
re-used across many methods.

For the parts of the namespace which are going to be overridable by default,
the main method will need to be renamed and hidden behind a ``uarray`` multimethod.

Default implementations are usually seen in the documentation using the words
"equivalent to", and thus, are easily available.

``uarray`` Primer
~~~~~~~~~~~~~~~~~

**Note:** *This section will not attempt to go into too much detail about
uarray, that is the purpose of the uarray documentation.* [1]_
*However, the NumPy community will have input into the design of
uarray, via the issue tracker.*

``unumpy`` is the interface that defines a set of overridable functions
(multimethods) compatible with the NumPy API. To do this, it uses the
``uarray`` library. ``uarray`` is a general purpose tool for creating
multimethods that dispatch to one of multiple different possible backend
implementations. In this sense, it is similar to the ``__array_function__``
protocol but with the key difference that the backend is explicitly installed
by the end-user and not coupled into the array type.

Decoupling the backend from the array type gives much more flexibility to
end-users and backend authors. For example, it is possible to:

* override functions not taking arrays as arguments
* create backends that live outside the source of the array type
* install multiple backends for the same array type

This decoupling also means that ``uarray`` is not constrained to dispatching
over array-like types. The backend is free to inspect the entire set of
function arguments to determine if it can implement the function, for example
to dispatch on the ``dtype`` parameter.

Defining backends
^^^^^^^^^^^^^^^^^

``uarray`` consists of two main protocols: ``__ua_convert__`` and
``__ua_function__``, called in that order, along with ``__ua_domain__``.
``__ua_convert__`` is for conversion and coercion. It has the signature
``(dispatchables, coerce)``, where ``dispatchables`` is an iterable of
``ua.Dispatchable`` objects and ``coerce`` is a boolean indicating whether or
not to force the conversion. ``ua.Dispatchable`` is a simple class consisting
of three simple values: ``type``, ``value``, and ``coercible``.
``__ua_convert__`` returns an iterable of the converted values, or
``NotImplemented`` in the case of failure.

``__ua_function__`` has the signature ``(func, args, kwargs)`` and defines
the actual implementation of the function. It receives the function and its
arguments. Returning ``NotImplemented`` will cause a move to the default
implementation of the function if one exists, and failing that, the next
backend.

Here is what will happen assuming a ``uarray`` multimethod is called:

1. We canonicalise the arguments so any arguments without a default
   are placed in ``*args`` and those with one are placed in ``**kwargs``.
2. We check the list of backends.

   a. If it is empty, we try the default implementation.

3. We check if the backend's ``__ua_convert__`` method exists. If it exists:

   a. We pass it the output of the dispatcher,
      which is an iterable of ``ua.Dispatchable`` objects.
   b. We feed this output, along with the arguments,
      to the argument replacer. ``NotImplemented`` means we move to 3
      with the next backend.
   c. We store the replaced arguments as the new arguments.

4. We feed the arguments into ``__ua_function__``, and return the output, and
   exit if it isn't ``NotImplemented``.
5. If the default implementation exists, we try it with the current backend.
6. On failure, we move to 3 with the next backend. If there are no more
   backends, we move to 7.
7. We raise a ``ua.BackendNotImplementedError``.

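The steps above can be sketched in plain Python. This is a simplified toy, not
``uarray``'s actual implementation: dispatchables are bare values rather than
``ua.Dispatchable`` objects, argument canonicalisation (step 1) is skipped,
and ``call_multimethod``, ``IntBackend``, ``FloatBackend`` and the
``default(backend, ...)`` calling convention are hypothetical.

.. code:: python

    class BackendNotImplementedError(NotImplementedError):
        """Toy counterpart of ``ua.BackendNotImplementedError``."""

    def call_multimethod(func, backends, extractor, replacer, args, kwargs,
                         default=None):
        """Walk the backends in order, loosely following steps 2 to 7."""
        if not backends and default is not None:
            return default(None, *args, **kwargs)  # step 2a: no backends
        for backend in backends:
            call_args, call_kwargs = args, kwargs
            convert = getattr(backend, "__ua_convert__", None)
            if convert is not None:  # step 3
                converted = convert(extractor(*args, **kwargs), False)
                if converted is NotImplemented:
                    continue  # step 3b: try the next backend
                call_args, call_kwargs = replacer(args, kwargs, tuple(converted))
            result = backend.__ua_function__(func, call_args, call_kwargs)  # step 4
            if result is NotImplemented and default is not None:
                result = default(backend, *call_args, **call_kwargs)  # step 5
            if result is not NotImplemented:
                return result
            # step 6: otherwise fall through to the next backend
        raise BackendNotImplementedError(func)  # step 7

    class IntBackend:
        """Hypothetical backend that only accepts ``int`` operands."""

        @staticmethod
        def __ua_convert__(dispatchables, coerce):
            values = list(dispatchables)
            return values if all(isinstance(v, int) for v in values) else NotImplemented

        @staticmethod
        def __ua_function__(func, args, kwargs):
            return ("int", args[0] + args[1]) if func == "add" else NotImplemented

    class FloatBackend:
        """Hypothetical backend that converts every operand to ``float``."""

        @staticmethod
        def __ua_convert__(dispatchables, coerce):
            return [float(v) for v in dispatchables]

        @staticmethod
        def __ua_function__(func, args, kwargs):
            return ("float", args[0] + args[1]) if func == "add" else NotImplemented

    def extractor(a, b):
        """Both operands of ``add`` are dispatchable."""
        return (a, b)

    def replacer(args, kwargs, converted):
        """Put the converted operands back in place of the originals."""
        return converted, kwargs

    backends = [IntBackend, FloatBackend]
    print(call_multimethod("add", backends, extractor, replacer, (1, 2), {}))
    print(call_multimethod("add", backends, extractor, replacer, (1, 2.5), {}))

In the second call ``IntBackend`` declines during conversion, so the call
falls through to ``FloatBackend``; a function that no backend implements ends
in ``BackendNotImplementedError``.
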
Defining overridable multimethods
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

To define an overridable function (a multimethod), one needs a few things:

1. A dispatcher that returns an iterable of ``ua.Dispatchable`` objects.
2. A reverse dispatcher that replaces dispatchable values with the supplied
   ones.
3. A domain.
4. Optionally, a default implementation, which can be provided in terms of
   other multimethods.

As an example, consider the following::

    import numpy as np
    import uarray as ua

    def full_argreplacer(args, kwargs, dispatchables):
        # Rebuild the call with the converted dtype in place of the original.
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )

        return full(*args, **kwargs)

    @ua.create_multimethod(full_argreplacer, domain="numpy")
    def full(shape, fill_value, dtype=None, order='C'):
        # The dispatcher: only ``dtype`` is dispatchable here.
        return (ua.Dispatchable(dtype, np.dtype),)

A large set of examples can be found in the ``unumpy`` repository, [8]_.
This simple act of overriding callables allows us to override:

* Methods
* Properties, via ``fget`` and ``fset``
* Entire objects, via ``__get__``.

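The argument replacer from the ``full`` example in this section is plain
Python and can be exercised on its own, which makes its role easier to see. In
the sketch below the string ``"converted-dtype"`` is a hypothetical placeholder
for whatever a backend's ``__ua_convert__`` would return.

.. code:: python

    def full_argreplacer(args, kwargs, dispatchables):
        def full(shape, fill_value, dtype=None, order='C'):
            return (shape, fill_value), dict(
                dtype=dispatchables[0],
                order=order
            )

        return full(*args, **kwargs)

    # ``dtype`` passed by keyword: it is swapped for the converted value.
    by_keyword = full_argreplacer(
        ((2, 3), 0), {"dtype": "float64"}, ("converted-dtype",)
    )

    # ``dtype`` and ``order`` passed positionally: the result is canonicalised,
    # with defaulted parameters moved into the keyword dictionary.
    by_position = full_argreplacer(
        ((2, 3), 0, "float64", "F"), {}, ("converted-dtype",)
    )
    print(by_keyword)
    print(by_position)

Whichever way the caller spelled the arguments, the backend receives the same
canonical form with the dispatchable already replaced.
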
Examples for NumPy
^^^^^^^^^^^^^^^^^^

A library that implements a NumPy-like API will use it in the following
manner (as an example)::

    import numpy.overridable as unp
    _ua_implementations = {}

    __ua_domain__ = "numpy"

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func

        return inner

    @implements(unp.asarray)
    def asarray(a, dtype=None, order=None):
        # Code here
        # Either this method or __ua_convert__ must
        # return NotImplemented for unsupported types,
        # Or they shouldn't be marked as dispatchable.
        ...

    # Provides a default implementation for ones and zeros.
    @implements(unp.full)
    def full(shape, fill_value, dtype=None, order='C'):
        # Code here
        ...

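The registry pattern used in this example does not depend on NumPy or
``uarray`` and can be tried in isolation. In the sketch below, the functions
``asarray`` and ``zeros`` are hypothetical stand-ins for the multimethods that
``numpy.overridable`` would provide, and tuples play the role of the backend's
array type.

.. code:: python

    _ua_implementations = {}

    def __ua_function__(func, args, kwargs):
        fn = _ua_implementations.get(func, None)
        return fn(*args, **kwargs) if fn is not None else NotImplemented

    def implements(ua_func):
        def inner(func):
            _ua_implementations[ua_func] = func
            return func

        return inner

    # Stand-ins for multimethods; they only serve as registry keys here.
    def asarray(a):
        raise NotImplementedError

    def zeros(shape):
        raise NotImplementedError

    @implements(asarray)
    def tuple_asarray(a):
        # This toy backend represents arrays as tuples.
        return tuple(a)

    found = __ua_function__(asarray, ([1, 2, 3],), {})
    missing = __ua_function__(zeros, ((2,),), {})
    print(found, missing)

A function without a registered implementation yields ``NotImplemented``,
which signals the dispatch machinery to try a default implementation or the
next backend.
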
Alternatives
------------

The current alternative to this problem is a combination of NEP-18 [2]_,
NEP-13 [4]_ and NEP-30 [9]_ plus adding more protocols (not yet specified)
in addition to it. Even then, some parts of the NumPy API will remain
non-overridable, so it's a partial alternative.

The main alternative to vendoring ``unumpy`` is to simply move it into NumPy
completely and not distribute it as a separate package. This would also achieve
the proposed goals; however, we prefer to keep it a separate package for now,
for reasons already stated above.

The third alternative is to move ``unumpy`` into the NumPy organisation and
develop it as a NumPy project. This will also achieve the said goals, and is
also a possibility that can be considered by this NEP. However, the act of
doing an extra ``pip install`` or ``conda install`` may discourage some users
from adopting this method.

An alternative to requiring opt-in is to *not* override ``np.asarray``
and ``np.array`` while making the rest of the NumPy API surface overridable,
and to instead provide ``np.duckarray`` and ``np.asduckarray``
as duck-array friendly alternatives that use the respective overrides. However,
this has the downside of adding a minor overhead to NumPy calls.

Discussion
----------

* ``uarray`` blogpost: https://labs.quansight.org/blog/2019/07/uarray-update-api-changes-overhead-and-comparison-to-__array_function__/
* The discussion section of NEP-18: https://numpy.org/neps/nep-0018-array-function-protocol.html#discussion
* NEP-22: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html
* Dask issue #4462: https://github.com/dask/dask/issues/4462
* PR #13046: https://github.com/numpy/numpy/pull/13046
* Dask issue #4883: https://github.com/dask/dask/issues/4883
* Issue #13831: https://github.com/numpy/numpy/issues/13831
* Discussion PR 1: https://github.com/hameerabbasi/numpy/pull/3
* Discussion PR 2: https://github.com/hameerabbasi/numpy/pull/4
* Discussion PR 3: https://github.com/numpy/numpy/pull/14389

References and Footnotes
------------------------

.. [1] uarray, A general dispatch mechanism for Python: https://uarray.readthedocs.io

.. [2] NEP 18 — A dispatch mechanism for NumPy’s high level array functions: https://numpy.org/neps/nep-0018-array-function-protocol.html

.. [3] NEP 22 — Duck typing for NumPy arrays – high level overview: https://numpy.org/neps/nep-0022-ndarray-duck-typing-overview.html

.. [4] NEP 13 — A Mechanism for Overriding Ufuncs: https://numpy.org/neps/nep-0013-ufunc-overrides.html

.. [5] Reply to Adding to the non-dispatched implementation of NumPy methods: http://numpy-discussion.10968.n7.nabble.com/Adding-to-the-non-dispatched-implementation-of-NumPy-methods-tp46816p46874.html

.. [6] Custom Dtype/Units discussion: http://numpy-discussion.10968.n7.nabble.com/Custom-Dtype-Units-discussion-td43262.html

.. [7] The epic dtype cleanup plan: https://github.com/numpy/numpy/issues/2899

.. [8] unumpy: NumPy, but implementation-independent: https://unumpy.readthedocs.io

.. [9] NEP 30 — Duck Typing for NumPy Arrays - Implementation: https://www.numpy.org/neps/nep-0030-duck-array-protocol.html

.. [10] ``scipy.fft`` backend control: http://scipy.github.io/devdocs/fft.html#backend-control

Copyright
---------

This document has been placed in the public domain.