tmp
/
pip-install-ghxuqwgs
/numpy_78e94bf2b6094bf9a1f3d92042f9bf46
/build
/lib.linux-x86_64-cpython-310
/numpy
/doc
/subclassing.py
""" | |
============================= | |
Subclassing ndarray in python | |
============================= | |
Credits | |
------- | |
This page is based with thanks on the wiki page on subclassing by Pierre | |
Gerard-Marchant - http://www.scipy.org/Subclasses. | |
Introduction | |
------------ | |
Subclassing ndarray is relatively simple, but it has some complications | |
compared to other Python objects. On this page we explain the machinery | |
that allows you to subclass ndarray, and the implications for | |
implementing a subclass. | |
ndarrays and object creation | |
============================ | |
Subclassing ndarray is complicated by the fact that new instances of | |
ndarray classes can come about in three different ways. These are: | |
#. Explicit constructor call - as in ``MySubClass(params)``. This is | |
the usual route to Python instance creation. | |
#. View casting - casting an existing ndarray as a given subclass | |
#. New from template - creating a new instance from a template | |
instance. Examples include returning slices from a subclassed array, | |
creating return types from ufuncs, and copying arrays. See | |
:ref:`new-from-template` for more details | |
The last two are characteristics of ndarrays - in order to support | |
things like array slicing. The complications of subclassing ndarray are | |
due to the mechanisms numpy has to support these latter two routes of | |
instance creation. | |
.. _view-casting: | |
View casting | |
------------ | |
*View casting* is the standard ndarray mechanism by which you take an | |
ndarray of any subclass, and return a view of the array as another | |
(specified) subclass: | |
import numpy as np | |
# create a completely useless ndarray subclass | |
class C(np.ndarray): pass | |
# create a standard ndarray | |
arr = np.zeros((3,)) | |
# take a view of it, as our useless subclass | |
c_arr = arr.view(C) | |
type(c_arr) | |
<class 'C'> | |
.. _new-from-template: | |
Creating new from template | |
-------------------------- | |
New instances of an ndarray subclass can also come about by a very | |
similar mechanism to :ref:`view-casting`, when numpy finds it needs to | |
create a new instance from a template instance. The most obvious place | |
this has to happen is when you are taking slices of subclassed arrays. | |
For example: | |
v = c_arr[1:] | |
type(v) # the view is of type 'C' | |
<class 'C'> | |
v is c_arr # but it's a new instance | |
False | |
The slice is a *view* onto the original ``c_arr`` data. So, when we | |
take a view from the ndarray, we return a new ndarray, of the same | |
class, that points to the data in the original. | |
There are other points in the use of ndarrays where we need such views, | |
such as copying arrays (``c_arr.copy()``), creating ufunc output arrays | |
(see also :ref:`array-wrap`), and reducing methods (like | |
``c_arr.mean()``. | |
Relationship of view casting and new-from-template | |
-------------------------------------------------- | |
These paths both use the same machinery. We make the distinction here, | |
because they result in different input to your methods. Specifically, | |
:ref:`view-casting` means you have created a new instance of your array | |
type from any potential subclass of ndarray. :ref:`new-from-template` | |
means you have created a new instance of your class from a pre-existing | |
instance, allowing you - for example - to copy across attributes that | |
are particular to your subclass. | |
Implications for subclassing | |
---------------------------- | |
If we subclass ndarray, we need to deal not only with explicit | |
construction of our array type, but also :ref:`view-casting` or | |
:ref:`new-from-template`. Numpy has the machinery to do this, and this | |
machinery that makes subclassing slightly non-standard. | |
There are two aspects to the machinery that ndarray uses to support | |
views and new-from-template in subclasses. | |
The first is the use of the ``ndarray.__new__`` method for the main work | |
of object initialization, rather then the more usual ``__init__`` | |
method. The second is the use of the ``__array_finalize__`` method to | |
allow subclasses to clean up after the creation of views and new | |
instances from templates. | |
A brief Python primer on ``__new__`` and ``__init__`` | |
===================================================== | |
``__new__`` is a standard Python method, and, if present, is called | |
before ``__init__`` when we create a class instance. See the `python | |
__new__ documentation | |
<http://docs.python.org/reference/datamodel.html#object.__new__>`_ for more detail. | |
For example, consider the following Python code: | |
.. testcode:: | |
class C(object): | |
def __new__(cls, *args): | |
print 'Cls in __new__:', cls | |
print 'Args in __new__:', args | |
return object.__new__(cls, *args) | |
def __init__(self, *args): | |
print 'type(self) in __init__:', type(self) | |
print 'Args in __init__:', args | |
meaning that we get: | |
c = C('hello') | |
Cls in __new__: <class 'C'> | |
Args in __new__: ('hello',) | |
type(self) in __init__: <class 'C'> | |
Args in __init__: ('hello',) | |
When we call ``C('hello')``, the ``__new__`` method gets its own class | |
as first argument, and the passed argument, which is the string | |
``'hello'``. After python calls ``__new__``, it usually (see below) | |
calls our ``__init__`` method, with the output of ``__new__`` as the | |
first argument (now a class instance), and the passed arguments | |
following. | |
As you can see, the object can be initialized in the ``__new__`` | |
method or the ``__init__`` method, or both, and in fact ndarray does | |
not have an ``__init__`` method, because all the initialization is | |
done in the ``__new__`` method. | |
Why use ``__new__`` rather than just the usual ``__init__``? Because | |
in some cases, as for ndarray, we want to be able to return an object | |
of some other class. Consider the following: | |
.. testcode:: | |
class D(C): | |
def __new__(cls, *args): | |
print 'D cls is:', cls | |
print 'D args in __new__:', args | |
return C.__new__(C, *args) | |
def __init__(self, *args): | |
# we never get here | |
print 'In D __init__' | |
meaning that: | |
obj = D('hello') | |
D cls is: <class 'D'> | |
D args in __new__: ('hello',) | |
Cls in __new__: <class 'C'> | |
Args in __new__: ('hello',) | |
type(obj) | |
<class 'C'> | |
The definition of ``C`` is the same as before, but for ``D``, the | |
``__new__`` method returns an instance of class ``C`` rather than | |
``D``. Note that the ``__init__`` method of ``D`` does not get | |
called. In general, when the ``__new__`` method returns an object of | |
class other than the class in which it is defined, the ``__init__`` | |
method of that class is not called. | |
This is how subclasses of the ndarray class are able to return views | |
that preserve the class type. When taking a view, the standard | |
ndarray machinery creates the new ndarray object with something | |
like:: | |
obj = ndarray.__new__(subtype, shape, ... | |
where ``subdtype`` is the subclass. Thus the returned view is of the | |
same class as the subclass, rather than being of class ``ndarray``. | |
That solves the problem of returning views of the same type, but now | |
we have a new problem. The machinery of ndarray can set the class | |
this way, in its standard methods for taking views, but the ndarray | |
``__new__`` method knows nothing of what we have done in our own | |
``__new__`` method in order to set attributes, and so on. (Aside - | |
why not call ``obj = subdtype.__new__(...`` then? Because we may not | |
have a ``__new__`` method with the same call signature). | |
The role of ``__array_finalize__`` | |
================================== | |
``__array_finalize__`` is the mechanism that numpy provides to allow | |
subclasses to handle the various ways that new instances get created. | |
Remember that subclass instances can come about in these three ways: | |
#. explicit constructor call (``obj = MySubClass(params)``). This will | |
call the usual sequence of ``MySubClass.__new__`` then (if it exists) | |
``MySubClass.__init__``. | |
#. :ref:`view-casting` | |
#. :ref:`new-from-template` | |
Our ``MySubClass.__new__`` method only gets called in the case of the | |
explicit constructor call, so we can't rely on ``MySubClass.__new__`` or | |
``MySubClass.__init__`` to deal with the view casting and | |
new-from-template. It turns out that ``MySubClass.__array_finalize__`` | |
*does* get called for all three methods of object creation, so this is | |
where our object creation housekeeping usually goes. | |
* For the explicit constructor call, our subclass will need to create a | |
new ndarray instance of its own class. In practice this means that | |
we, the authors of the code, will need to make a call to | |
``ndarray.__new__(MySubClass,...)``, or do view casting of an existing | |
array (see below) | |
* For view casting and new-from-template, the equivalent of | |
``ndarray.__new__(MySubClass,...`` is called, at the C level. | |
The arguments that ``__array_finalize__`` recieves differ for the three | |
methods of instance creation above. | |
The following code allows us to look at the call sequences and arguments: | |
.. testcode:: | |
import numpy as np | |
class C(np.ndarray): | |
def __new__(cls, *args, **kwargs): | |
print 'In __new__ with class %s' % cls | |
return np.ndarray.__new__(cls, *args, **kwargs) | |
def __init__(self, *args, **kwargs): | |
# in practice you probably will not need or want an __init__ | |
# method for your subclass | |
print 'In __init__ with class %s' % self.__class__ | |
def __array_finalize__(self, obj): | |
print 'In array_finalize:' | |
print ' self type is %s' % type(self) | |
print ' obj type is %s' % type(obj) | |
Now: | |
# Explicit constructor | |
c = C((10,)) | |
In __new__ with class <class 'C'> | |
In array_finalize: | |
self type is <class 'C'> | |
obj type is <type 'NoneType'> | |
In __init__ with class <class 'C'> | |
# View casting | |
a = np.arange(10) | |
cast_a = a.view(C) | |
In array_finalize: | |
self type is <class 'C'> | |
obj type is <type 'numpy.ndarray'> | |
# Slicing (example of new-from-template) | |
cv = c[:1] | |
In array_finalize: | |
self type is <class 'C'> | |
obj type is <class 'C'> | |
The signature of ``__array_finalize__`` is:: | |
def __array_finalize__(self, obj): | |
``ndarray.__new__`` passes ``__array_finalize__`` the new object, of our | |
own class (``self``) as well as the object from which the view has been | |
taken (``obj``). As you can see from the output above, the ``self`` is | |
always a newly created instance of our subclass, and the type of ``obj`` | |
differs for the three instance creation methods: | |
* When called from the explicit constructor, ``obj`` is ``None`` | |
* When called from view casting, ``obj`` can be an instance of any | |
subclass of ndarray, including our own. | |
* When called in new-from-template, ``obj`` is another instance of our | |
own subclass, that we might use to update the new ``self`` instance. | |
Because ``__array_finalize__`` is the only method that always sees new | |
instances being created, it is the sensible place to fill in instance | |
defaults for new object attributes, among other tasks. | |
This may be clearer with an example. | |
Simple example - adding an extra attribute to ndarray | |
----------------------------------------------------- | |
.. testcode:: | |
import numpy as np | |
class InfoArray(np.ndarray): | |
def __new__(subtype, shape, dtype=float, buffer=None, offset=0, | |
strides=None, order=None, info=None): | |
# Create the ndarray instance of our type, given the usual | |
# ndarray input arguments. This will call the standard | |
# ndarray constructor, but return an object of our type. | |
# It also triggers a call to InfoArray.__array_finalize__ | |
obj = np.ndarray.__new__(subtype, shape, dtype, buffer, offset, strides, | |
order) | |
# set the new 'info' attribute to the value passed | |
obj.info = info | |
# Finally, we must return the newly created object: | |
return obj | |
def __array_finalize__(self, obj): | |
# ``self`` is a new object resulting from | |
# ndarray.__new__(InfoArray, ...), therefore it only has | |
# attributes that the ndarray.__new__ constructor gave it - | |
# i.e. those of a standard ndarray. | |
# | |
# We could have got to the ndarray.__new__ call in 3 ways: | |
# From an explicit constructor - e.g. InfoArray(): | |
# obj is None | |
# (we're in the middle of the InfoArray.__new__ | |
# constructor, and self.info will be set when we return to | |
# InfoArray.__new__) | |
if obj is None: return | |
# From view casting - e.g arr.view(InfoArray): | |
# obj is arr | |
# (type(obj) can be InfoArray) | |
# From new-from-template - e.g infoarr[:3] | |
# type(obj) is InfoArray | |
# | |
# Note that it is here, rather than in the __new__ method, | |
# that we set the default value for 'info', because this | |
# method sees all creation of default objects - with the | |
# InfoArray.__new__ constructor, but also with | |
# arr.view(InfoArray). | |
self.info = getattr(obj, 'info', None) | |
# We do not need to return anything | |
Using the object looks like this: | |
>>> obj = InfoArray(shape=(3,)) # explicit constructor | |
>>> type(obj) | |
<class 'InfoArray'> | |
>>> obj.info is None | |
True | |
>>> obj = InfoArray(shape=(3,), info='information') | |
>>> obj.info | |
'information' | |
>>> v = obj[1:] # new-from-template - here - slicing | |
>>> type(v) | |
<class 'InfoArray'> | |
>>> v.info | |
'information' | |
>>> arr = np.arange(10) | |
>>> cast_arr = arr.view(InfoArray) # view casting | |
>>> type(cast_arr) | |
<class 'InfoArray'> | |
>>> cast_arr.info is None | |
True | |
This class isn't very useful, because it has the same constructor as the | |
bare ndarray object, including passing in buffers and shapes and so on. | |
We would probably prefer the constructor to be able to take an already | |
formed ndarray from the usual numpy calls to ``np.array`` and return an | |
object. | |
Slightly more realistic example - attribute added to existing array | |
------------------------------------------------------------------- | |
Here is a class that takes a standard ndarray that already exists, casts | |
as our type, and adds an extra attribute. | |
.. testcode:: | |
import numpy as np | |
class RealisticInfoArray(np.ndarray): | |
def __new__(cls, input_array, info=None): | |
# Input array is an already formed ndarray instance | |
# We first cast to be our class type | |
obj = np.asarray(input_array).view(cls) | |
# add the new attribute to the created instance | |
obj.info = info | |
# Finally, we must return the newly created object: | |
return obj | |
def __array_finalize__(self, obj): | |
# see InfoArray.__array_finalize__ for comments | |
if obj is None: return | |
self.info = getattr(obj, 'info', None) | |
So: | |
>>> arr = np.arange(5) | |
>>> obj = RealisticInfoArray(arr, info='information') | |
>>> type(obj) | |
<class 'RealisticInfoArray'> | |
>>> obj.info | |
'information' | |
>>> v = obj[1:] | |
>>> type(v) | |
<class 'RealisticInfoArray'> | |
>>> v.info | |
'information' | |
.. _array-wrap: | |
``__array_wrap__`` for ufuncs | |
------------------------------------------------------- | |
``__array_wrap__`` gets called at the end of numpy ufuncs and other numpy | |
functions, to allow a subclass to set the type of the return value | |
and update attributes and metadata. Let's show how this works with an example. | |
First we make the same subclass as above, but with a different name and | |
some print statements: | |
.. testcode:: | |
import numpy as np | |
class MySubClass(np.ndarray): | |
def __new__(cls, input_array, info=None): | |
obj = np.asarray(input_array).view(cls) | |
obj.info = info | |
return obj | |
def __array_finalize__(self, obj): | |
print 'In __array_finalize__:' | |
print ' self is %s' % repr(self) | |
print ' obj is %s' % repr(obj) | |
if obj is None: return | |
self.info = getattr(obj, 'info', None) | |
def __array_wrap__(self, out_arr, context=None): | |
print 'In __array_wrap__:' | |
print ' self is %s' % repr(self) | |
print ' arr is %s' % repr(out_arr) | |
# then just call the parent | |
return np.ndarray.__array_wrap__(self, out_arr, context) | |
We run a ufunc on an instance of our new array: | |
obj = MySubClass(np.arange(5), info='spam') | |
In __array_finalize__: | |
self is MySubClass([0, 1, 2, 3, 4]) | |
obj is array([0, 1, 2, 3, 4]) | |
arr2 = np.arange(5)+1 | |
ret = np.add(arr2, obj) | |
In __array_wrap__: | |
self is MySubClass([0, 1, 2, 3, 4]) | |
arr is array([1, 3, 5, 7, 9]) | |
In __array_finalize__: | |
self is MySubClass([1, 3, 5, 7, 9]) | |
obj is MySubClass([0, 1, 2, 3, 4]) | |
ret | |
MySubClass([1, 3, 5, 7, 9]) | |
ret.info | |
'spam' | |
Note that the ufunc (``np.add``) has called the ``__array_wrap__`` method of the | |
input with the highest ``__array_priority__`` value, in this case | |
``MySubClass.__array_wrap__``, with arguments ``self`` as ``obj``, and | |
``out_arr`` as the (ndarray) result of the addition. In turn, the | |
default ``__array_wrap__`` (``ndarray.__array_wrap__``) has cast the | |
result to class ``MySubClass``, and called ``__array_finalize__`` - | |
hence the copying of the ``info`` attribute. This has all happened at the C level. | |
But, we could do anything we wanted: | |
.. testcode:: | |
class SillySubClass(np.ndarray): | |
def __array_wrap__(self, arr, context=None): | |
return 'I lost your data' | |
arr1 = np.arange(5) | |
obj = arr1.view(SillySubClass) | |
arr2 = np.arange(5) | |
ret = np.multiply(obj, arr2) | |
ret | |
'I lost your data' | |
So, by defining a specific ``__array_wrap__`` method for our subclass, | |
we can tweak the output from ufuncs. The ``__array_wrap__`` method | |
requires ``self``, then an argument - which is the result of the ufunc - | |
and an optional parameter *context*. This parameter is returned by some | |
ufuncs as a 3-element tuple: (name of the ufunc, argument of the ufunc, | |
domain of the ufunc). ``__array_wrap__`` should return an instance of | |
its containing class. See the masked array subclass for an | |
implementation. | |
In addition to ``__array_wrap__``, which is called on the way out of the | |
ufunc, there is also an ``__array_prepare__`` method which is called on | |
the way into the ufunc, after the output arrays are created but before any | |
computation has been performed. The default implementation does nothing | |
but pass through the array. ``__array_prepare__`` should not attempt to | |
access the array data or resize the array, it is intended for setting the | |
output array type, updating attributes and metadata, and performing any | |
checks based on the input that may be desired before computation begins. | |
Like ``__array_wrap__``, ``__array_prepare__`` must return an ndarray or | |
subclass thereof or raise an error. | |
Extra gotchas - custom ``__del__`` methods and ndarray.base | |
----------------------------------------------------------- | |
One of the problems that ndarray solves is keeping track of memory | |
ownership of ndarrays and their views. Consider the case where we have | |
created an ndarray, ``arr`` and have taken a slice with ``v = arr[1:]``. | |
The two objects are looking at the same memory. Numpy keeps track of | |
where the data came from for a particular array or view, with the | |
``base`` attribute: | |
# A normal ndarray, that owns its own data | |
arr = np.zeros((4,)) | |
# In this case, base is None | |
arr.base is None | |
True | |
# We take a view | |
v1 = arr[1:] | |
# base now points to the array that it derived from | |
v1.base is arr | |
True | |
# Take a view of a view | |
v2 = v1[1:] | |
# base points to the view it derived from | |
v2.base is v1 | |
True | |
In general, if the array owns its own memory, as for ``arr`` in this | |
case, then ``arr.base`` will be None - there are some exceptions to this | |
- see the numpy book for more details. | |
The ``base`` attribute is useful in being able to tell whether we have | |
a view or the original array. This in turn can be useful if we need | |
to know whether or not to do some specific cleanup when the subclassed | |
array is deleted. For example, we may only want to do the cleanup if | |
the original array is deleted, but not the views. For an example of | |
how this can work, have a look at the ``memmap`` class in | |
``numpy.core``. | |
""" | |
from __future__ import division, absolute_import, print_function | |