advanced/cast/eigen.rst

0001 Eigen
0002 #####
0003
0004 `Eigen <http://eigen.tuxfamily.org>`_ is C++ header-based library for dense and
0005 sparse linear algebra. Due to its popularity and widespread adoption, pybind11
0006 provides transparent conversion and limited mapping support between Eigen and
0007 Scientific Python linear algebra data types.
0008
0009 To enable the built-in Eigen support you must include the optional header file
0010 :file:`pybind11/eigen.h`.
0011
0012 Pass-by-value
0013 =============
0014
0015 When binding a function with ordinary Eigen dense object arguments (for
0016 example, ``Eigen::MatrixXd``), pybind11 will accept any input value that is
0017 already (or convertible to) a ``numpy.ndarray`` with dimensions compatible with
0018 the Eigen type, copy its values into a temporary Eigen variable of the
0019 appropriate type, then call the function with this temporary variable.
0020
0021 Sparse matrices are similarly copied to or from
0022 ``scipy.sparse.csr_matrix``/``scipy.sparse.csc_matrix`` objects.
0023
0024 Pass-by-reference
0025 =================
0026
0027 One major limitation of the above is that every data conversion implicitly
0028 involves a copy, which can be both expensive (for large matrices) and disallows
0029 binding functions that change their (Matrix) arguments.  Pybind11 allows you to
0030 work around this by using Eigen's ``Eigen::Ref<MatrixType>`` class much as you
0031 would when writing a function taking a generic type in Eigen itself (subject to
0032 some limitations discussed below).
0033
0034 When calling a bound function accepting a ``Eigen::Ref<const MatrixType>``
0035 type, pybind11 will attempt to avoid copying by using an ``Eigen::Map`` object
0036 that maps into the source ``numpy.ndarray`` data: this requires both that the
0037 data types are the same (e.g. ``dtype='float64'`` and ``MatrixType::Scalar`` is
0038 ``double``); and that the storage is layout compatible.  The latter limitation
0039 is discussed in detail in the section below, and requires careful
0040 consideration: by default, numpy matrices and Eigen matrices are *not* storage
0041 compatible.
0042
0043 If the numpy matrix cannot be used as is (either because its types differ, e.g.
0044 passing an array of integers to an Eigen parameter requiring doubles, or
0045 because the storage is incompatible), pybind11 makes a temporary copy and
0046 passes the copy instead.
0047
0048 When a bound function parameter is instead ``Eigen::Ref<MatrixType>`` (note the
0049 lack of ``const``), pybind11 will only allow the function to be called if it
0050 can be mapped *and* if the numpy array is writeable (that is
0051 ``a.flags.writeable`` is true).  Any access (including modification) made to
0052 the passed variable will be transparently carried out directly on the
0053 ``numpy.ndarray``.
0054
0055 This means you can write code such as the following and have it work as
0056 expected:
0057
0058 .. code-block:: cpp
0059
0060     void scale_by_2(Eigen::Ref<Eigen::VectorXd> v) {
0061         v *= 2;
0062     }
0063
0064 Note, however, that you will likely run into limitations due to numpy and
0065 Eigen's difference default storage order for data; see the below section on
0066 :ref:`storage_orders` for details on how to bind code that won't run into such
0067 limitations.
0068
0069 .. note::
0070
0071     Passing by reference is not supported for sparse types.
0072
0073 Returning values to Python
0074 ==========================
0075
0076 When returning an ordinary dense Eigen matrix type to numpy (e.g.
0077 ``Eigen::MatrixXd`` or ``Eigen::RowVectorXf``) pybind11 keeps the matrix and
0078 returns a numpy array that directly references the Eigen matrix: no copy of the
0079 data is performed.  The numpy array will have ``array.flags.owndata`` set to
0080 ``False`` to indicate that it does not own the data, and the lifetime of the
0081 stored Eigen matrix will be tied to the returned ``array``.
0082
0083 If you bind a function with a non-reference, ``const`` return type (e.g.
0084 ``const Eigen::MatrixXd``), the same thing happens except that pybind11 also
0085 sets the numpy array's ``writeable`` flag to false.
0086
0087 If you return an lvalue reference or pointer, the usual pybind11 rules apply,
0088 as dictated by the binding function's return value policy (see the
0089 documentation on :ref:`return_value_policies` for full details).  That means,
0090 without an explicit return value policy, lvalue references will be copied and
0091 pointers will be managed by pybind11.  In order to avoid copying, you should
0092 explicitly specify an appropriate return value policy, as in the following
0093 example:
0094
0095 .. code-block:: cpp
0096
0097     class MyClass {
0098         Eigen::MatrixXd big_mat = Eigen::MatrixXd::Zero(10000, 10000);
0099     public:
0100         Eigen::MatrixXd &getMatrix() { return big_mat; }
0101         const Eigen::MatrixXd &viewMatrix() { return big_mat; }
0102     };
0103
0104     // Later, in binding code:
0105     py::class_<MyClass>(m, "MyClass")
0106         .def(py::init<>())
0107         .def("copy_matrix", &MyClass::getMatrix) // Makes a copy!
0108         .def("get_matrix", &MyClass::getMatrix, py::return_value_policy::reference_internal)
0109         .def("view_matrix", &MyClass::viewMatrix, py::return_value_policy::reference_internal)
0110         ;
0111
0112 .. code-block:: python
0113
0114     a = MyClass()
0115     m = a.get_matrix()  # flags.writeable = True,  flags.owndata = False
0116     v = a.view_matrix()  # flags.writeable = False, flags.owndata = False
0117     c = a.copy_matrix()  # flags.writeable = True,  flags.owndata = True
0118     # m[5,6] and v[5,6] refer to the same element, c[5,6] does not.
0119
0120 Note in this example that ``py::return_value_policy::reference_internal`` is
0121 used to tie the life of the MyClass object to the life of the returned arrays.
0122
0123 You may also return an ``Eigen::Ref``, ``Eigen::Map`` or other map-like Eigen
0124 object (for example, the return value of ``matrix.block()`` and related
0125 methods) that map into a dense Eigen type.  When doing so, the default
0126 behaviour of pybind11 is to simply reference the returned data: you must take
0127 care to ensure that this data remains valid!  You may ask pybind11 to
0128 explicitly *copy* such a return value by using the
0129 ``py::return_value_policy::copy`` policy when binding the function.  You may
0130 also use ``py::return_value_policy::reference_internal`` or a
0131 ``py::keep_alive`` to ensure the data stays valid as long as the returned numpy
0132 array does.
0133
0134 When returning such a reference of map, pybind11 additionally respects the
0135 readonly-status of the returned value, marking the numpy array as non-writeable
0136 if the reference or map was itself read-only.
0137
0138 .. note::
0139
0140     Sparse types are always copied when returned.
0141
0142 .. _storage_orders:
0143
0144 Storage orders
0145 ==============
0146
0147 Passing arguments via ``Eigen::Ref`` has some limitations that you must be
0148 aware of in order to effectively pass matrices by reference.  First and
0149 foremost is that the default ``Eigen::Ref<MatrixType>`` class requires
0150 contiguous storage along columns (for column-major types, the default in Eigen)
0151 or rows if ``MatrixType`` is specifically an ``Eigen::RowMajor`` storage type.
0152 The former, Eigen's default, is incompatible with ``numpy``'s default row-major
0153 storage, and so you will not be able to pass numpy arrays to Eigen by reference
0154 without making one of two changes.
0155
0156 (Note that this does not apply to vectors (or column or row matrices): for such
0157 types the "row-major" and "column-major" distinction is meaningless).
0158
0159 The first approach is to change the use of ``Eigen::Ref<MatrixType>`` to the
0160 more general ``Eigen::Ref<MatrixType, 0, Eigen::Stride<Eigen::Dynamic,
0161 Eigen::Dynamic>>`` (or similar type with a fully dynamic stride type in the
0162 third template argument).  Since this is a rather cumbersome type, pybind11
0163 provides a ``py::EigenDRef<MatrixType>`` type alias for your convenience (along
0164 with EigenDMap for the equivalent Map, and EigenDStride for just the stride
0165 type).
0166
0167 This type allows Eigen to map into any arbitrary storage order.  This is not
0168 the default in Eigen for performance reasons: contiguous storage allows
0169 vectorization that cannot be done when storage is not known to be contiguous at
0170 compile time.  The default ``Eigen::Ref`` stride type allows non-contiguous
0171 storage along the outer dimension (that is, the rows of a column-major matrix
0172 or columns of a row-major matrix), but not along the inner dimension.
0173
0174 This type, however, has the added benefit of also being able to map numpy array
0175 slices.  For example, the following (contrived) example uses Eigen with a numpy
0176 slice to multiply by 2 all coefficients that are both on even rows (0, 2, 4,
0177 ...) and in columns 2, 5, or 8:
0178
0179 .. code-block:: cpp
0180
0181     m.def("scale", [](py::EigenDRef<Eigen::MatrixXd> m, double c) { m *= c; });
0182
0183 .. code-block:: python
0184
0185     # a = np.array(...)
0186     scale_by_2(myarray[0::2, 2:9:3])
0187
0188 The second approach to avoid copying is more intrusive: rearranging the
0189 underlying data types to not run into the non-contiguous storage problem in the
0190 first place.  In particular, that means using matrices with ``Eigen::RowMajor``
0191 storage, where appropriate, such as:
0192
0193 .. code-block:: cpp
0194
0195     using RowMatrixXd = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;
0196     // Use RowMatrixXd instead of MatrixXd
0197
0198 Now bound functions accepting ``Eigen::Ref<RowMatrixXd>`` arguments will be
0199 callable with numpy's (default) arrays without involving a copying.
0200
0201 You can, alternatively, change the storage order that numpy arrays use by
0202 adding the ``order='F'`` option when creating an array:
0203
0204 .. code-block:: python
0205
0206     myarray = np.array(source, order="F")
0207
0208 Such an object will be passable to a bound function accepting an
0209 ``Eigen::Ref<MatrixXd>`` (or similar column-major Eigen type).
0210
0211 One major caveat with this approach, however, is that it is not entirely as
0212 easy as simply flipping all Eigen or numpy usage from one to the other: some
0213 operations may alter the storage order of a numpy array.  For example, ``a2 =
0214 array.transpose()`` results in ``a2`` being a view of ``array`` that references
0215 the same data, but in the opposite storage order!
0216
0217 While this approach allows fully optimized vectorized calculations in Eigen, it
0218 cannot be used with array slices, unlike the first approach.
0219
0220 When *returning* a matrix to Python (either a regular matrix, a reference via
0221 ``Eigen::Ref<>``, or a map/block into a matrix), no special storage
0222 consideration is required: the created numpy array will have the required
0223 stride that allows numpy to properly interpret the array, whatever its storage
0224 order.
0225
0226 Failing rather than copying
0227 ===========================
0228
0229 The default behaviour when binding ``Eigen::Ref<const MatrixType>`` Eigen
0230 references is to copy matrix values when passed a numpy array that does not
0231 conform to the element type of ``MatrixType`` or does not have a compatible
0232 stride layout.  If you want to explicitly avoid copying in such a case, you
0233 should bind arguments using the ``py::arg().noconvert()`` annotation (as
0234 described in the :ref:`nonconverting_arguments` documentation).
0235
0236 The following example shows an example of arguments that don't allow data
0237 copying to take place:
0238
0239 .. code-block:: cpp
0240
0241     // The method and function to be bound:
0242     class MyClass {
0243         // ...
0244         double some_method(const Eigen::Ref<const MatrixXd> &matrix) { /* ... */ }
0245     };
0246     float some_function(const Eigen::Ref<const MatrixXf> &big,
0247                         const Eigen::Ref<const MatrixXf> &small) {
0248         // ...
0249     }
0250
0251     // The associated binding code:
0252     using namespace pybind11::literals; // for "arg"_a
0253     py::class_<MyClass>(m, "MyClass")
0254         // ... other class definitions
0255         .def("some_method", &MyClass::some_method, py::arg().noconvert());
0256
0257     m.def("some_function", &some_function,
0258         "big"_a.noconvert(), // <- Don't allow copying for this arg
0259         "small"_a            // <- This one can be copied if needed
0260     );
0261
0262 With the above binding code, attempting to call the the ``some_method(m)``
0263 method on a ``MyClass`` object, or attempting to call ``some_function(m, m2)``
0264 will raise a ``RuntimeError`` rather than making a temporary copy of the array.
0265 It will, however, allow the ``m2`` argument to be copied into a temporary if
0266 necessary.
0267
0268 Note that explicitly specifying ``.noconvert()`` is not required for *mutable*
0269 Eigen references (e.g. ``Eigen::Ref<MatrixXd>`` without ``const`` on the
0270 ``MatrixXd``): mutable references will never be called with a temporary copy.
0271
0272 Vectors versus column/row matrices
0273 ==================================
0274
0275 Eigen and numpy have fundamentally different notions of a vector.  In Eigen, a
0276 vector is simply a matrix with the number of columns or rows set to 1 at
0277 compile time (for a column vector or row vector, respectively).  NumPy, in
0278 contrast, has comparable 2-dimensional 1xN and Nx1 arrays, but *also* has
0279 1-dimensional arrays of size N.
0280
0281 When passing a 2-dimensional 1xN or Nx1 array to Eigen, the Eigen type must
0282 have matching dimensions: That is, you cannot pass a 2-dimensional Nx1 numpy
0283 array to an Eigen value expecting a row vector, or a 1xN numpy array as a
0284 column vector argument.
0285
0286 On the other hand, pybind11 allows you to pass 1-dimensional arrays of length N
0287 as Eigen parameters.  If the Eigen type can hold a column vector of length N it
0288 will be passed as such a column vector.  If not, but the Eigen type constraints
0289 will accept a row vector, it will be passed as a row vector.  (The column
0290 vector takes precedence when both are supported, for example, when passing a
0291 1D numpy array to a MatrixXd argument).  Note that the type need not be
0292 explicitly a vector: it is permitted to pass a 1D numpy array of size 5 to an
0293 Eigen ``Matrix<double, Dynamic, 5>``: you would end up with a 1x5 Eigen matrix.
0294 Passing the same to an ``Eigen::MatrixXd`` would result in a 5x1 Eigen matrix.
0295
0296 When returning an Eigen vector to numpy, the conversion is ambiguous: a row
0297 vector of length 4 could be returned as either a 1D array of length 4, or as a
0298 2D array of size 1x4.  When encountering such a situation, pybind11 compromises
0299 by considering the returned Eigen type: if it is a compile-time vector--that
0300 is, the type has either the number of rows or columns set to 1 at compile
0301 time--pybind11 converts to a 1D numpy array when returning the value.  For
0302 instances that are a vector only at run-time (e.g. ``MatrixXd``,
0303 ``Matrix<float, Dynamic, 4>``), pybind11 returns the vector as a 2D array to
0304 numpy.  If this isn't want you want, you can use ``array.reshape(...)`` to get
0305 a view of the same data in the desired dimensions.
0306
0307 .. seealso::
0308
0309     The file :file:`tests/test_eigen.cpp` contains a complete example that
0310     shows how to pass Eigen sparse and dense data types in more detail.