Example: An Interval Type

In this example, we will extend the Numba frontend to add support for a user-defined class that it does not internally support. This will allow:

  • Passing an instance of the class to a Numba function

  • Accessing attributes of the class in a Numba function

  • Constructing and returning a new instance of the class from a Numba function

(all of the above in nopython mode)

We will mix APIs from the high-level extension API and the low-level extension API, depending on what is available for a given task.

The starting point for our example is the following pure Python class:

class Interval(object):
    """
    A half-open interval on the real number line.
    """
    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __repr__(self):
        return 'Interval(%f, %f)' % (self.lo, self.hi)

    @property
    def width(self):
        return self.hi - self.lo
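
Before extending Numba, it helps to pin down the behaviour that the compiled code should reproduce. The snippet below repeats the class as defined above and exercises it in plain Python; the sample values are arbitrary and chosen only for illustration.

```python
class Interval(object):
    """
    A half-open interval on the real number line.
    """
    def __init__(self, lo, hi):
        self.lo = lo
        self.hi = hi

    def __repr__(self):
        return 'Interval(%f, %f)' % (self.lo, self.hi)

    @property
    def width(self):
        return self.hi - self.lo


iv = Interval(1.0, 3.5)   # arbitrary sample values
print(iv)                 # Interval(1.000000, 3.500000)
print(iv.width)           # 2.5
```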

Extending the typing layer

Creating a new Numba type

As the Interval class is not known to Numba, we must create a new Numba type to represent instances of it. Numba does not deal with Python types directly: it has its own type system that allows a different level of granularity as well as various meta-information not available with regular Python types.

We first create a type class IntervalType and, since we don’t need the type to be parametric, we instantiate a single type instance interval_type:

from numba import types

class IntervalType(types.Type):
    def __init__(self):
        super(IntervalType, self).__init__(name='Interval')

interval_type = IntervalType()

Type inference for Python values

In itself, creating a Numba type doesn’t do anything. We must teach Numba how to infer some Python values as instances of that type. In this example, it is trivial: any instance of the Interval class should be treated as belonging to the type interval_type:

from numba.extending import typeof_impl

@typeof_impl.register(Interval)
def typeof_interval(val, c):
    return interval_type

Function arguments and global values will thus be recognized as belonging to interval_type whenever they are instances of Interval.

Type inference for Python annotations

While typeof is used to infer the Numba type of Python objects, as_numba_type is used to infer the Numba type of Python types. In simple cases, we can just register that the Python type Interval corresponds to the Numba type interval_type:

from numba.extending import as_numba_type

as_numba_type.register(Interval, interval_type)

Note that as_numba_type is only used to infer types from type annotations at compile time. The typeof registry above is used to infer the type of objects at runtime.

Type inference for operations

We want to be able to construct interval objects from Numba functions, so we must teach Numba to recognize the two-argument Interval(lo, hi) constructor. The arguments should be floating-point numbers:

from numba.extending import type_callable

@type_callable(Interval)
def type_interval(context):
    def typer(lo, hi):
        if isinstance(lo, types.Float) and isinstance(hi, types.Float):
            return interval_type
    return typer

The type_callable() decorator specifies that the decorated function should be invoked when running type inference for the given callable object (here the Interval class itself). The decorated function must simply return a typer function that will be called with the argument types. The reason for this seemingly convoluted setup is for the typer function to have exactly the same signature as the typed callable. This allows handling keyword arguments correctly.

The context argument received by the decorated function is useful in more sophisticated cases where computing the callable’s return type requires resolving other types.

Extending the lowering layer

We have finished teaching Numba about our type inference additions. We must now teach Numba how to actually generate code and data for the new operations.

Defining the data model for native intervals

As a general rule, nopython mode does not work on Python objects as they are generated by the CPython interpreter. The representations used by the interpreter are far too inefficient for fast native code. Each type supported in nopython mode therefore has to define a tailored native representation, also called a data model.

A common kind of data model is the immutable struct-like model, which is akin to a C struct. Our interval datatype conveniently falls into that category, and here is a possible data model for it:

from numba.extending import models, register_model

@register_model(IntervalType)
class IntervalModel(models.StructModel):
    def __init__(self, dmm, fe_type):
        members = [('lo', types.float64),
                   ('hi', types.float64),]
        models.StructModel.__init__(self, dmm, fe_type, members)

This instructs Numba that values of type IntervalType (or any instance thereof) are represented as a structure of two fields lo and hi, each of them a double-precision floating-point number (types.float64).

Note

Mutable types need more sophisticated data models to be able to persist their values after modification. They typically cannot be stored and passed on the stack or in registers as immutable types can.

Exposing data model attributes

We want the data model attributes lo and hi to be exposed under the same names for use in Numba functions. Numba provides a convenience function to do exactly that:

from numba.extending import make_attribute_wrapper

make_attribute_wrapper(IntervalType, 'lo', 'lo')
make_attribute_wrapper(IntervalType, 'hi', 'hi')

This will expose the attributes in read-only mode. As mentioned above, writable attributes don’t fit in this model.

Exposing a property

As the width property is computed rather than stored in the structure, we cannot simply expose it like we did for lo and hi. We have to re-implement it explicitly:

from numba.extending import overload_attribute

@overload_attribute(IntervalType, "width")
def get_width(interval):
    def getter(interval):
        return interval.hi - interval.lo
    return getter

You might ask why we didn’t need to expose a type inference hook for this attribute. The answer is that @overload_attribute is part of the high-level API: it combines type inference and code generation in a single API.

Implementing the constructor

Now we want to implement the two-argument Interval constructor:

from numba.extending import lower_builtin
from numba.core import cgutils

@lower_builtin(Interval, types.Float, types.Float)
def impl_interval(context, builder, sig, args):
    typ = sig.return_type
    lo, hi = args
    interval = cgutils.create_struct_proxy(typ)(context, builder)
    interval.lo = lo
    interval.hi = hi
    return interval._getvalue()

There is a bit more going on here. @lower_builtin decorates the implementation of the given callable or operation (here the Interval constructor) for some specific argument types. This allows defining type-specific implementations of a given operation, which is important for heavily overloaded functions such as len().

types.Float is the class of all floating-point types (types.float64 is an instance of types.Float). It is generally more future-proof to match argument types on their class rather than on specific instances. When returning a type, however, which happens chiefly during the type inference phase, you must usually return a type instance.

cgutils.create_struct_proxy() and interval._getvalue() are a bit of boilerplate due to how Numba passes values around. Values are passed as instances of llvmlite.ir.Value, which can be too limited: LLVM structure values especially are quite low-level. A struct proxy is a temporary wrapper around an LLVM structure value that makes it easy to get or set members of the structure. The _getvalue() call simply gets the LLVM value out of the wrapper.

Boxing and unboxing

If you try to use an Interval instance at this point, you’ll certainly get the error “cannot convert Interval to native value”. This is because Numba doesn’t yet know how to make a native interval value from a Python Interval instance. Let’s teach it how to do it:

from numba.extending import unbox, NativeValue
from contextlib import ExitStack

@unbox(IntervalType)
def unbox_interval(typ, obj, c):
    """
    Convert an Interval object to a native interval structure.
    """
    is_error_ptr = cgutils.alloca_once_value(c.builder, cgutils.false_bit)
    interval = cgutils.create_struct_proxy(typ)(c.context, c.builder)

    with ExitStack() as stack:
        lo_obj = c.pyapi.object_getattr_string(obj, "lo")
        with cgutils.early_exit_if_null(c.builder, stack, lo_obj):
            c.builder.store(cgutils.true_bit, is_error_ptr)
        lo_native = c.unbox(types.float64, lo_obj)
        c.pyapi.decref(lo_obj)
        with cgutils.early_exit_if(c.builder, stack, lo_native.is_error):
            c.builder.store(cgutils.true_bit, is_error_ptr)

        hi_obj = c.pyapi.object_getattr_string(obj, "hi")
        with cgutils.early_exit_if_null(c.builder, stack, hi_obj):
            c.builder.store(cgutils.true_bit, is_error_ptr)
        hi_native = c.unbox(types.float64, hi_obj)
        c.pyapi.decref(hi_obj)
        with cgutils.early_exit_if(c.builder, stack, hi_native.is_error):
            c.builder.store(cgutils.true_bit, is_error_ptr)

        interval.lo = lo_native.value
        interval.hi = hi_native.value

    return NativeValue(interval._getvalue(), is_error=c.builder.load(is_error_ptr))

Unboxing is another name for “converting a Python object to a native value” (it fits the idea of a Python object as a sophisticated box containing a simple native value). The function returns a NativeValue object which gives its caller access to the computed native value, the error bit and possibly other information.

The snippet above makes abundant use of the c.pyapi object, which gives access to a subset of the Python interpreter’s C API. Note the use of early_exit_if_null to detect and handle any errors that may have happened when unboxing the object (try passing Interval('a', 'b') for example).

We also want to do the reverse operation, called boxing, so as to return interval values from Numba functions:

from numba.extending import box

@box(IntervalType)
def box_interval(typ, val, c):
    """
    Convert a native interval structure to an Interval object.
    """
    ret_ptr = cgutils.alloca_once(c.builder, c.pyapi.pyobj)
    fail_obj = c.pyapi.get_null_object()

    with ExitStack() as stack:
        interval = cgutils.create_struct_proxy(typ)(c.context, c.builder, value=val)
        lo_obj = c.box(types.float64, interval.lo)
        with cgutils.early_exit_if_null(c.builder, stack, lo_obj):
            c.builder.store(fail_obj, ret_ptr)

        hi_obj = c.box(types.float64, interval.hi)
        with cgutils.early_exit_if_null(c.builder, stack, hi_obj):
            c.pyapi.decref(lo_obj)
            c.builder.store(fail_obj, ret_ptr)

        class_obj = c.pyapi.unserialize(c.pyapi.serialize_object(Interval))
        with cgutils.early_exit_if_null(c.builder, stack, class_obj):
            c.pyapi.decref(lo_obj)
            c.pyapi.decref(hi_obj)
            c.builder.store(fail_obj, ret_ptr)

        # NOTE: The result of this call is not checked as the clean up
        # has to occur regardless of whether it is successful. If it
        # fails `res` is set to NULL and a Python exception is set.
        res = c.pyapi.call_function_objargs(class_obj, (lo_obj, hi_obj))
        c.pyapi.decref(lo_obj)
        c.pyapi.decref(hi_obj)
        c.pyapi.decref(class_obj)
        c.builder.store(res, ret_ptr)

    return c.builder.load(ret_ptr)

Using it

nopython mode functions are now able to make use of Interval objects and the various operations you have defined on them. You can try for example the following functions:

from numba import njit

@njit
def inside_interval(interval, x):
    return interval.lo <= x < interval.hi

@njit
def interval_width(interval):
    return interval.width

@njit
def sum_intervals(i, j):
    return Interval(i.lo + j.lo, i.hi + j.hi)

Conclusion

We have shown how to do the following tasks:

  • Define a new Numba type class by subclassing the Type class

  • Define a singleton Numba type instance for a non-parametric type

  • Teach Numba how to infer the Numba type of Python values of a certain class, using typeof_impl.register

  • Teach Numba how to infer the Numba type of the Python type itself, using as_numba_type.register

  • Define the data model for a Numba type using StructModel and register_model

  • Implement a boxing function for a Numba type using the @box decorator

  • Implement an unboxing function for a Numba type using the @unbox decorator and the NativeValue class

  • Type and implement a callable using the @type_callable and @lower_builtin decorators

  • Expose a read-only structure attribute using the make_attribute_wrapper convenience function

  • Implement a read-only property using the @overload_attribute decorator