CUDA-Specific Types


This page is about types specific to CUDA targets. Many other types are also available in the CUDA target - see Built-in types.

Vector Types

CUDA Vector Types are usable in kernels. There are two important distinctions from vector types in CUDA C/C++:

First, the recommended names for vector types in Numba CUDA is formatted as <base_type>x<N>, where base_type is the base type of the vector, and N is the number of elements in the vector. Examples include int64x3, uint16x4, float32x4, etc. For new Numba CUDA kernels, this is the recommended way to instantiate vector types.

For convenience, users adapting existing kernels from CUDA C/C++ to Python may use aliases consistent with the C/C++ namings. For example, float3 aliases float32x3, long3 aliases int32x3 or int64x3 (depending on the platform), etc.

Second, unlike CUDA C/C++ where factory functions are used, vector types are constructed directly with their constructor. For example, to construct a float32x3:

from numba.cuda import float32x3

# In kernel
f3 = float32x3(0.0, -1.0, 1.0)

Additionally, vector types can be constructed from a combination of vector and primitive types, as long as the total number of components matches the result vector type. For example, all of the following constructions are valid:

zero = uint32(0)
u2 = uint32x2(1, 2)
# Construct a 3-component vector with primitive type and a 2-component vector
u3 = uint32x3(zero, u2)
# Construct a 4-component vector with 2 2-component vectors
u4 = uint32x4(u2, u2)

The 1st, 2nd, 3rd and 4th component of the vector type can be accessed through fields x, y, z, and w respectively. The components are immutable after construction in the present version of Numba; it is expected that support for mutating vector components will be added in a future release.

v1 = float32x2(1.0, 1.0)
v2 = float32x2(1.0, -1.0)
dotprod = v1.x * v2.x + v1.y * v2.y