Version 0.58.0 (20 September 2023)
This is a major Numba release. Numba now uses towncrier to create the release notes, so please find a summary of all noteworthy items below.
Highlights
Added towncrier
This PR adds towncrier as a GitHub workflow for checking release notes.
From this PR onwards, every PR made in Numba will require an appropriate
release note associated with it. The reviewer may decide to skip adding
release notes in smaller PRs with minimal impact by adding a
skip_release_notes label to the PR.
(PR-#8792)
The minimum supported NumPy version is 1.22.
Following NEP-0029, the minimum supported NumPy version is now 1.22.
(PR-#9093)
Add support for NumPy 1.25
Extend Numba to support new and changed features released in NumPy 1.25.
(PR-#9011)
Remove NVVM 3.4 and CTK 11.0 / 11.1 support
Support for CUDA toolkits < 11.2 is removed.
(PR-#9040)
Removal of Windows 32-bit Support
From this release onwards, Numba no longer supports Windows 32-bit operating systems.
(PR-#9083)
The minimum llvmlite version is now 0.41.0.
The minimum required version of llvmlite is now version 0.41.0.
(PR-#8916)
Added RVSDG-frontend
This PR is preliminary work on adding an RVSDG-frontend for processing bytecode. RVSDG (Regionalized Value-State Dependence Graph) provides a dataflow-centric view instead of a traditional SSA-CFG view, which should allow the compiler to be simplified in the future.
(PR-#9012)
New Features
numba.experimental.jitclass gains support for __*matmul__ methods.
numba.experimental.jitclass now has support for the following methods:
__matmul__
__imatmul__
__rmatmul__
(PR-#8892)
numba.experimental.jitclass gains support for reflected dunder methods.
numba.experimental.jitclass now has support for the following methods:
__radd__
__rand__
__rfloordiv__
__rlshift__
__ror__
__rmod__
__rmul__
__rpow__
__rrshift__
__rsub__
__rtruediv__
__rxor__
(PR-#8906)
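As a rough illustration of what reflected methods allow, the sketch below defines a small jitclass that can appear on the right-hand side of + and -. The class name Offset and its value field are hypothetical and chosen only for this example. The fallback decorator is an assumption added so that the sketch still runs as plain Python when Numba is not installed.

```python
try:
    from numba import float64
    from numba.experimental import jitclass

    # Declare the single float64 field used by the hypothetical class below.
    as_jitclass = jitclass([("value", float64)])
except ImportError:
    # Fallback (assumption for this sketch): leave the class as plain Python.
    def as_jitclass(cls):
        return cls


@as_jitclass
class Offset:
    """Hypothetical example class, not part of Numba."""

    def __init__(self, value):
        self.value = value

    def __radd__(self, other):
        # Called for `other + Offset(...)` when `other` cannot handle the add.
        return other + self.value

    def __rsub__(self, other):
        # Called for `other - Offset(...)`; note the operand order.
        return other - self.value


total = 2.0 + Offset(3.0)        # dispatches to Offset.__radd__
difference = 10.0 - Offset(3.0)  # dispatches to Offset.__rsub__
```

Returning plain floats rather than new Offset instances keeps the sketch small; a real class would likely return an instance of itself.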
Add support for value max to NUMBA_OPT.
The optimisation level that Numba applies when compiling can be set through the
environment variable NUMBA_OPT. This has historically been a value between
0 and 3 (inclusive). Support for the value max has now been added. This is a
Numba-specific optimisation level which indicates that the user would like Numba
to try running as much optimisation as possible, potentially trading a longer
compilation time for better run-time performance. In practice, use of the max
level of optimisation may or may not benefit the run-time or compile-time
performance of user code, but it has been added to present an easy-to-access
option for users to try if they so wish.
(PR-#9094)
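One way to try the new level from Python is to set the variable before Numba is first imported, since Numba reads its environment configuration at import time. The function sum_of_squares below is a hypothetical workload used only for illustration, and the fallback decorator is an assumption so that the sketch also runs without Numba installed.

```python
import os

# Set the optimisation level before Numba is imported.
os.environ["NUMBA_OPT"] = "max"

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run the function uncompiled.
    def njit(func):
        return func


@njit
def sum_of_squares(n):
    # Hypothetical workload: sum of i*i for i in [0, n).
    total = 0
    for i in range(n):
        total += i * i
    return total


result = sum_of_squares(10)
```

Setting NUMBA_OPT=max in the shell before launching Python has the same effect. Whether it helps is workload dependent, so timing both settings is advisable.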
Improvements
NumPy Support
All modes are supported in numpy.correlate and numpy.convolve.
All values for the mode argument to numpy.correlate and numpy.convolve
are now supported.
(PR-#7543)
@vectorize accommodates arguments implementing __array_ufunc__.
Universal functions (ufuncs) created with numba.vectorize will now
respect arguments implementing __array_ufunc__ (NEP-13) to allow pre- and
post-processing of arguments and return values when the ufunc is called from the
interpreter.
(PR-#8995)
Added support for np.geomspace function.
This PR improves on #4074 by adding support for np.geomspace. The current
implementation only supports scalar start and stop parameters.
(PR-#9068)
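A minimal sketch of calling np.geomspace from compiled code with scalar start and stop values, as described above. The function name decades is hypothetical. The fallback decorator is an assumption so that the sketch still runs with NumPy alone when Numba is not installed.

```python
import numpy as np

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run the function uncompiled.
    def njit(func):
        return func


@njit
def decades(start, stop, num):
    # start and stop are scalars, matching the supported case.
    return np.geomspace(start, stop, num)


points = decades(1.0, 1000.0, 4)
```

Array-valued start or stop arguments are outside what the note describes as supported, so they are deliberately not used here.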
Added support for np.vsplit, np.hsplit, np.dsplit.
This PR improves on #4074 by adding support for np.vsplit, np.hsplit, and np.dsplit.
(PR-#9082)
Added support for functions np.polynomial.polyutils.trimseq, as well as functions polyadd, polysub, polymul from np.polynomial.polynomial.
Support is added for np.polynomial.polyutils.trimseq, np.polynomial.polynomial.polyadd, np.polynomial.polynomial.polysub, np.polynomial.polynomial.polymul.
(PR-#9087)
CUDA Changes
Bitwise operation ufunc support for the CUDA target.
Support is added for some ufuncs associated with bitwise operations on the
CUDA target, namely:
numpy.bitwise_and
numpy.bitwise_or
numpy.bitwise_not
numpy.bitwise_xor
numpy.invert
numpy.left_shift
numpy.right_shift
(PR-#8974)
Add support for the latest CUDA driver codes.
Support is added for the latest set of CUDA driver codes.
(PR-#8988)
Add NumPy comparison ufunc in CUDA
This PR adds support for comparison ufuncs for the CUDA target
(e.g. numpy.greater, numpy.greater_equal, numpy.less_equal, etc.).
(PR-#9007)
Report absolute path of libcuda.so on Linux
numba -s now reports the absolute path to libcuda.so on Linux, to aid
troubleshooting driver issues, particularly on WSL2 where a Linux driver can
incorrectly be installed in the environment.
(PR-#9034)
Add debuginfo support to nvdisasm output.
Support is added for debuginfo (source line and inlining information) in
functions that make calls through nvdisasm. For example, the CUDA dispatcher
.inspect_sass method output is now augmented with this information.
(PR-#9035)
Add CUDA SASS CFG Support
This PR adds support for getting the SASS CFG in dot language format.
It adds an inspect_sass_cfg() method to CUDADispatcher and the -cfg flag to the nvdisasm command line tool.
(PR-#9051)
Bug Fixes
Handling of different sized unsigned integer indexes is fixed in numba.typed.List.
An issue with the order of truncation/extension and casting of unsigned integer
indexes in numba.typed.List has been fixed.
(PR-#7262)
Prevent invalid fusion
This PR fixes an issue in which an array first read in a parfor and later written in the same parfor would only be classified as used in the parfor. When a subsequent parfor also used the same array, the two parfors were fused, which should have been forbidden given that the first parfor was also writing to the array. This PR treats such arrays in a parfor as being both used and defined so that fusion will be prevented.
(PR-#7582)
The numpy.allclose implementation now correctly handles default arguments.
The implementation of numpy.allclose is corrected to use TypingError to
report typing errors.
(PR-#8885)
Add type validation to numpy.isclose.
Type validation is added to the implementation of numpy.isclose.
(PR-#8944)
Fix support for overloading dispatcher with non-compatible first-class functions
Fixes an error caused by not handling compilation errors during casting of
Dispatcher objects into first-class functions. With the fix, users can now
overload a dispatcher with non-compatible first-class functions. Refer to
https://github.com/numba/numba/issues/9071 for details.
(PR-#9072)
Support dtype keyword argument in numpy.arange with parallel=True
Fixes the parfors transformation to support the use of the dtype keyword argument in
numpy.arange(..., dtype=dtype).
(PR-#9095)
Fix all @overloads to use parameter names that match public APIs.
Some of the Numba @overloads for functions in NumPy and Python’s built-ins
were written using parameter names that did not match those used in the API they
were overloading. As a result, calling such a function with the parameter
names as keyword arguments at the call site would result in a compilation
error. This has now been fixed throughout the code base, and a unit test makes
a best-effort attempt to prevent reintroduction of similar mistakes in the
future. Fixed functions include:
From Python built-ins:
complex
From the Python random module:
random.seed
random.gauss
random.normalvariate
random.randrange
random.randint
random.uniform
random.shuffle
From the numpy module:
numpy.argmin
numpy.argmax
numpy.array_equal
numpy.average
numpy.count_nonzero
numpy.flip
numpy.fliplr
numpy.flipud
numpy.iinfo
numpy.isscalar
numpy.imag
numpy.real
numpy.reshape
numpy.rot90
numpy.swapaxes
numpy.union1d
numpy.unique
From the numpy.linalg module:
numpy.linalg.norm
numpy.linalg.cond
numpy.linalg.matrix_rank
From the numpy.random module:
numpy.random.beta
numpy.random.chisquare
numpy.random.f
numpy.random.gamma
numpy.random.hypergeometric
numpy.random.lognormal
numpy.random.pareto
numpy.random.randint
numpy.random.random_sample
numpy.random.ranf
numpy.random.rayleigh
numpy.random.sample
numpy.random.shuffle
numpy.random.standard_gamma
numpy.random.triangular
numpy.random.weibull
(PR-#9099)
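The kind of call site affected by this fix can be sketched as follows, using two of the functions listed above with their public parameter names (a and b) passed as keyword arguments. The function names roll_die and scaled_sample are hypothetical, and the fallback decorator is an assumption so that the sketch also runs without Numba installed.

```python
import random

try:
    from numba import njit
except ImportError:
    # Fallback (assumption for this sketch): run the functions uncompiled.
    def njit(func):
        return func


@njit
def roll_die():
    # Keyword names match the public random.randint(a, b) signature.
    return random.randint(a=1, b=6)


@njit
def scaled_sample():
    # Keyword names match the public random.uniform(a, b) signature.
    return random.uniform(a=0.0, b=10.0)


die = roll_die()
sample = scaled_sample()
```

Calls like these previously risked a compilation error when an overload's parameter names differed from the public API.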
Changes
Support for @numba.extending.intrinsic(prefer_literal=True)
In the high level extension API, the prefer_literal option is added to the
numba.extending.intrinsic decorator to prioritize the use of literal types
when available. This has the same behavior as the prefer_literal option in the
numba.extending.overload decorator.
(PR-#6647)
Deprecations
Deprecation of old-style NUMBA_CAPTURED_ERRORS
A deprecation schedule has been added for NUMBA_CAPTURED_ERRORS=old_style.
NUMBA_CAPTURED_ERRORS=new_style will become the default in future releases.
Details are documented at
https://numba.readthedocs.io/en/stable/reference/deprecation.html#deprecation-of-old-style-numba-captured-errors
(PR-#9090)
Pull-Requests
PR #6647: Support prefer_literal option for intrinsic decorator (ashutoshvarma sklam)
PR #7543: Support for all modes in np.correlate and np.convolve (jeertmans)
PR #7582: Use get_parfor_writes to detect illegal array access that prevents fusion. (DrTodd13)
PR #8462: Add PyBytes_AsString and PyBytes_AsStringAndSize (ianna)
PR #8633: DOC: Convert vectorize and guvectorize examples to doctests (Matt711)
PR #8854: Updated mk_alloc to support Numba-Dpex compute follows data. (mingjie-intel)
PR #8861: CUDA: Don’t add device kwarg for jit registry (gmarkall)
PR #8871: Don’t return the function in CallConv.decorate_function() (gmarkall)
PR #8885: Fix np.allclose not handling default args (guilhermeleobas)
PR #8892: Add support for __*matmul__ methods in jitclass (louisamand)
PR #8895: CUDA: Enable caching functions that use CG (gmarkall)
PR #8906: Add support for reflected dunder methods in jitclass (louisamand)
PR #8911: Remove isinstance experimental feature warning (guilhermeleobas)
PR #8937: Remove old Website development documentation (esc gmarkall)
PR #8944: Add exceptions to np.isclose (guilhermeleobas)
PR #8976: Fix index URL for ptxcompiler/cubinlinker packages. (bdice)
PR #8988: support for latest CUDA driver codes #8363 (s1Sharp)
PR #8995: Allow libraries that implement __array_ufunc__ to override DUFunc.__c… (jpivarski)
PR #9021: update the release checklist following 0.57.1rc1 (esc)
PR #9022: fix: update the C++ ABI repo reference (emmanuel-ferdman)
PR #9028: Replace use of imp module removed in 3.12 (hauntsaninja)
PR #9034: CUDA libs test: Report the absolute path of the loaded libcuda.so on Linux, + other improvements (gmarkall)
PR #9035: CUDA: Allow for debuginfo in nvdisasm output (Matt711)
PR #9037: Recognize additional functions as being pure or not having side effects. (DrTodd13)
PR #9039: Correct git clone link in installation instructions. (ellifteria)
PR #9040: Remove NVVM 3.4 and CTK 11.0 / 11.1 support (gmarkall)
PR #9046: copy the change log changes for 0.57.1 to main (esc)
PR #9068: Adding np.geomspace (KrisMinchev)
PR #9069: Fix towncrier error due to importlib_resources upgrade (sklam)
PR #9072: Fix support for overloading dispatcher with non-compatible first-class functions (gmarkall sklam)
PR #9074: Add np.trim_zeros (sungraek guilhermeleobas)
PR #9082: Add np.vsplit, np.hsplit, and np.dsplit (KrisMinchev)
PR #9083: Removed windows 32 references from code and documentation (kc611)
PR #9085: Add tests for np.row_stack (KrisMinchev)
PR #9086: Support NVRTC using ctypes binding (testhound gmarkall)
PR #9087: Add trimseq from np.polynomial.polyutils and polyadd, polysub, polymul from np.polynomial.polynomial (KrisMinchev)
PR #9088: Fix: Issue 9063 - CUDA atomics tests failing with CUDA 12.2 (gmarkall)
PR #9090: Add deprecation notice for old_style error capturing. (esc sklam)
PR #9094: Add support for a ‘max’ level to NUMBA_OPT environment variable. (stuartarchibald)
PR #9095: Support dtype keyword in arange_parallel_impl (DrTodd13 sklam)
PR #9105: NumPy 1.25 support (PR #9011) continued (gmarkall apmasell)
PR #9111: Fixes ReST syntax error in PR#9099 (stuartarchibald gmarkall sklam apmasell)
PR #9112: Fixups for PR#9100 (stuartarchibald sklam)
PR #9113: Add support for np.diagflat (KrisMinchev)
PR #9114: update np min to 122 (stuartarchibald esc)
PR #9118: Add support for np.resize() (KrisMinchev)
PR #9127: Fix accidental cffi test deps, refactor cffi skipping (gmarkall)
PR #9152: Fix old_style error capturing deprecation warnings (sklam)
PR #9173: Towncrier fixups (Continue #9158 and retarget to main branch) (sklam)
PR #9190: Fix issue with incompatible multiprocessing context in test. (stuartarchibald)