Version 0.59.0 (31 January 2024)
This is a major Numba release. Numba now supports Python 3.12; please find a summary of all noteworthy items below.
Highlights
Python 3.12 Support
The standout feature of this release is the official support for Python 3.12 in Numba.
Please note that profiling support is temporarily disabled in this release (for Python 3.12) and several known issues have been identified during development. The Numba team is actively working on resolving them. Please refer to the respective issue pages (Numba #9289 and Numba #9291) for a list of ongoing issues and updates on progress.
(PR-#9246)
New Features
Improvements
Add TargetLibraryInfo pass to CPU LLVM pipeline.
The TargetLibraryInfo pass makes sure that the optimisations that take place during call simplification are appropriate for the target. Without this, the target is assumed to be Linux and code will be optimised to produce e.g. math symbols that do not exist on Windows. Historically this issue has been avoided through the use of Numba internal libraries carrying wrapped symbols, but doing so is potentially detrimental to performance. As a result of this change, Numba internal libraries are smaller and there is an increase in optimisation opportunity in code using the exp2 and log2 functions.
(PR-#9336)
Numba deprecation warning classes are now subclasses of builtin ones
To help users manage and suppress deprecation warnings from Numba, the NumbaDeprecationWarning and NumbaPendingDeprecationWarning classes are now subclasses of the builtin DeprecationWarning and PendingDeprecationWarning respectively. Therefore, warning filters on DeprecationWarning and PendingDeprecationWarning will apply to Numba deprecation warnings.
(PR-#9347)
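The effect of the new class hierarchy can be sketched with the standard warnings module alone. To keep the sketch runnable without Numba installed, it defines a stand-in class with the same name and base class as the real one in numba.core.errors; the emit_deprecation helper is hypothetical and exists only to trigger a warning.

```python
import warnings


# Stand-in for numba.core.errors.NumbaDeprecationWarning, defined here only so
# the sketch runs without Numba. As of 0.59.0 the real class derives from the
# builtin DeprecationWarning in the same way.
class NumbaDeprecationWarning(DeprecationWarning):
    pass


def emit_deprecation():
    """Hypothetical helper that issues a Numba-style deprecation warning."""
    warnings.warn("this feature is deprecated", NumbaDeprecationWarning)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # record every warning by default
    # A filter on the builtin category also matches the subclass.
    warnings.simplefilter("ignore", category=DeprecationWarning)
    emit_deprecation()

# Nothing was recorded because the builtin-category filter applied.
suppressed = len(caught) == 0
```

With real Numba code the same filter on DeprecationWarning should apply; filtering on the Numba class itself remains possible when only Numba's warnings are to be silenced.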
NumPy Support
Added support for np.polynomial.polynomial.Polynomial class.
Support is added for the Polynomial class from the package np.polynomial.polynomial.
(PR-#9140)
Added support for functions np.polynomial.polyutils.as_series(), as well as functions polydiv(), polyint(), polyval() from np.polynomial.polynomial.
Support is added for np.polynomial.polyutils.as_series(), np.polynomial.polynomial.polydiv(), np.polynomial.polynomial.polyint() (only the first 2 arguments), np.polynomial.polynomial.polyval() (only the first 2 arguments).
(PR-#9141)
CUDA API Changes
Added support for compiling device functions with a C ABI
Device functions can now be compiled with a C ABI through the compile_ptx() API, for easier interoperability with CUDA C/C++ and other languages.
(PR-#9223)
Make grid() and gridsize() use 64-bit integers
cuda.grid() and cuda.gridsize() now use 64-bit integers, so they no longer overflow when the grid contains more than 2 ** 31 threads.
(PR-#9235)
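The overflow described above can be illustrated with plain integer arithmetic. The sketch below is not Numba code: it assumes the usual one-dimensional index formula, block index times block size plus thread index, and uses ctypes only to emulate 32-bit and 64-bit wraparound. The launch configuration and the linear_index helper are hypothetical.

```python
import ctypes


def linear_index(block_idx, block_dim, thread_idx, int_type):
    """Compute block_idx * block_dim + thread_idx, wrapped to int_type's width."""
    return int_type(block_idx * block_dim + thread_idx).value


# Hypothetical launch: 2**22 blocks of 1024 threads, i.e. 2**32 threads in total.
blocks, threads_per_block = 2**22, 1024
last_block, last_thread = blocks - 1, threads_per_block - 1

# Index of the very last thread, held in a 32-bit and a 64-bit signed integer.
idx32 = linear_index(last_block, threads_per_block, last_thread, ctypes.c_int32)
idx64 = linear_index(last_block, threads_per_block, last_thread, ctypes.c_int64)

print(idx32)  # -1: the 32-bit value has wrapped around
print(idx64)  # 4294967295: the 64-bit value is correct
```

A kernel that uses the wrapped value as an array index would read or write the wrong element, which is the failure mode the 64-bit change avoids.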
Bug Fixes
Dynamically Allocate Parfor Schedules
This PR fixes an issue where a parallel region is executed in a loop many times. The previous code used an alloca to allocate the parfor schedule on the stack, but if there are many such parfors in a loop then the stack will overflow. The new code does a pair of allocation/deallocation calls into the Numba parallel runtime before and after the parallel region respectively. At the moment, these calls redirect to malloc/free, although other mechanisms such as pooling are possible and may be implemented later.
This PR also adds a warning in cases where a prange loop is not converted to a parfor. This can happen if there is exceptional control flow in the loop. These are related in that the original issue had a prange loop that wasn’t converted to a parfor and therefore all the parfors inside the body of the prange were running in parallel and adding to the stack each time.
(PR-#9048)
Support multiple outputs in a @guvectorize function
This PR fixes Numba #9058; it is now possible to call a guvectorize function with multiple outputs.
(PR-#9049)
Handling of None args fixed in PythonAPI.call.
Fixed a segfault that occurred when args=None was passed to PythonAPI.call.
(PR-#9089)
Fix propagation of literal values in PHI nodes.
Fixed a bug in the literal propagation pass where a PHI node could be wrongly replaced by a constant.
(PR-#9144)
numpy.digitize implementation behaviour aligned with numpy
The implementation of numpy.digitize is updated to behave per numpy in a wider set of cases, including where the supplied bins are not in fact monotonic.
(PR-#9169)
numpy.searchsorted and numpy.sort behaviour updates
numpy.searchsorted implementation updated to produce identical outputs to numpy for a wider set of use cases, including where the provided array a is in fact not properly sorted.
numpy.searchsorted implementation bugfix for the case where side='right' and the provided array a contains NaN(s).
numpy.searchsorted implementation extended to support complex inputs.
numpy.sort (and array.sort) implementation extended to support sorting of complex data.
(PR-#9189)
Fix SSA to consider variables where use is not dominated by the definition
An SSA problem is fixed such that a conditionally defined variable will receive a phi node showing that there is a path where the variable is undefined. This affects extension code that relies on SSA behavior.
(PR-#9242)
Fixed RecursionError in prange
A problem with certain loop patterns using prange leading to RecursionError in the compiler is fixed. An example of such a loop is shown below. The problem would cause the compiler to fall into an infinite recursive cycle trying to determine the definition of var1 and var2. The pattern involves definitions of variables within an if-else tree where not all branches define the variables.
for i in prange(N):
    for j in inner:
        if cond1:
            var1 = ...
        elif cond2:
            var1, var2 = ...
        elif cond3:
            pass
        if cond4:
            use(var1)
            use(var2)
(PR-#9244)
Support negative axis in ufunc.reduce
Fixed a bug in ufunc.reduce to correctly handle negative axis values.
(PR-#9296)
Fix issue with parfor reductions and Python 3.12.
The parfor reduction code has certain expectations on the order of statements that it discovers; these are based on the code that previous versions of Numba generated. With Python 3.12, one assignment that used to follow the reduction operator statement, such as a binop, is now moved to its own basic block. This change reorders the set of discovered reduction nodes so that this assignment is right after the reduction operator, as it was in previous Numba versions. This only affects internal parfor reduction code and doesn’t actually change the Numba IR.
(PR-#9334)
Changes
Make test listing not invoke CPU compilation.
Numba’s test listing command python -m numba.runtests -l has historically triggered CPU target compilation due to the way in which certain test functions were declared within the test suite. The CPU target compiler is no longer invoked on test listing, and a test has been added to ensure that this remains the case.
(PR-#9309)
Semantic differences due to Python 3.12 variable shadowing in comprehensions
Python 3.12 introduced a new bytecode LOAD_FAST_AND_CLEAR that is only used in comprehensions. It has dynamic semantics that Numba cannot model.
For example,
def foo():
    if False:
        x = 1
    [x for x in (1,)]
    return x  # This return uses undefined variable
The variable x is undefined at the return statement. Instead of raising an UnboundLocalError, Numba will raise a TypingError at compile time if an undefined variable is used.
However, Numba cannot always detect undefined variables.
For example,
def foo(a):
    [x for x in (0,)]
    if a:
        x = 3 + a
    x += 10
    return x
Calling foo(0) returns 10 instead of raising UnboundLocalError. This is because Numba does not track variable liveness at runtime. The return value is 0 + 10 since Numba zero-initializes undefined variables.
(PR-#9315)
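For comparison, the interpreter's own behaviour for the second example can be checked without Numba. The sketch below runs the same function as plain Python; the outcome and result_when_defined names are illustrative only. A jit-compiled version would, per the note above, return 10 for foo(0) instead of raising.

```python
def foo(a):
    [x for x in (0,)]  # the comprehension's x does not leak into foo's scope
    if a:
        x = 3 + a
    x += 10  # x is unbound at this point when a is falsy
    return x


# Plain Python raises because x was never assigned on this path.
try:
    foo(0)
    outcome = "returned"
except UnboundLocalError:
    outcome = "UnboundLocalError"

# When the branch is taken, x is defined and both semantics agree.
result_when_defined = foo(1)  # 3 + 1, then + 10

print(outcome)              # UnboundLocalError
print(result_when_defined)  # 14
```

Code that relies on catching UnboundLocalError in such patterns therefore behaves differently once compiled, so initialising the variable explicitly before the conditional is the safer choice.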
Refactor and remove legacy APIs/testing internals.
A number of internally used functions have been removed to aid with general maintenance by reducing the number of ways in which it is possible to invoke compilation, specifically:
numba.core.compiler.compile_isolated is removed.
numba.tests.support.TestCase::run_nullary_func is removed.
numba.tests.support.CompilationCache is removed.
Additionally, the concept of “nested context” is removed from numba.core.registry.CPUTarget along with the implementation details. Maintainers of target extensions (those using the API in numba.core.target_extension to extend Numba support to custom/synthetic hardware) should note that the same can be deleted from target extension implementations of numba.core.descriptor.TargetDescriptor if it is present, i.e. the nested_context method and associated implementation details can just be removed from the custom target’s TargetDescriptor.
Further, a bug in the typing of record arrays was discovered during the refactoring. It transpired that two record types that differed only in their mutability could alias; this has now been fixed.
(PR-#9330)
Deprecations
Explicitly setting NUMBA_CAPTURED_ERRORS=old_style will raise deprecation warnings
As per the deprecation schedule of old-style error-capturing, explicitly setting NUMBA_CAPTURED_ERRORS=old_style will raise deprecation warnings. This release is the last to use “old_style” as the default. Details are documented at
https://numba.readthedocs.io/en/0.58.1/reference/deprecation.html#deprecation-of-old-style-numba-captured-errors
(PR-#9346)
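One way to opt in to the replacement behaviour ahead of the default change is to set the variable from Python before Numba is loaded. This is a minimal sketch: the value new_style comes from the linked deprecation notes rather than from this section, and the sketch assumes Numba reads the variable when it is first imported.

```python
import os

# Assumption: Numba reads this setting at import time, so it has to be in the
# environment before the first "import numba" anywhere in the process.
os.environ["NUMBA_CAPTURED_ERRORS"] = "new_style"

# import numba  # would follow here in a real program
```

Setting the variable in the shell or the process launcher before starting Python achieves the same thing and avoids any dependence on import order.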
Expired Deprecations
Object mode fall-back support has been removed.
As per the deprecation schedule for Numba 0.59.0, support for “object mode fall-back” is removed from all Numba jit-family decorators. Further, the default for the nopython keyword argument has been changed to True; this means that all Numba jit-family decorated functions will now compile in nopython mode by default.
(PR-#9352)
Pull-Requests:
PR #8990: Removed extra block copying in InlineWorker (kc611)
PR #9058: Fix gufunc with multiple outputs (guilhermeleobas)
PR #9089: Fix segfault on passing None for args in PythonAPI.call (hellozee)
PR #9101: Add misc script to find missing towncrier news files (sklam)
PR #9123: Implement most ufunc attributes and ufunc.reduce (guilhermeleobas)
PR #9126: Add support for np.indices() (KrisMinchev)
PR #9140: Add support for Polynomial class (KrisMinchev)
PR #9141: Add support for as_series() from np.polynomial.polyutils and polydiv(), polyint(), polyval() from np.polynomial.polynomial (KrisMinchev)
PR #9142: Removed out of date comment handled by PR#8338 (njriasan)
PR #9144: Fix error when literal is wrongly propagated in a PHI node (guilhermeleobas)
PR #9148: bump llvmdev dependency to 0.42.0dev for next development cycle (esc)
PR #9154: Add support for np.unwrap() (KrisMinchev)
PR #9168: fix the get_template_info method in overload_method template (dlee992 sklam)
PR #9169: Update np.digitize handling of np.nan bin edge(s) (rjenc29)
PR #9170: Fix an inappropriate test expression to remove a logical short circuit (munahaf)
PR #9171: Fix the implementation of a special method (munahaf)
PR #9191: Add a Numba power-on-self-test script and use in CI. (stuartarchibald)
PR #9205: release notes and version support updates from release0.58 branch (esc)
PR #9223: CUDA: Add support for compiling device functions with C ABI (gmarkall)
PR #9235: CUDA: Make grid() and gridsize() use 64-bit integers (gmarkall)
PR #9246: Support for Python 3.12 (stuartarchibald kc611 esc)
PR #9249: add support for checking dtypes equal (saulshanabrook)
PR #9255: Fix SSA to consider variables whose use is not dominated by the definition (sklam)
PR #9267: CUDA: Fix dropping of kernels by nvjitlink, by implementing the used list (gmarkall)
PR #9279: CUDA: Add support for CUDA 12.0 Windows conda packages (gmarkall)
PR #9292: CUDA: Switch cooperative groups to use overloads (gmarkall)
PR #9296: Fix bug when axis is negative and check when axis is invalid (guilhermeleobas)
PR #9302: add missing backtick to example git tag command (esc)
PR #9309: Continue #9044, prevent compilation on the CPU target when listing tests. (stuartarchibald apmasell)
PR #9310: Remove Python 3.8 support. (stuartarchibald)
PR #9330: Refactor and remove legacy APIs/testing internals. (stuartarchibald)
PR #9331: Fix Syntax and Deprecation Warnings from 3.12. (stuartarchibald)
PR #9334: Fix parfor reduction issue with Python 3.12. (DrTodd13)
PR #9335: Add validation capability for user generated towncrier .rst files. (kc611)
PR #9336: Add TargetLibraryInfo pass to CPU LLVM pipeline. (stuartarchibald)
PR #9337: Revert #8583 which skip tests due to M1 RuntimeDyLd Assertion error (sklam)
PR #9341: Add configuration variable to force llvmlite memory manager on / off (gmarkall)
PR #9346: Setting NUMBA_CAPTURED_ERRORS=old_style will now raise warnings. (sklam)
PR #9347: Make Numba’s deprecation warnings subclasses of the builtin ones. (sklam)
PR #9351: Made Python 3.12 support rst note more verbose (kc611)
PR #9352: Removing object mode fallback from @jit. (stuartarchibald)
PR #9353: Remove numba.generated_jit (stuartarchibald)
PR #9356: Refactor print tests to avoid NRT leak issue. (stuartarchibald)
PR #9357: Fix a typo in _set_init_process_lock warning. (stuartarchibald)
PR #9358: Remove note about OpenMP restriction in wheels. (stuartarchibald)
PR #9359: Fix test_jit_module test against objmode fallback. (stuartarchibald)
PR #9360: AzureCI changes. RVSDG test config should still test its assigned test slice (sklam)
PR #9402: Doc updates for 0.59 final (sklam stuartarchibald)
PR #9403: Fix test isolation for stateful configurations in the testsuite (sklam stuartarchibald)
PR #9404: Fix skipped test stderr change for Python 3.12.1. (stuartarchibald)