Introducing Numba, A High-Performance Python Compiler

Anaconda recently released Numba, an open-source just-in-time (JIT) compiler that translates a subset of Python and NumPy code into fast machine code. The compiler is said to translate each Python function, when it is called, into a machine-code equivalent that runs anywhere from 2x (simple NumPy operations) to 100x (complex Python loops) faster.


One of the most effective ways to use Numba is to apply one of its decorators to your functions, which tells Numba to compile them. When a Numba-decorated function is called, all or part of your code can then run at native machine-code speed.
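As a minimal sketch of the decorator approach, the example below applies Numba's `njit` decorator to a plain Python loop. The function name `leibniz_pi` and the series it sums are illustrative choices, not taken from the announcement, and the `ImportError` fallback is an addition so the sketch still runs (uncompiled) where Numba is not installed.

```python
try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, run the plain Python function.
    def njit(func):
        return func


@njit
def leibniz_pi(n_terms):
    """Approximate pi with the Leibniz series, using a plain Python loop."""
    total = 0.0
    sign = 1.0
    for k in range(n_terms):
        total += sign / (2 * k + 1)
        sign = -sign
    return 4.0 * total


# The first call triggers compilation; later calls reuse the compiled code.
print(leibniz_pi(100_000))
```

Loops like this one, written without NumPy, are the kind of code the article says benefits most from compilation.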

Numba works with the following:

OS: Windows (32 and 64 bit), OSX, Linux (32 and 64 bit). Unofficial support on BSD.
Architecture: x86, x86_64, ppc64le, armv7l, armv8l (aarch64). Unofficial support on M1/Arm64.
GPUs: Nvidia CUDA.
Python: CPython.
NumPy: 1.18 to latest.

Compiling Numba from source is not recommended for first-time users. Because Numba is frequently used as a core component, its dependencies are kept to an absolute minimum. However, the following optional packages can be installed to provide additional functionality:

SciPy enables the compilation of numpy.linalg functions.
Colorama enables colour highlighting in error messages and backtraces.
pyyaml enables configuring Numba through a YAML configuration file.
icc_rt enables the use of the Intel SVML (high-performance short vector math library, x86_64 only). Installation instructions are given in Numba's performance tips.

How does it work?

Numba reads the Python bytecode for a decorated function and combines it with information about the types of the function’s input arguments. After analysing and optimising your code, it uses the LLVM compiler library to generate a machine-code version of your function tailored to your CPU’s capabilities. This compiled version is then used every time your function is called with arguments of those types.
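The role of argument types can be sketched as follows: calling the same decorated function with integers and then with floats leads Numba to compile one specialised version per set of argument types. The function `scale_and_add` is a hypothetical example. The `signatures` attribute is read defensively with `getattr` because the plain-Python fallback (an addition here, for environments without Numba) does not provide it.

```python
try:
    from numba import njit
except ImportError:
    # Assumption for portability: without Numba, run the plain Python function.
    def njit(func):
        return func


@njit
def scale_and_add(a, x, y):
    return a * x + y


# First call with integers: Numba compiles a version for integer arguments.
int_result = scale_and_add(2, 3, 4)          # 10
# Call with floats: new argument types, so a second version is compiled.
float_result = scale_and_add(2.0, 3.0, 4.5)  # 10.5

# Under Numba, `signatures` lists the argument-type combinations compiled so far.
compiled_signatures = getattr(scale_and_add, "signatures", [])
print(int_result, float_result, compiled_signatures)
```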

Provided it can run in nopython mode, or at least compile some loops, Numba targets compilation to your specific CPU. The speedup varies by application but can reach one to two orders of magnitude.

Numba provides a variety of options for parallelizing your code on CPUs and GPUs with minor code changes:

Simplified Threading: Numba can automatically execute NumPy array expressions on multiple CPU cores and makes it easy to write parallel loops.

SIMD Vectorization: Numba can automatically convert some loops into vector instructions for 2-4x speed increases. Whether your CPU supports SSE, AVX, or AVX-512, Numba adapts to its capabilities.

GPU Acceleration: Numba enables you to create parallel GPU algorithms entirely from Python and supports NVIDIA CUDA.
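As a rough illustration of the threading option only, the sketch below marks a loop as parallel with Numba's `prange` and `njit(parallel=True)`. The function `sum_of_squares` is a hypothetical example, and the fallback definitions are an addition so the code still runs serially where Numba is not installed. SIMD vectorization and CUDA are not shown here.

```python
try:
    from numba import njit, prange
except ImportError:
    # Assumption for portability: without Numba, fall back to a serial loop.
    prange = range

    def njit(*args, **kwargs):
        # Support both the bare @njit form and @njit(parallel=True).
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]
        return lambda func: func


@njit(parallel=True)
def sum_of_squares(n):
    total = 0.0
    # prange tells Numba the iterations are independent; the += on `total`
    # is treated as a reduction across threads.
    for i in prange(n):
        total += i * i
    return total


print(sum_of_squares(1000))
```

The only changes from a serial version are the `parallel=True` argument and the use of `prange` in place of `range`, which matches the article's point about minor code changes.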

The post Introducing Numba, A High-Performance Python Compiler appeared first on Analytics India Magazine.
