2nd September 2019
Daal4py makes your Machine Learning algorithms in Python lightning fast and easy to use. It provides highly configurable Machine Learning kernels, some of which support streaming input data and/or can be easily and efficiently scaled out to clusters of workstations. Internally it uses Intel® DAAL (Intel® Data Analytics Acceleration Library) to deliver the best performance.
Designed For Data Scientists And Framework Designers
daal4py was created to give data scientists the easiest way to use Intel® DAAL’s powerful machine learning building blocks directly and productively. A simplified API gives the user high-level abstractions with minimal boilerplate, so code is quick to write and easy to maintain, for example in Jupyter Notebooks. To scale out, daal4py also supports distributed machine learning. Its streaming mode provides a flexible mechanism for processing large amounts of data and/or non-contiguous input data.
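To make the idea of a streaming mode concrete, here is a minimal sketch in plain Python. It does not call daal4py; it only illustrates the general pattern of feeding data chunk by chunk into a running partial result and asking for the final answer at the end. The class name `PartialMoments` and the methods `update` and `finalize` are illustrative assumptions, not daal4py's API, and Welford's running-variance update is used here as a stand-in for whatever the library does internally.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence, Tuple


@dataclass
class PartialMoments:
    """Hypothetical partial result: count, running mean, and the
    sum of squared deviations from the mean (M2)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, chunk: Sequence[float]) -> None:
        """Fold one chunk into the partial result (Welford's update).

        Only the current chunk has to be in memory at any time.
        """
        for x in chunk:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            self.m2 += delta * (x - self.mean)

    def finalize(self) -> Tuple[float, float]:
        """Return (mean, population variance) of all data seen so far."""
        if self.n == 0:
            raise ValueError("no data was processed")
        return self.mean, self.m2 / self.n


def stream_moments(chunks: Iterable[Sequence[float]]) -> Tuple[float, float]:
    """Process an iterable of chunks without ever concatenating them."""
    acc = PartialMoments()
    for chunk in chunks:
        acc.update(chunk)
    return acc.finalize()
```

Because `stream_moments` accepts any iterable, the chunks could come from a generator that reads a large file piece by piece, which is the situation a streaming mode is meant for.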
For framework designers, daal4py has been designed to sit underneath other frameworks, from both an API and a feature perspective. The machine learning models split training and inference into separate classes, allowing the model to be exported and serialized if desired. This design also gives the flexibility to work directly with the model and associated primitives, allowing one to customize the behavior of the model itself. The daal4py package can be built with a customized set of algorithms, allowing for a smaller dependency footprint when necessary.
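The split between training and inference classes can be sketched as follows. This is a plain-Python illustration of the design pattern, not daal4py code: the names `LinearModel`, `LinearTraining`, `LinearPrediction`, `export_model`, and `import_model` are hypothetical, the model is a simple one-variable least-squares fit, and JSON is used as an assumed serialization format purely for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class LinearModel:
    """The trained model is a plain data object, separate from both
    the training and the prediction classes."""
    slope: float
    intercept: float


class LinearTraining:
    """Training step: consumes data, produces a model."""

    def compute(self, x: Sequence[float], y: Sequence[float]) -> LinearModel:
        n = len(x)
        if n != len(y) or n < 2:
            raise ValueError("x and y need the same length, at least 2")
        mean_x = sum(x) / n
        mean_y = sum(y) / n
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        if sxx == 0:
            raise ValueError("x must contain at least two distinct values")
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        slope = sxy / sxx
        return LinearModel(slope=slope, intercept=mean_y - slope * mean_x)


class LinearPrediction:
    """Inference step: needs only a model, never the training data."""

    def compute(self, x: Sequence[float], model: LinearModel) -> List[float]:
        return [model.slope * xi + model.intercept for xi in x]


def export_model(model: LinearModel) -> str:
    """Serialize the model so another process or framework can load it."""
    return json.dumps(asdict(model))


def import_model(payload: str) -> LinearModel:
    """Rebuild a model from its serialized form."""
    return LinearModel(**json.loads(payload))
```

Since the model is an independent object, a wrapping framework can train in one place, ship the serialized model elsewhere, and run only the prediction class there.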
Daal4py’s Design
The design of daal4py utilizes several different technologies to deliver Intel® DAAL performance in a flexible design to Data Scientists and Framework designers. The package uses Jinja templates to generate Cython-wrapped DAAL C++ headers, with Cython as a bridge between the generated DAAL code and the Python layer. This design allows for quicker development cycles and acts as a reference design to those looking to tailor their build of daal4py. Cython also allows for good Python behavior, both for compatibility with different frameworks and for pickling and serialization.
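The template-driven generation step can be illustrated with a small sketch. It uses `string.Template` from the Python standard library as a stand-in for Jinja so that it runs without extra dependencies, and the emitted Cython-style text is a made-up example: `render_wrapper`, `example_algo`, and `ExampleAlgo` are hypothetical names, and the output is not what daal4py's actual templates produce.

```python
from string import Template
from typing import Sequence

# Hypothetical template for a Cython wrapper class. Jinja templates
# would allow loops and conditionals; string.Template only substitutes
# named placeholders, which is enough to show the idea.
WRAPPER_TEMPLATE = Template('''\
cdef class $class_name:
    """Python-facing wrapper around the C++ class $cpp_name."""
    cdef $cpp_name* _impl

    def compute(self, $args):
        return self._impl.compute($args)
''')


def render_wrapper(class_name: str, cpp_name: str,
                   arg_names: Sequence[str]) -> str:
    """Generate wrapper source text for one algorithm description."""
    names = [class_name, cpp_name, *arg_names]
    if not all(name.isidentifier() for name in names):
        raise ValueError("all names must be valid identifiers")
    return WRAPPER_TEMPLATE.substitute(
        class_name=class_name,
        cpp_name=cpp_name,
        args=", ".join(arg_names),
    )
```

Calling `render_wrapper("example_algo", "ExampleAlgo", ["data", "labels"])` returns the wrapper source as a string. Generating one such wrapper per algorithm from a single template is what keeps the development cycle short when algorithms are added or a build is tailored.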
Moreover, two new tools allow you to easily bring your full data analytics pipeline to unprecedented scales: daal4py and HPAT. Daal4py is a convenient Python API to Intel® DAAL (Intel® Data Analytics Acceleration Library). While its interface is scikit-learn-like, its MPI-based engine under the hood allows scaling machine learning algorithms to bare-metal cluster performance with only minor code changes. HPAT (High Performance Analytics Toolkit) scales analytics code written with Pandas/Python to bare-metal cluster performance. It automatically compiles a subset of Python (Pandas/NumPy/daal4py) to efficient parallel binaries with MPI, also requiring only minimal code changes. With these tools your code can be orders of magnitude faster than alternatives like Apache Spark, without the pain of dealing directly with lower-level languages such as C or with explicit message passing.
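The pattern behind this kind of MPI-based scaling is that each process computes a partial result on its own shard of the data and the partial results are then combined. The following sketch simulates that pattern in a single Python process; it uses neither MPI nor daal4py. The function names are hypothetical, and the merge step uses the standard pairwise formula for combining means and sums of squared deviations, chosen here only as an example of a reducible computation.

```python
from functools import reduce
from typing import Sequence, Tuple

# (count, mean, M2) where M2 is the sum of squared deviations from the mean
Partial = Tuple[int, float, float]

EMPTY: Partial = (0, 0.0, 0.0)


def local_partial(shard: Sequence[float]) -> Partial:
    """Work done independently by each (simulated) rank on its shard."""
    n = len(shard)
    if n == 0:
        return EMPTY
    mean = sum(shard) / n
    m2 = sum((x - mean) ** 2 for x in shard)
    return (n, mean, m2)


def merge(a: Partial, b: Partial) -> Partial:
    """Combine two partial results; this is the reduction operator."""
    n_a, mean_a, m2_a = a
    n_b, mean_b, m2_b = b
    n = n_a + n_b
    if n == 0:
        return EMPTY
    delta = mean_b - mean_a
    mean = mean_a + delta * n_b / n
    m2 = m2_a + m2_b + delta * delta * n_a * n_b / n
    return (n, mean, m2)


def distributed_variance(shards: Sequence[Sequence[float]]) -> float:
    """Population variance over all shards, without gathering the raw data.

    In a real MPI program each shard would live on a different rank and
    the reduce below would be a collective operation; here it is a loop.
    """
    partials = [local_partial(shard) for shard in shards]
    n, _, m2 = reduce(merge, partials, EMPTY)
    if n == 0:
        raise ValueError("no data in any shard")
    return m2 / n
```

Only the three numbers in each partial result cross process boundaries, so communication cost stays constant no matter how large the shards are.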
Every machine learning developer, data scientist, and analyst who uses Python; every numerical and scientific computing developer who wants to accelerate compute-intensive Python packages like NumPy and mpi4py; every HPC developer looking to unlock the power of modern hardware; in fact, anyone using Python in production needs the Intel® Distribution for Python.