### 10 breaking releases

| Version | Released |
|---|---|
| 0.12.0 | Jan 28, 2023 |
| 0.10.0 | Apr 23, 2022 |
| 0.3.0 | Mar 28, 2022 |
| 0.1.0 | Dec 19, 2021 |

#**124** in Machine learning

**90** downloads per month

Used in compiled-nn

**MIT** license

2MB

**38K** SLoC

# CompiledNN: A JIT Compiler for Neural Network Inference

## Features

- compiles Keras HDF5 models into machine code
- generates single-threaded code for x86/64 processors with SSSE3/SSE4
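
Because the generated code targets x86/64 processors with SSSE3/SSE4, a host application may want to check what the CPU reports before compiling a network. The following is a minimal sketch, assuming a GCC- or Clang-compatible compiler; `SimdSupport` and `detectSimd` are hypothetical names and not part of CompiledNN. It only reports the individual flags and leaves the decision about which of them are required to the caller.

```cpp
// Hypothetical helper (not part of CompiledNN): reports which of the SIMD
// extensions named above the host CPU advertises.
struct SimdSupport
{
  bool ssse3 = false;
  bool sse41 = false;
  bool sse42 = false;
};

SimdSupport detectSimd()
{
  SimdSupport support;
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
  // GCC/Clang builtins; __builtin_cpu_init() must run before the queries.
  __builtin_cpu_init();
  support.ssse3 = __builtin_cpu_supports("ssse3") != 0;
  support.sse41 = __builtin_cpu_supports("sse4.1") != 0;
  support.sse42 = __builtin_cpu_supports("sse4.2") != 0;
#endif
  // On other compilers or architectures all flags stay false.
  return support;
}
```

A caller could, for example, refuse to load a model when `detectSimd().ssse3` is false.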

## Dependencies

- HDF5 (C bindings only)

## Compiling

CompiledNN can be compiled into a library via CMake:

    mkdir build
    cd build
    cmake ..
    make
    make install

Another way to integrate CompiledNN is to add it (and its dependency AsmJit) as source files to your project.

## Supported layers

- Core
  - Dense
  - Activation
  - Dropout
  - Flatten
  - Reshape (does not support dimension inference, i.e. specifying -1 as dimension is not allowed)
- Convolutional
  - Conv2D (only with `dilation_rate=1`)
  - SeparableConv2D (only with `dilation_rate=1` and `depth_multiplier=1`)
  - DepthwiseConv2D (only with `dilation_rate=1`, `depth_multiplier=1`, `use_bias=False` and `activation=None`)
  - UpSampling2D (only with `interpolation=nearest`, number of channels must be at most 32/64 and divisible by 4)
  - ZeroPadding2D (number of channels per row must be divisible by 4)
- Pooling
  - MaxPooling2D
  - AveragePooling2D
  - GlobalMaxPooling2D (at most 28/60 channels)
  - GlobalAveragePooling2D (at most 28/60 channels)
- Merge
  - Add
  - Subtract
  - Multiply
  - Average
  - Maximum
  - Minimum
  - Concatenate
- Advanced Activations
  - LeakyReLU
  - ELU
  - ThresholdedReLU
  - Softmax
  - ReLU
- Normalization
  - BatchNormalization
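
Since Reshape does not support dimension inference, a target shape containing -1 has to be replaced by concrete sizes before the model is exported. The sketch below illustrates the arithmetic behind that replacement. `resolveReshapeTarget` is a hypothetical helper, not part of CompiledNN, and it assumes the total number of elements of the incoming tensor is known.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not part of CompiledNN): replaces a single -1 entry in
// a Keras-style target shape with the size implied by elementCount.
// Returns std::nullopt if the shape is ambiguous or does not fit.
std::optional<std::vector<long>> resolveReshapeTarget(std::vector<long> target, long elementCount)
{
  if(elementCount <= 0)
    return std::nullopt;

  long known = 1;        // Product of all explicitly given dimensions.
  bool hasWildcard = false;
  std::size_t wildcard = 0;

  for(std::size_t i = 0; i < target.size(); ++i)
  {
    if(target[i] == -1)
    {
      if(hasWildcard)
        return std::nullopt; // More than one -1 cannot be resolved.
      hasWildcard = true;
      wildcard = i;
    }
    else if(target[i] <= 0)
      return std::nullopt;   // Only positive sizes or a single -1 are valid.
    else
      known *= target[i];
  }

  if(!hasWildcard)
  {
    if(known != elementCount)
      return std::nullopt;   // Explicit shape does not match the element count.
    return target;
  }

  if(elementCount % known != 0)
    return std::nullopt;     // The remaining dimension would not be an integer.
  target[wildcard] = elementCount / known;
  return target;
}
```

For instance, a tensor with 24 elements and a target shape of `{4, -1, 3}` resolves to `{4, 2, 3}`, which can then be written into the model in place of the -1.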

## Example

    #include <CompiledNN/Model.h>
    #include <CompiledNN/CompiledNN.h>

    using namespace NeuralNetwork;

    int main()
    {
      Model model;
      model.load("model.h5");
      // Optionally, indicate which input tensors should be converted from unsigned chars to floats in the beginning.
      // model.setInputUInt8(0);
      CompiledNN nn;
      nn.compile(model);
      // ... fill nn.input(i) with data
      nn.apply();
      // ... obtain the results from nn.output(i)
      return 0;
    }
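
The commented-out `setInputUInt8` call in the example above asks the network to convert an input from unsigned chars to floats itself. Without it, the caller would have to do that conversion before filling the input. The following sketch shows such a manual conversion on plain buffers. `widenToFloat` is a hypothetical helper, not part of CompiledNN. It assumes a one-to-one widening without scaling, since this document does not state whether any scaling is applied, and it does not assume how the storage behind `nn.input(i)` is accessed.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of CompiledNN): widens 8-bit samples to
// floats one-to-one, without any scaling or normalization.
std::vector<float> widenToFloat(const std::vector<std::uint8_t>& samples)
{
  std::vector<float> result;
  result.reserve(samples.size());
  for(std::uint8_t sample : samples)
    result.push_back(static_cast<float>(sample));
  return result;
}
```

The resulting floats would then be copied into the input tensor before calling `nn.apply()`.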
