API

The ElcoreNN API consists of the following parts:

  • elcorenn/elcorenn.h: ElcoreNN CPU library API.

ElcoreNN CPU library API

elcorenn/elcorenn.h defines the ElcoreNN CPU library API functions.

typedef unsigned int ENNModelId

Model ID.

ENNModelId is a model descriptor. The descriptor refers to a model loaded into DSP (Elcore50) memory. Use the LoadModel function to load a model into DSP memory.

enum class ENNDataType

Describes a data type; values correspond to the ElcoreNN data types.

Values:

enumerator FLOAT32
enumerator FLOAT16
enumerator INT32
enumerator UINT32
enumerator UINT8
enumerator INT8

ENNDataType is a data type enumerator. It describes the types of input data passed to the InvokeModel function, and it also sets the type of calculations performed during inference.
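
For example, a minimal sketch of a per-input type array for the typed InvokeModel overload described below; the two-input model (one uint8 input, one float32 input) is hypothetical:

   #include <elcorenn/elcorenn.h>

   // Hypothetical two-input model: raw image bytes plus float features.
   ENNDataType input_types[2] = {ENNDataType::UINT8,
                                 ENNDataType::FLOAT32};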

enum class ENNHeapSize

Heap size allocated by each DSP.

Values:

enumerator Size_64MB
enumerator Size_128MB
enumerator Size_256MB
enumerator Size_512MB
enumerator Size_1GB
enumerator Size_2GB
enumerator Size_3GB

void InitBackend(ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. Uses all DSP cores.

Parameters:

dsp_heap[in] Heap size allocated by each DSP

void InitBackend(uint32_t devices_count, uint32_t *devices, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources.

Parameters:
  • devices_count[in] The number of selected devices

  • devices[in] The indices of the DSP cores to use for prediction, from 0 up to (but not including) the number of DSP cores

  • dsp_heap[in] Heap size allocated by each DSP
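
A minimal sketch of both overloads; the core indices below are illustrative, and valid values depend on how many DSP cores the target SoC provides:

   #include <cstdint>
   #include <elcorenn/elcorenn.h>

   // Option A: use all DSP cores with the default 512 MB heap per core.
   void InitAllCores() { InitBackend(); }

   // Option B: use only DSP cores 0 and 1, with a 1 GB heap on each.
   void InitTwoCores() {
     uint32_t devices[] = {0, 1};
     InitBackend(2, devices, ENNHeapSize::Size_1GB);
   }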

void ReleaseDevice()

Releases internal DSP resources.

ENNModelId LoadModel(const char *model_json, const char *model_weights, ENNDataType optimization = ENNDataType::FLOAT16)

Loads a model from files.

Parameters:
  • model_json[in] Path to the model description saved in JSON format

  • model_weights[in] Path to the binary file containing the model’s weights

  • optimization[in] Data type for model optimization (default float16)

Returns:

Model ID

The function takes the paths to the model description and model weights files. The optional parameter optimization controls model data type conversion. Two data types are available: float32 (the model is loaded without changes) and float16 (the model is optimized for DSP inference); the default is float16.
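
A minimal sketch of loading a model; the file names are hypothetical placeholders:

   #include <elcorenn/elcorenn.h>

   ENNModelId LoadExampleModel() {
     // Default optimization (FLOAT16): weights converted for DSP inference.
     ENNModelId id = LoadModel("model.json", "model.bin");

     // Alternatively, keep the model in float32 (loaded without changes):
     // ENNModelId id = LoadModel("model.json", "model.bin",
     //                           ENNDataType::FLOAT32);
     return id;
   }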

int GetInputsNumber(ENNModelId model_id)

Gets the number of model inputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model inputs

int GetOutputsNumber(ENNModelId model_id)

Gets the number of model outputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model outputs

const char *GetInputName(ENNModelId model_id, uint32_t layer_idx)

Gets an input name from a model by input layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index in the model

Returns:

Input layer name specified by index

void GetInputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets an input shape from a model by input layer index. The first element of the array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index for the model

  • shape[out] Array that receives the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of input tensor dimensions, and the following elements are the sizes of each tensor axis)
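
A minimal sketch that enumerates a model’s inputs and prints each name and shape; MAX_NDIM_VALUE is assumed to be defined in elcorenn/elcorenn.h, and the output-side functions (GetOutputsNumber, GetOutputName, GetOutputShape) are used the same way:

   #include <cstdint>
   #include <cstdio>
   #include <elcorenn/elcorenn.h>

   void PrintInputs(ENNModelId model_id) {
     int inputs = GetInputsNumber(model_id);
     for (int i = 0; i < inputs; ++i) {
       uint32_t shape[MAX_NDIM_VALUE + 1];
       GetInputShape(model_id, i, shape);
       printf("input %d: %s\n", i, GetInputName(model_id, i));
       // shape[0] is the number of dimensions; the axis sizes follow it.
       for (uint32_t d = 1; d <= shape[0]; ++d)
         printf("  dim %u: %u\n", d - 1, shape[d]);
     }
   }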

const char *GetOutputName(ENNModelId model_id, uint32_t layer_idx)

Gets an output name from a model by output layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

Returns:

Output layer name specified by index

void GetOutputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets an output shape from a model by output layer index. The first element of the array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

  • shape[out] Array that receives the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of output tensor dimensions, and the following elements are the sizes of each tensor axis)

void InvokeModel(ENNModelId model_id, void **input_data, ENNDataType *input_data_type, float **output_data, uint32_t batch_size)

Runs model inference (input user pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one for each input data array

  • input_data_type[in] Array of data types, one for each input data array

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size

The function takes model inputs as arrays of the types defined in the input_data_type array and puts the network’s results into float arrays.
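
A minimal sketch of the typed overload, assuming a hypothetical model with one uint8 input of 224x224x3 values and one float output of 1000 values per batch element:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   void RunTyped(ENNModelId model_id) {
     const uint32_t batch = 1;
     std::vector<uint8_t> image(batch * 224 * 224 * 3);  // fill with real data
     std::vector<float> scores(batch * 1000);

     void *inputs[] = {image.data()};
     ENNDataType types[] = {ENNDataType::UINT8};
     float *outputs[] = {scores.data()};

     InvokeModel(model_id, inputs, types, outputs, batch);
   }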

void InvokeModel(ENNModelId model_id, float **input_data, float **output_data, uint32_t batch_size)

Runs model inference (input pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one for each input data array

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size

The function takes model inputs as float arrays and puts the network’s results into float arrays.
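
A minimal end-to-end sketch using this overload; the model files and tensor sizes are hypothetical placeholders:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   int main() {
     InitBackend();
     ENNModelId model_id = LoadModel("model.json", "model.bin");

     const uint32_t batch = 2;  // distributed across DSP cores (see Note below)
     std::vector<float> in(batch * 3 * 224 * 224);   // hypothetical input size
     std::vector<float> out(batch * 1000);           // hypothetical output size

     float *inputs[] = {in.data()};
     float *outputs[] = {out.data()};
     InvokeModel(model_id, inputs, outputs, batch);

     ReleaseDevice();
     return 0;
   }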

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, float **output_data, uint32_t batch_size)

Runs model inference (input dmabuf, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size dimension in input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network’s results into float arrays. (Limitation: only one model input; input data must be float.)
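
A minimal sketch of this overload; allocating and filling the dmabuf is platform-specific and not shown, so input_fd is assumed to already reference a buffer holding float input data, and the output size is a hypothetical placeholder:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   void RunFromDmabuf(ENNModelId model_id, int input_fd, uint32_t batch) {
     int input_fds[] = {input_fd};           // limitation: one model input
     std::vector<float> out(batch * 1000);   // hypothetical output size
     float *outputs[] = {out.data()};
     InvokeModel(model_id, input_fds, outputs, batch);
   }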

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, int *output_dmabuf_fd_array, uint32_t batch_size)

Runs model inference (input dmabuf, output dmabuf).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_dmabuf_fd_array[out] Output data as an array of dmabuf file descriptors (int)

  • batch_size[in] Batch size dimension in input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network’s results into dmabufs. (Limitation: only one model input and only one model output; input data must be float.)

Note

ElcoreNN uses the data parallelization paradigm: input data is distributed between DSP cores along the batch dimension.

void SaveModelStatisticToCSV(ENNModelId model_id, const char *file_path)

Saves model timing statistics to a CSV file.

Parameters:
  • model_id[in] Model ID

  • file_path[in] Path to CSV file

During inference, ElcoreNN collects core cycle and instruction counts for each layer. This can be useful for performance analysis.
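
A minimal sketch; the output path is a placeholder, and the call should follow one or more InvokeModel calls so that statistics have been collected:

   #include <elcorenn/elcorenn.h>

   void DumpStats(ENNModelId model_id) {
     SaveModelStatisticToCSV(model_id, "model_stats.csv");
   }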