API

The ElcoreNN API consists of the following parts:

  • elcorenn/elcorenn.h: ElcoreNN CPU library API.

ElcoreNN CPU library API

elcorenn/elcorenn.h defines the ElcoreNN CPU library API functions.

typedef unsigned int ENNBackendId

Backend ID.

ENNBackendId is an ElcoreNN backend descriptor. An ElcoreNN backend is a runtime used to load and invoke models.

typedef unsigned int ENNModelId

Model ID.

ENNModelId is a model descriptor. The descriptor refers to a model loaded into an ElcoreNN backend.

enum class ENNDataType

Data type enumerator; its values correspond to the ElcoreNN data types.

Values:

enumerator FLOAT32
enumerator FLOAT16
enumerator INT32
enumerator UINT32
enumerator UINT8
enumerator INT8

ENNDataType is a data type enumerator. It describes the types of input data passed to the InvokeModel function and also sets the type of calculations used during inference.

enum class ENNHeapSize

Heap size allocated by each DSP.

Values:

enumerator Size_64MB
enumerator Size_128MB
enumerator Size_256MB
enumerator Size_512MB
enumerator Size_1GB
enumerator Size_2GB
enumerator Size_3GB

ENNBackendId InitBackend(ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across all available DSPs: each DSP computes its own portion of the data, divided along the batch dimension.

Parameters:

dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID
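
A minimal usage sketch of the default overload, assuming the library and headers are installed on the target (this cannot run without ElcoreNN hardware and runtime):

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    // Initialize a backend with the default 512 MB per-DSP heap;
    // DSPs are picked dynamically by the scheduler.
    ENNBackendId backend = InitBackend();

    // ... load and invoke models here ...

    ReleaseBackend(backend);  // frees backend resources and its models
    return 0;
}
```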

ENNBackendId InitBackend(uint32_t devices_count, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across devices_count DSPs: each DSP computes its own portion of the data, divided along the batch dimension.

Parameters:
  • devices_count[in] Number of selected devices.

  • dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID

ENNBackendId InitBackend(uint32_t devices_count, uint32_t *devices, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. Statically assigns the IDs of the DSPs that execute models.

Parameters:
  • devices_count[in] Number of selected devices.

  • devices[in] Indices of DSP cores.

  • dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID
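
A hedged sketch of the static-assignment overload. The core indices below are hypothetical; valid DSP IDs depend on the target SoC:

```cpp
#include <elcorenn/elcorenn.h>
#include <cstdint>

int main() {
    // Pin execution to DSP cores 0 and 1 (indices are illustrative).
    uint32_t devices[] = {0, 1};
    ENNBackendId backend =
        InitBackend(2, devices, ENNHeapSize::Size_256MB);

    // ... load and invoke models here ...

    ReleaseBackend(backend);
    return 0;
}
```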

void ReleaseBackend(ENNBackendId backend_id)

Releases internal backend resources and all models loaded into the backend.

Parameters:

backend_id[in] Backend ID

ENNModelId LoadModel(const char *model_json, const char *model_weights, ENNDataType optimization = ENNDataType::FLOAT16, ENNBackendId backend_id = 0)

Loads model from files.

Parameters:
  • model_json[in] The description of the model saved in json format

  • model_weights[in] The binary file of model’s weights

  • optimization[in] Data type for model optimization (default float16)

  • backend_id[in] Backend used to load model

Returns:

Model ID

The function takes the paths of the model description and model weights files. The optional optimization argument selects the data type for model conversion. Two data types are available: float32 (the model is loaded without changes) and float16 (the model is optimized for DSP inference); the default is float16. The backend_id argument refers to the ElcoreNN backend used to load the model.
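
A loading sketch under the same assumptions as above; the file paths are placeholders for your exported model files:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    ENNBackendId backend = InitBackend();

    // Paths are placeholders; point them at your model description
    // (json) and weights (binary) files.
    ENNModelId model = LoadModel("model.json", "model.weights",
                                 ENNDataType::FLOAT16,  // optimize for DSP
                                 backend);

    // ... query shapes and invoke the model here ...

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```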

int GetInputsNumber(ENNModelId model_id)

Gets the number of model inputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model inputs

int GetOutputsNumber(ENNModelId model_id)

Gets the number of model outputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model outputs

const char *GetInputName(ENNModelId model_id, uint32_t layer_idx)

Gets the input name of a model by input layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index in the model

Returns:

Input layer name specified by index

void GetInputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets the input shape of a model by input layer index. The first element of the shape array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index for the model

  • shape[out] Array that receives the layer shape. It must hold at least MAX_NDIM_VALUE + 1 values, arranged as follows: the first element is the number of input tensor dimensions; the following elements are the sizes of each tensor axis.

const char *GetOutputName(ENNModelId model_id, uint32_t layer_idx)

Gets the output name of a model by output layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

Returns:

Output layer name specified by index

void GetOutputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets the output shape of a model by output layer index. The first element of the shape array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index for a model

  • shape[out] Array that receives the layer shape. It must hold at least MAX_NDIM_VALUE + 1 values, arranged as follows: the first element is the number of output tensor dimensions; the following elements are the sizes of each tensor axis.

Note

ElcoreNN uses the data parallelization paradigm: input data is distributed among DSP cores along the batch dimension.

void InvokeModel(ENNModelId model_id, void **input_data, ENNDataType *input_data_type, float **output_data, uint32_t batch_size)

Runs model inference (input user pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one per model input

  • input_data_type[in] Array of data types, one per model input

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size

The function takes model inputs as arrays of the types defined in the input_data_type array and puts the network's results into float arrays.

void InvokeModel(ENNModelId model_id, float **input_data, float **output_data, uint32_t batch_size)

Runs model inference (input pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one per model input

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size

The function takes model inputs as float arrays and puts the network's results into float arrays.
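
An inference sketch for the float overload. The model files, input size (224x224x3), and output size (1000 classes) are illustrative assumptions; real code should size the buffers from GetInputShape and GetOutputShape:

```cpp
#include <elcorenn/elcorenn.h>
#include <vector>

int main() {
    ENNBackendId backend = InitBackend();
    ENNModelId model = LoadModel("model.json", "model.weights");

    const uint32_t batch = 1;
    std::vector<float> input(batch * 224 * 224 * 3, 0.f);  // fill with real data
    std::vector<float> output(batch * 1000);

    float* inputs[]  = {input.data()};   // one pointer per model input
    float* outputs[] = {output.data()};  // one pointer per model output
    InvokeModel(model, inputs, outputs, batch);

    // output now holds the network's results.

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```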

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, float **output_data, uint32_t batch_size)

Runs model inference (input dmabuf, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size dimension of the input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network's results into float arrays. (Limitation: only one model input; input data must be float.)

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, int *output_dmabuf_fd_array, uint32_t batch_size)

Runs model inference (input dmabuf, output dmabuf).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_dmabuf_fd_array[out] Output data as an array of dmabuf file descriptors (int)

  • batch_size[in] Batch size dimension of the input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network's results into dmabufs. (Limitation: only one model input and only one model output; input data must be float.)

void ReleaseModel(ENNModelId model_id)

Releases the model.

Parameters:

model_id[in] Model ID

void SaveModelStatisticToCSV(ENNModelId model_id, const char *file_path)

Saves model timing statistics to a CSV file.

Parameters:
  • model_id[in] Model ID

  • file_path[in] Path to CSV file

During inference, ElcoreNN collects per-layer core cycle and instruction counts. This can be useful for performance analysis.
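
A sketch of exporting the collected statistics; the CSV path is a placeholder, and statistics are only meaningful after the model has been invoked at least once:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    ENNBackendId backend = InitBackend();
    ENNModelId model = LoadModel("model.json", "model.weights");

    // ... run InvokeModel one or more times so statistics accumulate ...

    // Write per-layer timing statistics for offline analysis.
    SaveModelStatisticToCSV(model, "model_stats.csv");

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```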