API

The ElcoreNN API consists of the following parts:

  • elcorenn/elcorenn.h: ElcoreNN CPU library API.

ElcoreNN CPU library API

elcorenn/elcorenn.h defines the ElcoreNN CPU library API functions.

typedef unsigned int ENNBackendId

Backend ID.

ENNBackendId is an ElcoreNN backend descriptor. An ElcoreNN backend is a runtime used to load and invoke models.

typedef unsigned int ENNModelId

Model ID.

ENNModelId is a model descriptor. The descriptor refers to a model loaded into an ElcoreNN backend.

enum class ENNDataType

Data type enumerator; its values correspond to the ElcoreNN data types.

Values:

enumerator FLOAT32
enumerator FLOAT16
enumerator INT32
enumerator UINT32
enumerator UINT8
enumerator INT8

ENNDataType is a data type enumerator. It describes the types of input data passed to the InvokeModel function and also sets the type of calculations used during inference.

enum class ENNHeapSize

Heap size allocated by each DSP.

Values:

enumerator Size_64MB
enumerator Size_128MB
enumerator Size_256MB
enumerator Size_512MB
enumerator Size_1GB
enumerator Size_2GB
enumerator Size_3GB

ENNBackendId InitBackend(ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across all available DSPs: each DSP computes its own portion of the data, divided along the batch dimension.

Parameters:

dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID
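
A minimal usage sketch of the default overload, assuming the library and headers are installed on the target (this cannot run without ElcoreNN hardware and runtime):

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    // Initialize a backend with the default 512 MB per-DSP heap;
    // DSPs are picked dynamically by the scheduler.
    ENNBackendId backend = InitBackend();

    // ... load and invoke models here ...

    ReleaseBackend(backend);  // frees backend resources and its models
    return 0;
}
```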

ENNBackendId InitBackend(uint32_t devices_count, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across devices_count DSPs: each DSP computes its own portion of the data, divided along the batch dimension.

Parameters:
  • devices_count[in] Number of selected devices.

  • dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID

ENNBackendId InitBackend(uint32_t devices_count, uint32_t *devices, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. Statically assigns the IDs of the DSPs that execute models.

Parameters:
  • devices_count[in] Number of selected devices.

  • devices[in] Indices of DSP cores.

  • dsp_heap[in] Heap size allocated by each DSP.

Returns:

Backend ID
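
A hedged sketch of the static-assignment overload. The core indices below are hypothetical; valid DSP IDs depend on the target SoC:

```cpp
#include <elcorenn/elcorenn.h>
#include <cstdint>

int main() {
    // Pin execution to DSP cores 0 and 1 (indices are illustrative).
    uint32_t devices[] = {0, 1};
    ENNBackendId backend =
        InitBackend(2, devices, ENNHeapSize::Size_256MB);

    // ... load and invoke models here ...

    ReleaseBackend(backend);
    return 0;
}
```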

void ReleaseBackend(ENNBackendId backend_id)

Releases internal backend resources and all models loaded into the backend.

Parameters:

backend_id[in] Backend ID

ENNModelId LoadModel(const char *model_json, const char *model_weights, ENNDataType optimization = ENNDataType::FLOAT16, ENNBackendId backend_id = 0)

Loads model from files.

Parameters:
  • model_json[in] The description of the model saved in json format

  • model_weights[in] The binary file of model’s weights

  • optimization[in] Data type for model optimization (default float16)

  • backend_id[in] Backend used to load model

Returns:

Model ID

The function takes the paths of the model description and model weights files. The optional optimization argument selects the data type for model conversion. Two data types are available: float32 (the model is loaded without changes) and float16 (the model is optimized for DSP inference); the default is float16. The backend_id argument refers to the ElcoreNN backend used to load the model.
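
A loading sketch under the same assumptions as above; the file paths are placeholders for your exported model files:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    ENNBackendId backend = InitBackend();

    // Paths are placeholders; point them at your model description
    // (json) and weights (binary) files.
    ENNModelId model = LoadModel("model.json", "model.weights",
                                 ENNDataType::FLOAT16,  // optimize for DSP
                                 backend);

    // ... query shapes and invoke the model here ...

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```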

int GetInputsNumber(ENNModelId model_id)

Gets the number of model inputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model inputs

int GetOutputsNumber(ENNModelId model_id)

Gets the number of model outputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model outputs

const char *GetInputName(ENNModelId model_id, uint32_t layer_idx)

Gets the input name of a model by input layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index in the model

Returns:

Input layer name specified by index

void GetInputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets the input shape of a model by input layer index. The first element of the shape array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index for the model

  • shape[out] Array that receives the layer shape. It must hold at least MAX_NDIM_VALUE + 1 values, arranged as follows: the first element is the number of input tensor dimensions; the following elements are the sizes of each tensor axis.

const char *GetOutputName(ENNModelId model_id, uint32_t layer_idx)

Gets the output name of a model by output layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

Returns:

Output layer name specified by index

void GetOutputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets the output shape of a model by output layer index. The first element of the shape array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index for a model

  • shape[out] Array that receives the layer shape. It must hold at least MAX_NDIM_VALUE + 1 values, arranged as follows: the first element is the number of output tensor dimensions; the following elements are the sizes of each tensor axis.

Note

ElcoreNN uses the data parallelization paradigm: input data is distributed among DSP cores along the batch dimension.

void InvokeModel(ENNModelId model_id, void **input_data, ENNDataType *input_data_type, float **output_data, uint32_t batch_size)

Runs model inference (input user pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one per model input

  • input_data_type[in] Array of data types, one per model input

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size

The function takes model inputs as arrays of the types defined in the input_data_type array and puts the network's results into float arrays.

void InvokeModel(ENNModelId model_id, float **input_data, float **output_data, uint32_t batch_size)

Runs model inference (input pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one per model input

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size

The function takes model inputs as float arrays and puts the network's results into float arrays.
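
An inference sketch for the float overload. The model files, input size (224x224x3), and output size (1000 classes) are illustrative assumptions; real code should size the buffers from GetInputShape and GetOutputShape:

```cpp
#include <elcorenn/elcorenn.h>
#include <vector>

int main() {
    ENNBackendId backend = InitBackend();
    ENNModelId model = LoadModel("model.json", "model.weights");

    const uint32_t batch = 1;
    std::vector<float> input(batch * 224 * 224 * 3, 0.f);  // fill with real data
    std::vector<float> output(batch * 1000);

    float* inputs[]  = {input.data()};   // one pointer per model input
    float* outputs[] = {output.data()};  // one pointer per model output
    InvokeModel(model, inputs, outputs, batch);

    // output now holds the network's results.

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```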

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, float **output_data, uint32_t batch_size)

Runs model inference (input dmabuf, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_data[out] Array of pointers, one per model output

  • batch_size[in] Batch size dimension of the input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network's results into float arrays. (Limitation: only one model input; input data must be float.)

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, int *output_dmabuf_fd_array, uint32_t batch_size)

Runs model inference (input dmabuf, output dmabuf).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_dmabuf_fd_array[out] Output data as an array of dmabuf file descriptors (int)

  • batch_size[in] Batch size dimension of the input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network's results into dmabufs. (Limitation: only one model input and only one model output; input data must be float.)

void ReleaseModel(ENNModelId model_id)

Releases the model.

Parameters:

model_id[in] Model ID

void SaveModelStatisticToCSV(ENNModelId model_id, const char *file_path)

Saves model timing statistics to a CSV file.

Parameters:
  • model_id[in] Model ID

  • file_path[in] Path to CSV file

During inference, ElcoreNN collects per-layer core cycle and instruction counts. This can be useful for performance analysis.
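
A sketch of exporting the collected statistics; the CSV path is a placeholder, and statistics are only meaningful after the model has been invoked at least once:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
    ENNBackendId backend = InitBackend();
    ENNModelId model = LoadModel("model.json", "model.weights");

    // ... run InvokeModel one or more times so statistics accumulate ...

    // Write per-layer timing statistics for offline analysis.
    SaveModelStatisticToCSV(model, "model_stats.csv");

    ReleaseModel(model);
    ReleaseBackend(backend);
    return 0;
}
```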