API

The ElcoreNN API consists of the following parts:

  • elcorenn/elcorenn.h: ElcoreNN CPU library API.

ElcoreNN CPU library API

elcorenn/elcorenn.h defines the ElcoreNN CPU library API functions.

typedef unsigned int ENNModelId

Model ID.

ENNModelId is a model descriptor. The descriptor refers to a model loaded into DSP (Elcore50) memory. Use the LoadModel function to load a model into DSP memory.

enum class ENNDataType

Describes a data type; values correspond to the ElcoreNN data types.

Values:

enumerator FLOAT32
enumerator FLOAT16
enumerator INT32
enumerator UINT32
enumerator UINT8
enumerator INT8

ENNDataType is a data type enumerator. It describes the types of input data passed to the InvokeModel function, and it also sets the type of calculations performed during inference.
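
For example, a minimal sketch of a per-input type array for the typed InvokeModel overload described below; the two-input model (one uint8 input, one float32 input) is hypothetical:

   #include <elcorenn/elcorenn.h>

   // Hypothetical two-input model: raw image bytes plus float features.
   ENNDataType input_types[2] = {ENNDataType::UINT8,
                                 ENNDataType::FLOAT32};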

enum class ENNHeapSize

Heap size allocated by each DSP.

Values:

enumerator Size_64MB
enumerator Size_128MB
enumerator Size_256MB
enumerator Size_512MB
enumerator Size_1GB
enumerator Size_2GB
enumerator Size_3GB

void InitBackend(ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources. Uses all DSP cores.

Parameters:

dsp_heap[in] Heap size allocated by each DSP

void InitBackend(uint32_t devices_count, uint32_t *devices, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)

Initializes internal DSP resources.

Parameters:
  • devices_count[in] The number of selected devices

  • devices[in] The indices of the DSP cores to use for prediction, from 0 up to (but not including) the number of DSP cores

  • dsp_heap[in] Heap size allocated by each DSP
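
A minimal sketch of both overloads; the core indices below are illustrative, and valid values depend on how many DSP cores the target SoC provides:

   #include <cstdint>
   #include <elcorenn/elcorenn.h>

   // Option A: use all DSP cores with the default 512 MB heap per core.
   void InitAllCores() { InitBackend(); }

   // Option B: use only DSP cores 0 and 1, with a 1 GB heap on each.
   void InitTwoCores() {
     uint32_t devices[] = {0, 1};
     InitBackend(2, devices, ENNHeapSize::Size_1GB);
   }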

void ReleaseDevice()

Releases internal DSP resources.

ENNModelId LoadModel(const char *model_json, const char *model_weights, ENNDataType optimization = ENNDataType::FLOAT16)

Loads a model from files.

Parameters:
  • model_json[in] Path to the model description saved in JSON format

  • model_weights[in] Path to the binary file containing the model’s weights

  • optimization[in] Data type for model optimization (default float16)

Returns:

Model ID

The function takes the paths to the model description and model weights files. The optional parameter optimization controls model data type conversion. Two data types are available: float32 (the model is loaded without changes) and float16 (the model is optimized for DSP inference); the default is float16.
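
A minimal sketch of loading a model; the file names are hypothetical placeholders:

   #include <elcorenn/elcorenn.h>

   ENNModelId LoadExampleModel() {
     // Default optimization (FLOAT16): weights converted for DSP inference.
     ENNModelId id = LoadModel("model.json", "model.bin");

     // Alternatively, keep the model in float32 (loaded without changes):
     // ENNModelId id = LoadModel("model.json", "model.bin",
     //                           ENNDataType::FLOAT32);
     return id;
   }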

int GetInputsNumber(ENNModelId model_id)

Gets the number of model inputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model inputs

int GetOutputsNumber(ENNModelId model_id)

Gets the number of model outputs.

Parameters:

model_id[in] Model ID

Returns:

Number of model outputs

const char *GetInputName(ENNModelId model_id, uint32_t layer_idx)

Gets an input name from a model by input layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index in the model

Returns:

Input layer name specified by index

void GetInputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets an input shape from a model by input layer index. The first element of the array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Input layer index for the model

  • shape[out] Array that receives the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of input tensor dimensions, and the following elements are the sizes of each tensor axis)
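
A minimal sketch that enumerates a model’s inputs and prints each name and shape; MAX_NDIM_VALUE is assumed to be defined in elcorenn/elcorenn.h, and the output-side functions (GetOutputsNumber, GetOutputName, GetOutputShape) are used the same way:

   #include <cstdint>
   #include <cstdio>
   #include <elcorenn/elcorenn.h>

   void PrintInputs(ENNModelId model_id) {
     int inputs = GetInputsNumber(model_id);
     for (int i = 0; i < inputs; ++i) {
       uint32_t shape[MAX_NDIM_VALUE + 1];
       GetInputShape(model_id, i, shape);
       printf("input %d: %s\n", i, GetInputName(model_id, i));
       // shape[0] is the number of dimensions; the axis sizes follow it.
       for (uint32_t d = 1; d <= shape[0]; ++d)
         printf("  dim %u: %u\n", d - 1, shape[d]);
     }
   }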

const char *GetOutputName(ENNModelId model_id, uint32_t layer_idx)

Gets an output name from a model by output layer index.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

Returns:

Output layer name specified by index

void GetOutputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)

Gets an output shape from a model by output layer index. The first element of the array is the number of dimensions of the layer.

Parameters:
  • model_id[in] Model ID

  • layer_idx[in] Output layer index in the model

  • shape[out] Array that receives the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of output tensor dimensions, and the following elements are the sizes of each tensor axis)

void InvokeModel(ENNModelId model_id, void **input_data, ENNDataType *input_data_type, float **output_data, uint32_t batch_size)

Runs model inference (input user pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one for each input data array

  • input_data_type[in] Array of data types, one for each input data array

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size

The function takes model inputs as arrays of the types defined in the input_data_type array and puts the network’s results into float arrays.
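
A minimal sketch of the typed overload, assuming a hypothetical model with one uint8 input of 224x224x3 values and one float output of 1000 values per batch element:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   void RunTyped(ENNModelId model_id) {
     const uint32_t batch = 1;
     std::vector<uint8_t> image(batch * 224 * 224 * 3);  // fill with real data
     std::vector<float> scores(batch * 1000);

     void *inputs[] = {image.data()};
     ENNDataType types[] = {ENNDataType::UINT8};
     float *outputs[] = {scores.data()};

     InvokeModel(model_id, inputs, types, outputs, batch);
   }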

void InvokeModel(ENNModelId model_id, float **input_data, float **output_data, uint32_t batch_size)

Runs model inference (input pointers array, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_data[in] Array of pointers, one for each input data array

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size

The function takes model inputs as float arrays and puts the network’s results into float arrays.
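
A minimal end-to-end sketch using this overload; the model files and tensor sizes are hypothetical placeholders:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   int main() {
     InitBackend();
     ENNModelId model_id = LoadModel("model.json", "model.bin");

     const uint32_t batch = 2;  // distributed across DSP cores (see Note below)
     std::vector<float> in(batch * 3 * 224 * 224);   // hypothetical input size
     std::vector<float> out(batch * 1000);           // hypothetical output size

     float *inputs[] = {in.data()};
     float *outputs[] = {out.data()};
     InvokeModel(model_id, inputs, outputs, batch);

     ReleaseDevice();
     return 0;
   }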

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, float **output_data, uint32_t batch_size)

Runs model inference (input dmabuf, output user pointers array).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_data[out] Array of pointers, one for each output data array

  • batch_size[in] Batch size dimension in input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network’s results into float arrays. (Limitation: only one model input; input data must be float.)
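
A minimal sketch of this overload; allocating and filling the dmabuf is platform-specific and not shown, so input_fd is assumed to already reference a buffer holding float input data, and the output size is a hypothetical placeholder:

   #include <cstdint>
   #include <vector>
   #include <elcorenn/elcorenn.h>

   void RunFromDmabuf(ENNModelId model_id, int input_fd, uint32_t batch) {
     int input_fds[] = {input_fd};           // limitation: one model input
     std::vector<float> out(batch * 1000);   // hypothetical output size
     float *outputs[] = {out.data()};
     InvokeModel(model_id, input_fds, outputs, batch);
   }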

void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, int *output_dmabuf_fd_array, uint32_t batch_size)

Runs model inference (input dmabuf, output dmabuf).

Parameters:
  • model_id[in] Model ID

  • input_dmabuf_fd_array[in] Input data as an array of dmabuf file descriptors (int)

  • output_dmabuf_fd_array[out] Output data as an array of dmabuf file descriptors (int)

  • batch_size[in] Batch size dimension in input data

The function takes model inputs as an array of dmabuf file descriptors and puts the network’s results into dmabufs. (Limitation: only one model input and only one model output; input data must be float.)

Note

ElcoreNN uses the data parallelization paradigm: input data is distributed between DSP cores along the batch dimension.

void SaveModelStatisticToCSV(ENNModelId model_id, const char *file_path)

Saves model timing statistics to a CSV file.

Parameters:
  • model_id[in] Model ID

  • file_path[in] Path to CSV file

During inference, ElcoreNN collects core cycle and instruction counts for each layer. This can be useful for performance analysis.
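
A minimal sketch; the output path is a placeholder, and the call should follow one or more InvokeModel calls so that statistics have been collected:

   #include <elcorenn/elcorenn.h>

   void DumpStats(ENNModelId model_id) {
     SaveModelStatisticToCSV(model_id, "model_stats.csv");
   }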