API
The ElcoreNN API consists of the following parts:
elcorenn/elcorenn.h
: ElcoreNN CPU library API.
ElcoreNN CPU library API
elcorenn/elcorenn.h
defines the ElcoreNN CPU library API functions.
-
typedef unsigned int ENNBackendId
Backend ID.
ENNBackendId
is an ElcoreNN backend descriptor.
An ElcoreNN backend is a runtime that loads and invokes models.
-
typedef unsigned int ENNModelId
Model ID.
ENNModelId
is a model descriptor.
It refers to a model loaded into an ElcoreNN backend.
-
enum class ENNDataType
Describes a data type; values correspond to the ElcoreNN data types.
Values:
-
enumerator FLOAT32
-
enumerator FLOAT16
-
enumerator INT32
-
enumerator UINT32
-
enumerator UINT8
-
enumerator INT8
ENNDataType
is a data type enumerator.
It describes the type of input data passed to the InvokeModel function, and also sets the type used for calculations during inference.
-
enum class ENNHeapSize
Heap size allocated by each DSP.
Values:
-
enumerator Size_64MB
-
enumerator Size_128MB
-
enumerator Size_256MB
-
enumerator Size_512MB
-
enumerator Size_1GB
-
enumerator Size_2GB
-
enumerator Size_3GB
-
ENNBackendId InitBackend(ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)
Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across all available DSPs: each DSP computes its own portion of the data, split along the batch dimension.
- Parameters:
dsp_heap – [in] Heap size allocated by each DSP.
- Returns:
Backend ID
-
ENNBackendId InitBackend(uint32_t devices_count, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)
Initializes internal DSP resources. DSPs are not statically assigned; they are chosen dynamically by the DSP scheduler when executing models. Execution is parallelized across devices_count DSPs: each DSP computes its own portion of the data, split along the batch dimension.
- Parameters:
devices_count – [in] Number of selected devices.
dsp_heap – [in] Heap size allocated by each DSP.
- Returns:
Backend ID
-
ENNBackendId InitBackend(uint32_t devices_count, uint32_t *devices, ENNHeapSize dsp_heap = ENNHeapSize::Size_512MB)
Initializes internal DSP resources and statically assigns the DSP IDs used to execute models.
- Parameters:
devices_count – [in] Number of selected devices.
devices – [in] Indices of DSP cores.
dsp_heap – [in] Heap size allocated by each DSP.
- Returns:
Backend ID
-
void ReleaseBackend(ENNBackendId backend_id)
Releases internal backend resources and all models loaded into the backend.
- Parameters:
backend_id – [in] Backend ID
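Taken together, the initialization and release calls above can be sketched as follows. The core indices, heap size, and overall flow are illustrative assumptions, not part of the API contract:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
  // Statically assign two DSP cores (indices 0 and 1 are assumed here
  // for illustration) with a 256 MB heap per DSP.
  uint32_t devices[] = {0, 1};
  ENNBackendId backend = InitBackend(2, devices, ENNHeapSize::Size_256MB);

  // ... load and invoke models here ...

  // Release the backend and any models still loaded into it.
  ReleaseBackend(backend);
  return 0;
}
```

For the common case, the no-argument overload `InitBackend()` dynamically schedules all available DSPs with the default 512 MB heap.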
-
ENNModelId LoadModel(const char *model_json, const char *model_weights, ENNDataType optimization = ENNDataType::FLOAT16, ENNBackendId backend_id = 0)
Loads a model from files.
- Parameters:
model_json – [in] The description of the model saved in json format
model_weights – [in] The binary file of model’s weights
optimization – [in] Data type for model optimization (default float16)
backend_id – [in] Backend used to load model
- Returns:
Model ID
The function takes paths to the model description file and the model weights file. The optional optimization argument selects the data type for model conversion. Two data types are available: float32 (the model is loaded without changes) and float16 (the model is optimized for DSP inference); the default is float16. The backend_id argument refers to the ElcoreNN backend used to load the model.
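A minimal load/release sketch, assuming hypothetical file names for the model description and weights:

```cpp
#include <elcorenn/elcorenn.h>

int main() {
  ENNBackendId backend = InitBackend();

  // File names are illustrative; they depend on how the model was exported.
  // FLOAT16 optimizes the model for DSP inference (this is also the default).
  ENNModelId model = LoadModel("model.json", "model_weights.bin",
                               ENNDataType::FLOAT16, backend);

  // ... invoke the model here ...

  ReleaseModel(model);
  ReleaseBackend(backend);
  return 0;
}
```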
-
int GetInputsNumber(ENNModelId model_id)
Get number of model inputs.
- Parameters:
model_id – [in] Model ID
- Returns:
Number of model inputs
-
int GetOutputsNumber(ENNModelId model_id)
Get number of model outputs.
- Parameters:
model_id – [in] Model ID
- Returns:
Number of model outputs
-
const char *GetInputName(ENNModelId model_id, uint32_t layer_idx)
Get input name from a model by input layer index.
- Parameters:
model_id – [in] Model ID
layer_idx – [in] Input layer index in the model
- Returns:
Input layer name specified by index
-
void GetInputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)
Get input shape from a model by input layer index. The first element of the array is the number of dimensions of the layer.
- Parameters:
model_id – [in] Model ID
layer_idx – [in] Input layer index for the model
shape – [out] Array to save the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of input tensor dimensions, and the following elements are the sizes of each tensor axis)
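The introspection functions above can be combined to print every input of a loaded model. This sketch assumes MAX_NDIM_VALUE is provided by elcorenn/elcorenn.h and that `model` refers to a previously loaded model:

```cpp
#include <cstdio>
#include <elcorenn/elcorenn.h>

// Print the name and shape of every input of a loaded model.
void PrintInputs(ENNModelId model) {
  int inputs = GetInputsNumber(model);
  for (int i = 0; i < inputs; ++i) {
    uint32_t shape[MAX_NDIM_VALUE + 1];
    GetInputShape(model, i, shape);
    // shape[0] holds the number of dimensions; shape[1..shape[0]] the axes.
    std::printf("input %d (%s): dims =", i, GetInputName(model, i));
    for (uint32_t d = 1; d <= shape[0]; ++d) {
      std::printf(" %u", shape[d]);
    }
    std::printf("\n");
  }
}
```

The same pattern applies to outputs via GetOutputsNumber, GetOutputName, and GetOutputShape.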
-
const char *GetOutputName(ENNModelId model_id, uint32_t layer_idx)
Get output name from a model by output layer index.
- Parameters:
model_id – [in] Model ID
layer_idx – [in] Output layer index in the model
- Returns:
Output layer name specified by index
-
void GetOutputShape(ENNModelId model_id, uint32_t layer_idx, uint32_t *shape)
Get output shape from a model by output layer index. The first element of the array is the number of dimensions of the layer.
- Parameters:
model_id – [in] Model ID
layer_idx – [in] Output layer index for a model
shape – [out] Array to save the model layer shape (it must hold at least MAX_NDIM_VALUE + 1 values; the first element is the number of output tensor dimensions, and the following elements are the sizes of each tensor axis)
Note
ElcoreNN uses the data parallelization paradigm. Input data is distributed between DSP cores along the batch dimension.
-
void InvokeModel(ENNModelId model_id, void **input_data, ENNDataType *input_data_type, float **output_data, uint32_t batch_size)
Runs model inference (input user pointers array, output user pointers array).
- Parameters:
model_id – [in] Model ID
input_data – [in] Array of pointers, one for each input data array
input_data_type – [in] Array of data types, one for each input data array
output_data – [out] Array of pointers, one for each output data array
batch_size – [in] Batch size
The function takes model inputs as arrays of the types defined in the input_data_type array and writes the network’s results into float arrays.
-
void InvokeModel(ENNModelId model_id, float **input_data, float **output_data, uint32_t batch_size)
Runs model inference (input pointers array, output user pointers array).
- Parameters:
model_id – [in] Model ID
input_data – [in] Array of pointers, one for each input data array
output_data – [out] Array of pointers, one for each output data array
batch_size – [in] Batch size
The function takes model inputs as float arrays and writes the network’s results into float arrays.
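For a single-input, single-output model, a call to this overload might look as follows. The per-batch-element sizes `in_size` and `out_size` are hypothetical and must match the shapes reported by GetInputShape and GetOutputShape:

```cpp
#include <vector>
#include <elcorenn/elcorenn.h>

// Run one inference on a single-input, single-output model.
// in_size / out_size: number of floats per batch element (assumed known).
void RunOnce(ENNModelId model, size_t in_size, size_t out_size) {
  const uint32_t batch = 1;
  std::vector<float> input(in_size * batch);   // fill with real data
  std::vector<float> output(out_size * batch); // receives the results

  // InvokeModel expects arrays of pointers, one entry per model input/output.
  float* inputs[]  = {input.data()};
  float* outputs[] = {output.data()};

  InvokeModel(model, inputs, outputs, batch);
}
```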
-
void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, float **output_data, uint32_t batch_size)
Runs model inference (input dmabuf, output user pointers array).
- Parameters:
model_id – [in] Model ID
input_dmabuf_fd_array – [in] Input data as an array of dmabuf file descriptors
output_data – [out] Array of pointers, one for each output data array
batch_size – [in] Batch size dimension in input data
The function takes model inputs as an array of dmabuf file descriptors and writes the network’s results into float arrays. (Limitation: only one model input; input data must be float.)
-
void InvokeModel(ENNModelId model_id, int *input_dmabuf_fd_array, int *output_dmabuf_fd_array, uint32_t batch_size)
Runs model inference (input dmabuf, output dmabuf).
- Parameters:
model_id – [in] Model ID
input_dmabuf_fd_array – [in] Input data as an array of dmabuf file descriptors
output_dmabuf_fd_array – [out] Output data as an array of dmabuf file descriptors
batch_size – [in] Batch size dimension in input data
The function takes model inputs as an array of dmabuf file descriptors and writes the network’s results into dmabufs. (Limitation: only one model input and one model output; input data must be float.)
-
void ReleaseModel(ENNModelId model_id)
Release model.
- Parameters:
model_id – [in] Model ID
-
void SaveModelStatisticToCSV(ENNModelId model_id, const char *file_path)
Saves model timing statistics to a CSV file.
- Parameters:
model_id – [in] Model ID
file_path – [in] Path to CSV file
During inference, ElcoreNN collects per-layer core cycle and instruction counts. This can be useful for performance analysis.
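For example, after running inference, the collected statistics can be dumped in one call. Here `model` is assumed to be a previously loaded and invoked model, and the file path is illustrative:

```cpp
// `model` is an ENNModelId obtained from LoadModel; at least one
// InvokeModel call should have completed so there are statistics to save.
SaveModelStatisticToCSV(model, "model_profile.csv");
```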