Description
I am investigating the per-task time breakdown within a single model, but I cannot profile the DMATask. I have spent a week looking for whatever I might be missing and have not resolved the problem. Below are the minimal steps to reproduce the issue. The platform is an Intel Core Ultra 258V laptop with openvino-nightly and numpy installed.
Step 1: Create the Sigmoid Model
import numpy as np
import openvino as ov
from openvino.runtime import op, opset1
n_len = 128
ov_type = ov.Type.f16
A = op.Parameter(ov_type, ov.Shape([n_len]))
C = opset1.sigmoid(A)
model = ov.Model(C, [A])
ov.save_model(model, "sigmoid.xml")
Step 2: Compile the Sigmoid Model with DMA Profiling Flags
vpux-translate --vpu-arch=NPU40XX \
    --vpux-profiling \
    --mlir-print-debuginfo \
    --import-IE sigmoid.xml -o sigmoid.mlir
vpux-opt --vpu-arch=NPU40XX \
    --default-hw-mode="profiling=true dma-profiling=true" \
    --lower-VPUIP-to-ELF sigmoid.mlir \
    -o sigmoid_out.mlir
vpux-translate --vpu-arch=NPU40XX --export-ELF sigmoid_out.mlir -o sigmoid.blob
Step 3: Run the Model
import os
os.environ['ZE_INTEL_NPU_LOGLEVEL'] = 'ERROR'
import openvino as ov
import numpy as np
core = ov.Core()
core.set_property('NPU', {
    'PERF_COUNT': True,
})
with open('sigmoid.blob', 'rb') as f:
    blob = f.read()
model = core.import_model(blob, device_name='NPU')
req = model.create_infer_request()
req.infer(np.random.random(128).astype(np.float16))
prof_info = req.profiling_info[0]
print('status', prof_info.status)
print('real_time', prof_info.real_time)
print('cpu_time', prof_info.cpu_time)
print('node_name', prof_info.node_name)
print('exec_type', prof_info.exec_type)
print('node_type', prof_info.node_type)
You will encounter the following error:
NPU_LOG: ERROR [compiler.cpp:289] Failed to get decoded profiling data in compiler
In fact, when I submit tasks directly through the Level Zero API and read the profiling results with the Level Zero NPU extension API, the DMA profiling output is all zeros, while the DPU and SW-Kernel profiling outputs are valid. How can I obtain the DMA task profiling results?
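For checking whether any DMA entries appear at all, it may help to look at every entry in the profiling list rather than only index 0. The following is a minimal sketch, not part of the reproduction above: summarize_by_exec_type is a hypothetical helper, the entries are stand-ins built with SimpleNamespace, and the exec_type strings ('DMA', 'DPU', 'SW') and timing values are assumptions chosen for illustration. It also assumes real_time is a datetime.timedelta.

```python
from collections import defaultdict
from datetime import timedelta
from types import SimpleNamespace


def summarize_by_exec_type(entries):
    """Count entries and sum real_time for each exec_type (hypothetical helper)."""
    totals = defaultdict(lambda: {"count": 0, "real_time": timedelta(0)})
    for entry in entries:
        bucket = totals[entry.exec_type]
        bucket["count"] += 1
        bucket["real_time"] += entry.real_time
    return dict(totals)


# Stand-in entries; names, exec_type strings and timings are made up.
fake_entries = [
    SimpleNamespace(node_name="in_copy", exec_type="DMA",
                    real_time=timedelta(microseconds=0)),
    SimpleNamespace(node_name="sigmoid", exec_type="SW",
                    real_time=timedelta(microseconds=12)),
    SimpleNamespace(node_name="out_copy", exec_type="DMA",
                    real_time=timedelta(microseconds=0)),
]

summary = summarize_by_exec_type(fake_entries)
for exec_type, stats in sorted(summary.items()):
    print(exec_type, stats["count"], stats["real_time"])
```

With a real request, req.profiling_info would be passed in place of fake_entries, provided the profiling data decodes successfully.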