Profiler result is not consistent with each run #794
Hello. To answer your last question, event timing uses system clock time. Have a look at armnn/src/armnn/WallClockTimer.hpp for details on how it's done. Ultimately, it goes back to clock_gettime in /usr/include/time.h. In general, if you are looking at overall inference execution times you should run multiple inferences and watch the trend. Depending on the type of model, there may be some operations executed on the first inference that will be cached for subsequent inferences of a loaded model. Colm.
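For illustration only (this is not the WallClockTimer source, just a minimal sketch of the clock_gettime mechanism it ultimately relies on), timing an interval against the system clock looks roughly like this:

    #include <time.h>
    #include <stdio.h>

    int main(void)
    {
        struct timespec start;
        struct timespec stop;

        // Monotonic wall-clock time; unaffected by changes to the system date.
        clock_gettime(CLOCK_MONOTONIC, &start);

        // ... code being timed, e.g. a single inference ...

        clock_gettime(CLOCK_MONOTONIC, &stop);

        double elapsedMs = (stop.tv_sec - start.tv_sec) * 1000.0
                         + (stop.tv_nsec - start.tv_nsec) / 1.0e6;
        printf("elapsed: %.3f ms\n", elapsedMs);
        return 0;
    }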
Hello, "In general if you are looking at overall inference execution times you should run multiple inferences and watch the trend. Depending on the type of model there may be some operations executed on the first inference that will be cached for subsequent inferences of a loaded model." To elaborate, there are two problems I am facing with respect to profiling.
Hello,
I am using the code below to capture the runtime for model inference:
// Create the TensorFlow Lite parser. Note: CreateNetworkFromBinaryFile loads the .tflite file.
armnnTfLiteParser::ITfLiteParserPtr parser = armnnTfLiteParser::ITfLiteParser::Create();
// Create ArmNN runtime
armnn::IRuntime::CreationOptions options; // default options
armnn::IRuntimePtr runtime = armnn::IRuntime::Create(options);
// Get the profiler for the loaded network and register it for the current thread.
// (networkIdentifier is the NetworkId returned by runtime->LoadNetwork(), not shown here.)
std::shared_ptr<armnn::IProfiler> profiler = runtime->GetProfiler(networkIdentifier);
// Enable profiling.
profiler->EnableProfiling(true);
// Run Inference
armnn::InputTensors inputTensor = MakeInputTensors(inputBindingInfo, &input[0]);
armnn::OutputTensors outputTensor = MakeOutputTensors(outputBindingInfo, &output[0]);
armnn::Status ret = runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);
// Print output
profiler->Print(std::cout);
With this I am able to see the profiler result for each layer in JSON format.
Problem: running the same .tflite model on an Arm Cortex-A78 core with CpuAcc as the backend, the measured runtime is different for each run. For one of the models it varies from 0.8 ms to 1.2 ms.
I need to know how the runtime is measured: using the system clock, or using Arm registers?
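Following the advice above, here is a minimal sketch of running several inferences and looking at the spread rather than a single measurement. It reuses runtime, networkIdentifier, inputTensor and outputTensor from the snippet earlier, and the warm-up and run counts are arbitrary choices, not values from this thread:

    #include <algorithm>
    #include <chrono>
    #include <iostream>
    #include <vector>
    #include <armnn/ArmNN.hpp>

    // Runs a few warm-up inferences (the first runs may populate caches),
    // then times a batch of inferences and reports avg/min/max in milliseconds.
    void TimeInferences(armnn::IRuntime* runtime,
                        armnn::NetworkId networkIdentifier,
                        const armnn::InputTensors& inputTensor,
                        const armnn::OutputTensors& outputTensor)
    {
        const int warmupRuns = 3;
        const int timedRuns  = 20;

        for (int i = 0; i < warmupRuns; ++i)
        {
            runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);
        }

        std::vector<double> timesMs;
        for (int i = 0; i < timedRuns; ++i)
        {
            const auto start = std::chrono::steady_clock::now();
            runtime->EnqueueWorkload(networkIdentifier, inputTensor, outputTensor);
            const auto stop = std::chrono::steady_clock::now();
            timesMs.push_back(std::chrono::duration<double, std::milli>(stop - start).count());
        }

        double sum = 0.0;
        for (double t : timesMs) { sum += t; }
        const double avg   = sum / timesMs.size();
        const double minMs = *std::min_element(timesMs.begin(), timesMs.end());
        const double maxMs = *std::max_element(timesMs.begin(), timesMs.end());

        std::cout << "avg " << avg << " ms, min " << minMs
                  << " ms, max " << maxMs << " ms" << std::endl;
    }

Some spread between runs is usually expected from CPU frequency scaling, scheduler placement and cache state rather than from the timer itself, so the trend over many runs is more meaningful than any single measurement.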