This file explains the full format for results generated by this app.
This format is used when saving results locally on device, when uploading them to the online database, and when fetching them from the online database.
Here is an example result file: extended_result_unittest.json
Results are serialized as JSON.
Results must be a map with the following items at the root level:

`meta`
: map

    `uuid`
    : string

        UUID is generated by the app when all benchmarks are finished.

    `upload_date`
    : string

        Datetime of the moment when the web service received the upload request.
        Format is ISO 8601 in UTC timezone: `2022-04-14T03:54:54.687Z`

`results`
: list of maps. See the List of benchmark-specific results section.

`environment_info`
: map. See the Environment info section.

`build_info`
: map. See the Build info section.
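For illustration, here is a minimal sketch of the root-level structure. The `uuid` and `upload_date` values are placeholders, and the contents of `results`, `environment_info`, and `build_info` are left empty because they are described in the sections below; see extended_result_unittest.json for a complete real example.

```json
{
  "meta": {
    "uuid": "123e4567-e89b-12d3-a456-426614174000",
    "upload_date": "2022-04-14T03:54:54.687Z"
  },
  "results": [],
  "environment_info": {},
  "build_info": {}
}
```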
## List of benchmark-specific results

Each benchmark generates one map with results.
For example, if you select 3 benchmarks, you will get a list with 3 items, each describing a specific benchmark.

Almost all of the settings are the same for performance and accuracy runs, so they are combined into a single map.
If Submission mode is disabled, `accuracy_run` will be null for all results.
If you enable Submission mode, both `performance_run` and `accuracy_run` values will be filled.
`benchmark_id`
: string

`benchmark_name`
: string

    Value from `task.model.name` for this benchmark from the selected `tasks.pbtxt` file.

`loadgen_scenario`
: string enum

    See `::mlperf::TestScenario`.
    Allowed values:

    * `SingleStream`
    * `Offline`

`backend_settings`
: map

    Settings defined by the selected backend for this benchmark.

    `accelerator_code`
    : string

    `accelerator_desc`
    : string

    `framework`
    : string

    `delegate`
    : string

    `model_path`
    : string

    `batch_size`
    : integer number

    `extra_settings`
    : list of maps

        Extra settings that can vary between different benchmarks and backends.
        Values set by the backend in `common_setting` must be stored here.
        For example, the `shards_num` value for the TFLite backend should be located here.
        Map structure:

        `id`
        : string. Value from `setting.id` that is passed to the backend.

        `name`
        : string. Value from `setting.name` that is passed to the backend.

        `value`
        : string. Value from `setting.value.value` that is passed to the backend.

        `value_name`
        : string. Value from `setting.value.name` that is passed to the backend.
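As an illustration, a hypothetical `backend_settings` fragment for a TFLite benchmark might look like the following. All values, including the `shards_num` entry and the model URL, are made up and not taken from a real run.

```json
"backend_settings": {
  "accelerator_code": "gpu",
  "accelerator_desc": "GPU",
  "framework": "TFLite",
  "delegate": "GPU",
  "model_path": "https://example.com/models/mobilenet.tflite",
  "batch_size": 2,
  "extra_settings": [
    {
      "id": "shards_num",
      "name": "Number of shards",
      "value": "2",
      "value_name": "2"
    }
  ]
}
```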
`performance_run`
: map

    May be null if performance was not tested in this benchmark.

    `throughput`
    : map

        May be null for an accuracy run.

        `value`
        : floating point number

            Throughput value for this run of the benchmark.

    `accuracy`
    : map

        May be null for a performance run if a groundtruth file is not provided.

        `normalized`
        : floating point number

            Accuracy value for this run of the benchmark. Value must be normalized between `0.0` and `1.0`.

        `formatted`
        : string

            Formatted accuracy string, often with a measuring unit suffix.

    `measured_duration`
    : floating point number

        Actual duration of the benchmark in seconds, from start to finish.

    `measured_samples`
    : integer number

        Actual number of samples evaluated during the benchmark.

    `loadgen`
    : map

        Info provided by loadgen. May be null for accuracy runs.

        `queryCount`
        : integer number

            Number of queries performed.

        `latencyMean`
        : floating point number

            Mean latency in seconds.

        `latency90`
        : floating point number

            90th percentile latency in seconds.

        `isMinDurationMet`
        : bool

            Indicates whether the min duration condition is met or not.

        `isMinQueryMet`
        : bool

            Indicates whether the min query condition is met or not.

        `isEarlyStoppingMet`
        : bool

            Indicates whether the early stopping condition is met or not.

        `isResultValid`
        : bool

            Indicates whether the result is valid or not.

    `start_datetime`
    : string

        Datetime of the moment when the benchmark started.
        Format is ISO 8601 in UTC timezone: `2022-04-14T03:54:54.687Z`

    `dataset`
    : map

        Dataset info for this benchmark from the selected `tasks.pbtxt` file.

        `name`
        : string

        `type`
        : string enum

            Allowed values (this list may be extended when we add support for more datasets):

            * `IMAGENET`
            * `COCO`
            * `ADE20K`
            * `SQUAD`
            * `COCOGEN`

        `data_path`
        : string

        `groundtruth_path`
        : string
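For illustration, a sketch of a filled-in `performance_run` map. All numbers, paths, and the dataset entry are invented placeholders, not results from a real device.

```json
"performance_run": {
  "throughput": { "value": 123.4 },
  "accuracy": { "normalized": 0.89, "formatted": "89.00%" },
  "measured_duration": 61.2,
  "measured_samples": 7520,
  "loadgen": {
    "queryCount": 7520,
    "latencyMean": 0.0081,
    "latency90": 0.0093,
    "isMinDurationMet": true,
    "isMinQueryMet": true,
    "isEarlyStoppingMet": true,
    "isResultValid": true
  },
  "start_datetime": "2022-04-14T03:53:50.000Z",
  "dataset": {
    "name": "Example image dataset",
    "type": "IMAGENET",
    "data_path": "https://example.com/datasets/imagenet/images",
    "groundtruth_path": "https://example.com/datasets/imagenet/groundtruth.txt"
  }
}
```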
`accuracy_run`
: map

    Same structure as `performance_run`. May be null if accuracy was not tested in this benchmark.

`min_duration`
: floating point number

    Value from `task.min_duration` for this benchmark from the selected `tasks.pbtxt` file.

`max_duration`
: floating point number

    Value from `task.max_duration` for this benchmark from the selected `tasks.pbtxt` file.

`min_samples`
: integer number

    Value from `task.min_query_count` for this benchmark from the selected `tasks.pbtxt` file.

`backend_info`
: map

    `filename`
    : string

        Actual filename of the backend.

    `backend_name`
    : string

        Backend name reported by the backend.

    `vendor_name`
    : string

        Vendor name reported by the backend.

    `accelerator_name`
    : string

        Backend-defined string describing the actual accelerator used during this benchmark.
        Should typically match `accelerator_desc` from the `backend_settings` map, but may differ in case of accelerator fallback.
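Putting it together, a single item in the `results` list has roughly the following shape. The `backend_settings` and `performance_run` maps are left empty here because they are sketched above, `accuracy_run` is null as it would be outside Submission mode, and all other values are placeholders.

```json
{
  "benchmark_id": "image_classification",
  "benchmark_name": "Image Classification",
  "loadgen_scenario": "SingleStream",
  "backend_settings": {},
  "performance_run": {},
  "accuracy_run": null,
  "min_duration": 60.0,
  "max_duration": 600.0,
  "min_samples": 1024,
  "backend_info": {
    "filename": "libtflitebackend",
    "backend_name": "TFLite",
    "vendor_name": "Google",
    "accelerator_name": "GPU"
  }
}
```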
## Environment info

Info about the environment the app is running in. May change when you update your OS, change device hardware, or use another device.

`platform`
: string

    Used to determine which `info` entry should be used.
    Currently the device type simply maps to the supported OS list, but this may change in the future.
    Allowed values:

    * `android`
    * `ios`
    * `windows`
`value`
: map

    Info about the device software and underlying hardware. Should contain exactly one valid field, according to the device type.

    `android`
    : map

        Must be null if the device type is not Android.

        `os_version`
        : string

            Must be obtained from the environment.

        `manufacturer`
        : string. Manufacturer of the device.

            Value of `manufacturer` from Android build constants.

        `model_code`
        : string. Manufacturer-defined model code.

            Value of `model` from Android build constants.
            For example: `SM-G981U1`.

        `model_name`
        : string. Human-readable model name.

            Marketing name that corresponds to `model_code`.
            For example: `Galaxy S20 5G`.

        `board_code`
        : string. Value of `board` from Android build constants.

        `proc_cpuinfo_soc_name`
        : string

            SoC name obtained from the `/proc/cpuinfo` file.
            This field may serve as a backup option if the SoC name can't be determined via other methods.
            May be null.

        `props`
        : array of maps

            Contains data obtained via the `getprop` Android util. May be extended by adding vendor-specific properties.

            `type`
            : string

                Type of data entry. Possible values: `soc_manufacturer`, `soc_model`. The list may be extended in the future.

            `name`
            : string

                Name of the property. The main purpose is to gather statistics on which values can actually be seen on a phone. Currently known values:

                * `ro.soc.model`
                * `ro.soc.manufacturer`

            `value`
            : string

                Result of `getprop <name>`.
    `ios`
    : map

        Must be null if the device type is not iOS.

        `os_version`
        : string

            Must be obtained from the environment.

        `model_code`
        : string. Manufacturer-defined model code.

            Apple machine name. For example: `iPhone14,5`.

        `model_name`
        : string. Human-readable model name.

            Marketing name that corresponds to `model_code`. For example: `iPhone 13`.

        `soc_name`
        : string

            Full SoC name.
    `windows`
    : map

        Must be null if the device type is not Windows.

        `os_version`
        : string

            Must be obtained from the environment.

        `cpu_full_name`
        : string

            Should contain the CPU name as reported by the CPU.
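For example, `environment_info` for an Android device could look like the following. All values are invented placeholders; `ios` and `windows` are null because the platform is `android`.

```json
"environment_info": {
  "platform": "android",
  "value": {
    "android": {
      "os_version": "12",
      "manufacturer": "Samsung",
      "model_code": "SM-G981U1",
      "model_name": "Galaxy S20 5G",
      "board_code": "kona",
      "proc_cpuinfo_soc_name": "Qualcomm Technologies, Inc SM8250",
      "props": [
        { "type": "soc_model", "name": "ro.soc.model", "value": "SM8250" },
        { "type": "soc_manufacturer", "name": "ro.soc.manufacturer", "value": "QTI" }
      ]
    },
    "ios": null,
    "windows": null
  }
}
```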
## Build info

Constant info for this build of the app. The only way to change it is to use a different version of the app.

`version`
: string

`build_number`
: string

`official_release_flag`
: bool

    Indicates if the official release flag was set for this build.

`dev_test_flag`
: bool

    Indicates if the development test flag was set for this build.

`backend_list`
: list of strings

    Must contain the actual list of backends that are included in this version of the app.

`git_branch`
: string

`git_commit`
: string

`git_dirty_flag`
: bool

    Indicates if there are any local changes compared to the commit specified in `git_commit`.
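A hypothetical `build_info` fragment; the version, build number, backend filenames, and git values are placeholders, not taken from a real build.

```json
"build_info": {
  "version": "3.0.0",
  "build_number": "42",
  "official_release_flag": false,
  "dev_test_flag": true,
  "backend_list": ["libtflitebackend", "libexamplevendorbackend"],
  "git_branch": "master",
  "git_commit": "0123456789abcdef0123456789abcdef01234567",
  "git_dirty_flag": false
}
```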