Commit
Create nightly job for torch.compile benchmarks (#2835)
* Create nightly job for torch.compile benchmarks
* Run with torch nightly
* Add 300 second timeout in ab command
* Retrigger tests
* Add support for nightly PyTorch in auto_benchmark.py
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Retrigger tests
* Remove auto-validation of benchmark results from workflow
* Retrigger tests
* Revert changes in auto_benchmark.py and remove push trigger in workflow

---------

Co-authored-by: Ubuntu <[email protected]>
Showing 7 changed files with 237 additions and 8 deletions.
@@ -0,0 +1,43 @@
name: Benchmark torch.compile models nightly

on:
  # run every day at 9:15pm
  schedule:
    - cron: '15 21 * * *'

jobs:
  nightly:
    strategy:
      fail-fast: false
    runs-on: [self-hosted, gpu]
    timeout-minutes: 1320
    steps:
      - name: Clean up previous run
        run: |
          echo "Cleaning up previous run"
          cd $RUNNER_WORKSPACE
          pwd
          cd ..
          pwd
          rm -rf _tool
      - name: Setup Python 3.8
        uses: actions/setup-python@v4
        with:
          python-version: 3.8
          architecture: x64
      - name: Setup Java 17
        uses: actions/setup-java@v3
        with:
          distribution: 'zulu'
          java-version: '17'
      - name: Checkout TorchServe
        uses: actions/checkout@v3
        with:
          submodules: recursive
      - name: Install dependencies
        run: |
          sudo apt-get update -y
          sudo apt-get install -y apache2-utils
          pip install -r benchmarks/requirements-ab.txt
      - name: Benchmark gpu nightly
        run: python benchmarks/auto_benchmark.py --input benchmarks/benchmark_config_torch_compile_gpu.yaml --skip false --nightly True
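Reviewer note: GitHub Actions evaluates `schedule` cron expressions in UTC, so `15 21 * * *` fires at 21:15 UTC, and `timeout-minutes: 1320` is a 22-hour budget. The sketch below checks how those two values interact. The `next_daily_run` helper and the example instant are hypothetical and only handle the daily `M H * * *` form used here.

```python
from datetime import datetime, timedelta, timezone

CRON = "15 21 * * *"     # copied from the workflow
TIMEOUT_MINUTES = 1320   # copied from the workflow


def next_daily_run(cron: str, now: datetime) -> datetime:
    """Next fire time for a daily 'M H * * *' expression (hypothetical helper)."""
    minute, hour, dom, month, dow = cron.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError("this sketch only handles daily schedules")
    candidate = now.replace(hour=int(hour), minute=int(minute), second=0, microsecond=0)
    if candidate <= now:
        # Today's slot has already passed, so the next run is tomorrow.
        candidate += timedelta(days=1)
    return candidate


now = datetime(2023, 12, 1, 22, 0, tzinfo=timezone.utc)  # arbitrary example instant
start = next_daily_run(CRON, now)
deadline = start + timedelta(minutes=TIMEOUT_MINUTES)
slack = next_daily_run(CRON, start) - deadline

print(start.isoformat())     # 2023-12-02T21:15:00+00:00
print(deadline.isoformat())  # 2023-12-03T19:15:00+00:00
print(slack)                 # 2:00:00
```

Under these values a run that uses its full timeout still ends two hours before the next scheduled start.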
@@ -0,0 +1,48 @@
# TorchServe version to be installed. It can be one of the following options:
#   - branch : "master"
#   - nightly: "2022.3.16"
#   - release: "0.5.3"
# The nightly build is installed if "ts_version" is not specified.
#ts_version:
#  branch: &ts_version "master"

# a list of model config YAML files defined in benchmarks/models_config
# or a list of model config YAML files with full paths
models:
  - "bert_torch_compile_gpu.yaml"
  - "resnet50_torch_compile_gpu.yaml"
  - "vgg16_torch_compile_gpu.yaml"

# benchmark on "cpu" or "gpu".
# "cpu" is used if "hardware" is not specified
hardware: &hardware "gpu"

# load the prometheus metrics report to remote storage or to a different local path if "metrics_cmd" is set.
# the command line to load the prometheus metrics report to a remote system.
# Here is an example of an AWS CloudWatch command:
# Note:
#   - keep the values in the same order as the command definition.
#   - set up the command before enabling `metrics_cmd`.
#     For example, the AWS client and AWS credentials need to be set up before trying this example.
metrics_cmd:
  - "cmd": "aws cloudwatch put-metric-data"
  - "--namespace": ["torchserve_benchmark_nightly_torch_compile_", *hardware]
  - "--region": "us-east-2"
  - "--metric-data": 'file:///tmp/benchmark/logs/stats_metrics.json'

# load the report to remote storage or to a different local path if "report_cmd" is set.
# the command line to load the report to remote storage.
# Here is an example of an AWS S3 command:
# Note:
#   - keep the values in the same order as the command.
#   - set up the command before enabling `report_cmd`.
#     For example, the AWS client, AWS credentials and the S3 bucket
#     need to be set up before trying this example.
#   - "today()" is a keyword that inserts the current date in the path.
#     For example, the dest path in the following example is
#     s3://torchserve-benchmark/torch-compile-nightly/2022-03-18/gpu
report_cmd:
  - "cmd": "aws s3 cp --recursive"
  - "source": '/tmp/ts_benchmark/'
  - "dest": ['s3://torchserve-benchmark/torch-compile-nightly', "today()", *hardware]
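Reviewer note: the `dest` and `--namespace` values are lists that mix literals, the `"today()"` keyword and the `*hardware` alias. A minimal sketch of how such lists could be resolved follows. It is not taken from `auto_benchmark.py`, which this diff does not show: `expand_dest` is a hypothetical helper, and the `/` separator, the ISO date format (as in the comment's `2022-03-18` example) and the plain concatenation of the namespace parts are assumptions.

```python
from datetime import date

HARDWARE = "gpu"  # stands in for the YAML alias *hardware

DEST = ["s3://torchserve-benchmark/torch-compile-nightly", "today()", HARDWARE]
NAMESPACE = ["torchserve_benchmark_nightly_torch_compile_", HARDWARE]


def expand_dest(parts, today):
    """Replace the "today()" keyword with an ISO date and join with '/' (assumed rules)."""
    resolved = [today.isoformat() if part == "today()" else part for part in parts]
    return "/".join(part.rstrip("/") for part in resolved)


dest = expand_dest(DEST, date(2022, 3, 18))
namespace = "".join(NAMESPACE)  # assumed: namespace parts are concatenated as-is

print(dest)       # s3://torchserve-benchmark/torch-compile-nightly/2022-03-18/gpu
print(namespace)  # torchserve_benchmark_nightly_torch_compile_gpu
```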
@@ -0,0 +1,42 @@
---
bert:
    scripted_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/bert-scripted.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/Huggingface_Transformers/Seq_classification_artifacts/sample_text_captum_input.txt"
        requests: 50000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
    torch_compile_default_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/bert-default.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/Huggingface_Transformers/Seq_classification_artifacts/sample_text_captum_input.txt"
        requests: 50000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
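Reviewer note: with `benchmark_engine: "ab"`, the `requests`, `concurrency` and `input` fields correspond naturally to ApacheBench's `-n`, `-c` and `-p` options, and the "300 second timeout" from the commit message corresponds to `-s`. The sketch below shows one way such a command line could be assembled. It is an illustration, not the command the benchmark script actually builds: `build_ab_command` is hypothetical, the endpoint assumes TorchServe's default inference address with a made-up model name `bert`, and the content type is a placeholder.

```python
import shlex


def build_ab_command(requests, concurrency, input_path, url,
                     timeout_s=300, content_type="text/plain"):
    """Assemble an ApacheBench argv list (hypothetical helper)."""
    return [
        "ab",
        "-n", str(requests),     # total number of requests
        "-c", str(concurrency),  # concurrent requests
        "-s", str(timeout_s),    # socket timeout in seconds
        "-p", input_path,        # file holding the POST body
        "-T", content_type,      # Content-Type header for the POST (placeholder value)
        url,
    ]


cmd = build_ab_command(
    requests=50000,
    concurrency=100,
    input_path="./examples/Huggingface_Transformers/Seq_classification_artifacts/sample_text_captum_input.txt",
    url="http://127.0.0.1:8080/predictions/bert",  # assumed endpoint and model name
)
print(shlex.join(cmd))
```

Building the command as a list and passing it to `subprocess.run` without a shell avoids quoting problems if a path ever contains spaces.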
@@ -0,0 +1,42 @@
---
resnet50:
    scripted_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/resnet-50-scripted.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/image_classifier/kitten.jpg"
        requests: 10000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
    torch_compile_default_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/resnet-50-default.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/image_classifier/kitten.jpg"
        requests: 10000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
@@ -0,0 +1,42 @@
---
vgg16:
    scripted_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/vgg-16-scripted.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/image_classifier/kitten.jpg"
        requests: 10000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
    torch_compile_default_mode:
        benchmark_engine: "ab"
        url: https://torchserve.pytorch.org/mar_files/vgg-16-default.mar
        workers:
            - 4
        batch_delay: 100
        batch_size:
            - 1
            - 2
            - 4
            - 8
            - 16
        input: "./examples/image_classifier/kitten.jpg"
        requests: 10000
        concurrency: 100
        backend_profiling: False
        exec_env: "local"
        processors:
            - "cpu"
            - "gpus": "all"
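Reviewer note: taken together, the three model configs define two modes and five batch sizes per model. Assuming one `ab` run per (model, mode, batch size) combination, which is an assumption about how the configs are consumed and is not shown in this diff, the size of the nightly matrix can be estimated and compared with the workflow's 1320-minute timeout:

```python
# Requests per run, copied from the three model configs.
REQUESTS = {"bert": 50000, "resnet50": 10000, "vgg16": 10000}
MODES = ["scripted_mode", "torch_compile_default_mode"]
BATCH_SIZES = [1, 2, 4, 8, 16]
TIMEOUT_MINUTES = 1320  # from the workflow's timeout-minutes

# Assumption: one ab run per (model, mode, batch size) combination.
runs_per_model = len(MODES) * len(BATCH_SIZES)
total_runs = runs_per_model * len(REQUESTS)
total_requests = sum(n * runs_per_model for n in REQUESTS.values())
minutes_per_run = TIMEOUT_MINUTES / total_runs

print(total_runs)       # 30
print(total_requests)   # 700000
print(minutes_per_run)  # 44.0
```

The 44 minutes per run is only a rough upper bound, since the timeout also has to cover dependency installation, model downloads and compilation warm-up.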