AOTInductor BERT CPP example (#2931)
* fix compile error on mac x86
* update install libtorch
* fmt
* fmt
* fmt
* Set return type of bert model and dynamic shapes
* fix json value
* fix build on linux
* add linux dependency
* replace sentenepice with tokenizers-cpp
* update dependency
* add attention mask
* fix compile error
* fix compile error
* fmt
* Fmt
* tockenizer-cpp git submodule
* update handler
* fmt
* fmt
* fmt
* unset env
* fix path
* Fix type error in bert aot example
* fmt
* fmt
* update max setting
* fix lint
* add limitation
* pinned folly to v2024.02.19.00
* pinned yam-cpp with tags/0.8.0
* pinned yaml-cpp 0.8.0
* update build.sh
* pinned yaml-cpp v0.8.0
* fmt
* fix typo
* add submodule kineto
* fmt
* fix workflow
* fix workflow
* fix ubuntu version
* update readme

---------

Co-authored-by: Matthias Reso
Showing 28 changed files with 602 additions and 53 deletions.
11 changes: 11 additions & 0 deletions
cpp/test/resources/examples/aot_inductor/bert_handler/MAR-INF/MANIFEST.json
@@ -0,0 +1,11 @@
{
  "createdOn": "12/02/2024 21:09:26",
  "runtime": "LSP",
  "model": {
    "modelName": "bertcppaot",
    "handler": "libbert_handler:BertCppHandler",
    "modelVersion": "1.0",
    "configFile": "model-config.yaml"
  },
  "archiverVersion": "0.9.0"
}
4 changes: 4 additions & 0 deletions
cpp/test/resources/examples/aot_inductor/bert_handler/index_to_name.json
@@ -0,0 +1,4 @@
{
  "0": "Not Accepted",
  "1": "Accepted"
}
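The mapping above pairs each class index with a human-readable label. As a minimal sketch of how such a mapping is typically used, assuming the handler picks the class with the highest score (the scores below are made-up values and `label_for` is a hypothetical helper, not code from this commit):

```python
import json

# Contents of index_to_name.json as added in this commit.
INDEX_TO_NAME = json.loads('{"0": "Not Accepted", "1": "Accepted"}')


def label_for(logits, mapping=INDEX_TO_NAME):
    """Return the label of the highest-scoring class (hypothetical helper)."""
    if len(logits) != len(mapping):
        raise ValueError("expected exactly one score per label")
    # Plain-Python argmax; JSON object keys are strings, so convert the index.
    best = max(range(len(logits)), key=lambda i: logits[i])
    return mapping[str(best)]


# Made-up scores, for illustration only.
print(label_for([1.3, -0.7]))
```

Keeping the labels in a separate JSON file means they can be changed without rebuilding the handler library.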
13 changes: 13 additions & 0 deletions
cpp/test/resources/examples/aot_inductor/bert_handler/model-config.yaml
@@ -0,0 +1,13 @@
minWorkers: 1
maxWorkers: 1
batchSize: 2

handler:
  model_so_path: "bert-seq.so"
  tokenizer_path: "tokenizer.json"
  mapping: "index_to_name.json"
  model_name: "bert-base-uncased"
  mode: "sequence_classification"
  do_lower_case: true
  num_labels: 2
  max_length: 150
1 change: 1 addition & 0 deletions
cpp/test/resources/examples/aot_inductor/bert_handler/sample_text.txt
@@ -0,0 +1 @@
Bloomberg has decided to publish a new report on the global economy.
Submodule tokenizers-cpp added at 27dbe1
@@ -0,0 +1,5 @@
set(TOKENZIER_CPP_PATH ${CMAKE_CURRENT_SOURCE_DIR}/../../../../cpp/third-party/tokenizers-cpp)
add_subdirectory(${TOKENZIER_CPP_PATH} tokenizers EXCLUDE_FROM_ALL)
add_library(bert_handler SHARED src/bert_handler.cc)
target_include_directories(bert_handler PRIVATE ${TOKENZIER_CPP_PATH}/include)
target_link_libraries(bert_handler PRIVATE ts_backends_core ts_utils ${TORCH_LIBRARIES} tokenizers_cpp)
@@ -0,0 +1,61 @@
This example uses AOTInductor to compile the [google-bert/bert-base-uncased](https://huggingface.co/google-bert/bert-base-uncased) model into a shared library (`.so` file); see the script [aot_compile_export.py](aot_compile_export.py). With PyTorch 2.2, the largest `MAX_SEQ_LENGTH` this script supports is 511.

The example then loads the compiled model and runs prediction using libtorch. The C++ source code of the handler can be found [here](src).

### Setup
1. Follow the instructions in [README.md](../../../../cpp/README.md) to build the TorchServe C++ backend.

```
cd serve/cpp
./build.sh
```

The build script creates the artifacts needed for this example.
To recreate them by hand, follow the `prepare_test_files` function of the [build.sh](../../../../cpp/build.sh) script.
You will need the handler `.so` file as well as `bert-seq.so` and `tokenizer.json`.

2. Create a [model-config.yaml](model-config.yaml)

```yaml
minWorkers: 1
maxWorkers: 1
batchSize: 2

handler:
  model_so_path: "bert-seq.so"
  tokenizer_path: "tokenizer.json"
  mapping: "index_to_name.json"
  model_name: "bert-base-uncased"
  mode: "sequence_classification"
  do_lower_case: true
  num_labels: 2
  max_length: 150
```
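The `max_length` setting limits how many tokens reach the model, and the commit message notes that an attention mask was added. The sketch below illustrates that idea under stated assumptions: it assumes each token sequence is truncated or padded to `max_length` and that real tokens are marked with 1 in the mask. The helper name `pad_or_truncate`, the pad id `0`, and the sample token ids are illustrative and not taken from the handler source.

```python
def pad_or_truncate(token_ids, max_length=150, pad_id=0):
    """Return (input_ids, attention_mask), each exactly max_length long."""
    ids = list(token_ids)[:max_length]   # drop tokens beyond the limit
    mask = [1] * len(ids)                # 1 marks a real token
    padding = max_length - len(ids)
    # Padding positions get pad_id in the ids and 0 in the mask.
    return ids + [pad_id] * padding, mask + [0] * padding


# Illustrative ids; a short max_length keeps the output readable.
input_ids, attention_mask = pad_or_truncate([101, 2023, 2003, 102], max_length=8)
print(input_ids)
print(attention_mask)
```

With the configured `max_length: 150`, every sequence would come out 150 entries long under this assumption, which keeps the input shape predictable for a compiled model.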
### Generate Model Artifact Folder
```bash
torch-model-archiver --model-name bertcppaot --version 1.0 \
  --handler ../../../../cpp/_build/test/resources/examples/aot_inductor/bert_handler/libbert_handler:BertCppHandler \
  --runtime LSP \
  --extra-files index_to_name.json,../../../../cpp/_build/test/resources/examples/aot_inductor/bert_handler/bert-seq.so,../../../../cpp/_build/test/resources/examples/aot_inductor/bert_handler/tokenizer.json \
  --config-file model-config.yaml \
  --archive-format no-archive
```
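The archiver command is long mainly because the same build directory appears three times. As a sketch of its structure, the snippet below assembles the same arguments from a single base path. It only builds and prints the command line, uses only the flags shown above, and `archiver_command` is a hypothetical helper rather than part of the example.

```python
import shlex

# Build output directory used three times in the README command.
BUILD_DIR = "../../../../cpp/_build/test/resources/examples/aot_inductor/bert_handler"


def archiver_command(build_dir=BUILD_DIR):
    """Return the torch-model-archiver invocation as an argument list."""
    extra_files = ",".join([
        "index_to_name.json",
        f"{build_dir}/bert-seq.so",
        f"{build_dir}/tokenizer.json",
    ])
    return [
        "torch-model-archiver",
        "--model-name", "bertcppaot",
        "--version", "1.0",
        "--handler", f"{build_dir}/libbert_handler:BertCppHandler",
        "--runtime", "LSP",
        "--extra-files", extra_files,
        "--config-file", "model-config.yaml",
        "--archive-format", "no-archive",
    ]


# Print a shell-ready command line; nothing is executed here.
print(shlex.join(archiver_command()))
```

The returned list could be passed to `subprocess.run` from the example directory, provided the build artifacts exist at that path.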

Create a model store directory and move the `bertcppaot` folder into it:

```
mkdir model_store
mv bertcppaot model_store/
```

### Inference

Start TorchServe with the following command:

```
torchserve --ncs --model-store model_store/ --models bertcppaot
```

Run inference with the following command:

```
curl http://localhost:8080/predictions/bertcppaot -T ../../../../cpp/test/resources/examples/aot_inductor/bert_handler/sample_text.txt
Not Accepted
```
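The same request can be issued from a script. `curl -T` uploads the file with an HTTP PUT, so the sketch below builds an equivalent PUT request using only the Python standard library. The helper names `build_request` and `predict` are hypothetical, and `predict` only works while TorchServe is running with the model loaded as described above.

```python
import urllib.request

# Prediction endpoint from the curl command above.
URL = "http://localhost:8080/predictions/bertcppaot"


def build_request(text, url=URL):
    """Build the PUT request carrying the raw text (nothing is sent yet)."""
    return urllib.request.Request(url, data=text.encode("utf-8"), method="PUT")


def predict(text, url=URL, timeout=30):
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(build_request(text, url), timeout=timeout) as resp:
        return resp.read().decode("utf-8").strip()


# Building the request does not need a running server.
req = build_request("Bloomberg has decided to publish a new report on the global economy.")
print(req.get_method(), req.full_url)
```

Calling `predict(...)` with the contents of `sample_text.txt` should return the same label that the curl command prints.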