Make config.json consistent between shortfin and sharktank #487

renxida · 2024-11-12T17:41:28Z

And remove the adaptation layer in build_tools/integration_tests/llm/conftest.py

stbaione · 2024-11-12T20:57:49Z

sharktank/sharktank/examples/export_paged_llm_v1.py

        return {
            "module_name": "module",
            "module_abi_version": 1,
            "max_seq_len": hp.context_length,
-            "attn_head_count": hp.attention_head_count,
+            # "attn_head_count": hp.attention_head_count, # we don't need the attention head count we just need the kvcache attention head count for shortfin


Since the docs are all updated and the references to "attn_head_count" are removed, this commented-out line is probably fine to delete. Otherwise it will linger here for who knows how long.
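For illustration, here is a minimal sketch of what the returned config could look like once the commented-out line is deleted outright. Only the three keys visible in the diff above come from the PR; the `HParams` dataclass, its example values, and the `build_config` helper are hypothetical stand-ins, and the real dict in `export_paged_llm_v1.py` likely carries more keys than the excerpt shows.

```python
import json
from dataclasses import dataclass


@dataclass
class HParams:
    # Hypothetical stand-in for the exporter's hyperparameter object.
    context_length: int
    attention_head_count: int


def build_config(hp: HParams) -> dict:
    # Only the keys visible in the diff excerpt are reproduced here.
    # "attn_head_count" is dropped entirely instead of being left commented out.
    return {
        "module_name": "module",
        "module_abi_version": 1,
        "max_seq_len": hp.context_length,
    }


# Example values are assumptions chosen for the sketch.
config = build_config(HParams(context_length=4096, attention_head_count=32))
print(json.dumps(config, indent=2))
```

Deleting the key rather than commenting it out keeps the exported file and the code that produces it in agreement, which is the point of the review comment.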

stbaione

Looks good! We can probably remove the commented-out line for attn_head_count in export_paged_llm_v1.py.

The added types and docstrings definitely clear things up.

#676) # Description

Did a pass through and made updates + fixes to the user docs for `e2e_llama8b_mi300x.md`.

1. Update install instructions for `shark-ai`
2. Update nightly install instructions for `shortfin` and `sharktank`
3. Update paths for model artifacts to ensure they work with `llama3.1-8b-fp16-instruct`
4. Remove steps to `write edited config`. No longer needed after #487

Added back `sentencepiece` as a requirement for `sharktank`. Not having it caused `export_paged_llm_v1` to break when installing nightly:

```text
ModuleNotFoundError: No module named 'sentencepiece'
```

This was obfuscated when building from source, because `shortfin` includes `sentencepiece` in `requirements-tests.txt`.

stbaione reviewed Nov 12, 2024

stbaione approved these changes Nov 12, 2024

renxida mentioned this pull request Nov 19, 2024

[tracking] Production Grade Shortfin-LLM #245

Open

11 tasks

renxida force-pushed the config-consistency branch from c7b1d3b to 4a1357d Compare December 10, 2024 18:48

renxida marked this pull request as ready for review December 10, 2024 18:53

stbaione reviewed Dec 10, 2024

app_tests/benchmark_tests/llm/sglang_benchmarks/conftest.py (outdated, resolved)

renxida enabled auto-merge (squash) December 11, 2024 08:45

renxida and others added 8 commits December 11, 2024 03:45

the configs should be consistent now

2f112f1

remove attention_head_count (superseded by attention_head_count_kv)

2b2b5d8

add a type hint

f780e0c

fix config writing

1e5b2e9

add a docstring and remove a commented out line

0a33c0f

eliminate config writing

6dcd5f4

new write_config to only update instead of completely overwrite config

4526c44

undo accidental rename due to copy-paste and correct missed config structure change

f7490f6
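The "update instead of completely overwrite" behaviour named in commit 4526c44 can be sketched roughly as follows. This is an assumption about the general technique, not the PR's actual `write_config`: the `update_config` name, its signature, and the example keys and values are hypothetical, and the merge here is shallow (top-level keys only).

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def update_config(path: Path, updates: dict[str, Any]) -> dict[str, Any]:
    """Merge `updates` into the JSON file at `path`, keeping all other keys."""
    # Start from the existing file if there is one, otherwise from an empty config.
    config = json.loads(path.read_text()) if path.exists() else {}
    # Shallow merge: keys in `updates` replace matching top-level keys only.
    config.update(updates)
    path.write_text(json.dumps(config, indent=2))
    return config


with tempfile.TemporaryDirectory() as tmp:
    cfg_path = Path(tmp) / "config.json"
    # Pretend the exporter already wrote a config with two keys.
    cfg_path.write_text(json.dumps({"module_name": "module", "max_seq_len": 2048}))
    # A test fixture then overrides one setting without discarding the rest.
    merged = update_config(cfg_path, {"max_seq_len": 4096})
    on_disk = json.loads(cfg_path.read_text())
```

With this approach a test fixture overrides only the settings it cares about, so keys written by the exporter survive instead of being replaced wholesale by a hand-written dict.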

renxida force-pushed the config-consistency branch from 2d99109 to f7490f6 Compare December 11, 2024 08:45

renxida merged commit 63edf36 into nod-ai:main Dec 11, 2024
19 of 20 checks passed

stbaione mentioned this pull request Dec 11, 2024

Update user docs for running llm server + upgrade gguf to 0.11.0 #676

Merged

renxida mentioned this pull request Dec 12, 2024

Fix cache config inconsistency between shortfin and sharktank #406

Closed

IanNod pushed a commit to IanNod/SHARK-Platform that referenced this pull request Dec 17, 2024

Make config.json consistent between shortfin and sharktank (nod-ai#487)

7f5af13

monorimet pushed a commit that referenced this pull request Jan 8, 2025

Make config.json consistent between shortfin and sharktank (#487)

7e99848
