
Add kubernetes deployment for GenAIComps #1104

Open · wants to merge 2 commits into base: main

Conversation

@yongfengdu (Collaborator)

Description

A summary of the proposed changes, as well as the relevant motivation and context.

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

Select the type of change from the list below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

List any newly introduced third-party dependencies, if they exist.

Tests

Describe the tests that you ran to verify your changes.

comps/3rd_parties/tgi/deployment/kubernetes/README.md (review thread, resolved)
comps/llms/deployment/kubernetes/README.md (review thread, resolved)
@mkbhanda (Collaborator) left a comment

Some questions ..

Any reason why no failureThreshold on the readinessProbe for CPU and gaudi values?

Why do we not mention any image for CPU .. is it in some common file. Likewise all the min/max token lengths.

What about model id with Gaudi?

Thank you @yongfengdu for this PR.

comps/llms/deployment/kubernetes/README.md (review thread, resolved)
@yongfengdu (Collaborator, Author)

Some questions ..

Any reason why no failureThreshold on the readinessProbe for CPU and gaudi values?
These values were tuned by @eero-t and I just borrowed them.
The default failureThreshold is 3.
On a livenessProbe failure, the pod is killed and restarted by the kubelet; I think that's why that failureThreshold was set to a larger value.

Why do we not mention any image for CPU .. is it in some common file. Likewise all the min/max token lengths.
There are default values in values.yaml, which is included in the helm chart. But it might be a good idea to mention them in cpu-values.yaml and gaudi-values.yaml, even if they are the same as the defaults in values.yaml.
I'll refine the *-values.yaml files.
What about model id with Gaudi?

Thank you @yongfengdu for this PR.

@yongfengdu
Copy link
Collaborator Author

Here is the link for default values.
https://github.com/opea-project/GenAIInfra/blob/main/helm-charts/common/tgi/values.yaml
It's better not to set duplicate variables, to reduce maintenance effort.
For example, the TGI image's tag:
tag: "2.4.0-intel-cpu"
It's a value that needs to change frequently, so we don't want it in multiple places.
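To illustrate the single-source-of-truth argument, here is a minimal sketch of how the shared chart defaults and a per-variant overlay could relate. The field layout and repository name are assumptions following common Helm chart conventions; only the tag value comes from this discussion.

```yaml
# helm-charts/common/tgi/values.yaml (shared defaults; sketch, field names assumed)
image:
  repository: ghcr.io/huggingface/text-generation-inference  # assumed repository
  tag: "2.4.0-intel-cpu"  # changes frequently, so it lives only here

# cpu-values.yaml (overlay; sketch) overrides only what differs and
# inherits image.tag from the shared values.yaml above, so a tag bump
# touches exactly one file.
```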


@eero-t (Contributor)

eero-t commented Jan 7, 2025

Any reason why no failureThreshold on the readinessProbe for CPU and gaudi values?

These values were tuned by @eero-t and I just borrowed them.

I was concentrating more on when things actually start working, rather than when they start failing, so it's a rather rough value.

The default failureThreshold is 3.

The defaults could also be mentioned explicitly for readinessProbes.

For Gaudi, I think the readiness probe failureThreshold could be smaller, but I haven't tested it.

For CPU it's definitely better to keep it >1, because performance on CPU is so unpredictable (the underlying HW can differ, and the pods do not have fine-tuned resource requests/limits).

For livenessProbe failure, the pod will be killed/restarted by kubelet, I think that's why the failureThreshold was set to a bigger value.

I've never seen OPEA pods deadlock, especially in a way that would be solved by restarting the pod. Liveness probe restarts just make things worse, as the service may then never reach the ready/live state. IMHO liveness probes are just harmful and could be removed.
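As a concrete illustration of the probe discussion, a hypothetical values fragment is sketched below. The endpoint path, port, and timing values are illustrative, not taken from the chart.

```yaml
# Hypothetical readinessProbe for a TGI pod (all values illustrative).
readinessProbe:
  httpGet:
    path: /health
    port: 2080
  initialDelaySeconds: 5
  periodSeconds: 5
  failureThreshold: 3   # the Kubernetes default, stated explicitly here
# No livenessProbe: per the comment above, restarting a slow or failing
# model server rarely helps and can keep it from ever becoming ready.
```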

@poussa

poussa commented Jan 7, 2025

Why do we have kubernetes/deployment in the pathname? It should be either kubernetes or deployment, not both. Or what is the logic here?

@yongfengdu (Collaborator, Author)

Why do we have kubernetes/deployment in the pathname? It should be either kubernetes or deployment, not both. Or what is the logic here?

It's "deployment/kubernetes".
There is also a docker_compose directory under deployment (deployment/docker_compose), providing instructions for Docker environments.
See
https://github.com/opea-project/GenAIComps/tree/main/comps/3rd_parties/tgi/deployment

@yongfengdu yongfengdu force-pushed the newhelm branch 2 times, most recently from e19ecf2 to 353bc46 Compare January 10, 2025 09:23
@yongfengdu yongfengdu changed the title Add kubernetes deployment for tgi and llm Add kubernetes deployment for GenAIComps Jan 10, 2025
@lianhao (Collaborator) left a comment

For dataprep and retriever, the current values files will be obsolete in a couple of days (a new PR in the infra repo will be submitted next week). Should we remove them from this PR?

5 participants