Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

When deploying a Guaranteed Pod in k8s, Emissary Ingress fails to start. #5812

Open
hushuai1559 opened this issue Dec 20, 2024 · 3 comments
Open
Labels
t:bug Something isn't working

Comments

@hushuai1559
Copy link

hushuai1559 commented Dec 20, 2024

My Kubernetes version is 1.28.9, and I have set the --cpu-manager-policy=static flag for the kubelet during startup., I tried using emissary-ingress versions from 2.2.2 to 3.9.1, but encountered the error "Caught Segmentation fault" in all cases.However, version 2.1.2 can start up normally.

The detailed error in version 3.9.1 is as follows:
[2025-01-06 11:00:35.391][39][critical][backtrace] [./source/server/backtrace.h:104] Caught Segmentation fault, suspect faulting address 0x0
[2025-01-06 11:00:35.391][39][critical][backtrace] [./source/server/backtrace.h:91] Backtrace (use tools/stack_decode.py to get line numbers):
[2025-01-06 11:00:35.391][39][critical][backtrace] [./source/server/backtrace.h:92] Envoy version: 6637fd1bab315774420f3c3d97488fedb7fc710f/1.27.2/Clean/RELEASE/BoringSSL

I noticed that Envoy retrieves the number of cores from the host machine when starting, which may conflict with Kubernetes' Guaranteed QoS. Could this be the cause of the issue?

@dosubot dosubot bot added the t:bug Something isn't working label Dec 20, 2024
@cindymullins-dw
Copy link
Contributor

Hi @hushuai1559, this looks like an Envoy question rather than Emissary-ingress. Can you pls clarify?

@hushuai1559
Copy link
Author

Hi @hushuai1559, this looks like an Envoy question rather than Emissary-ingress. Can you pls clarify?

This issue is likely related to Emissary Ingress. In the same scenario, Emissary Ingress version 2.1.2 starts up normally, but after upgrading to version 2.2.2, a Segmentation fault in Envoy causes Emissary Ingress to fail to start. When I use the latest version, 3.9.1, the same problem occurs. Additionally, when I replace the Envoy in version 3.9.1 with the 1.17.4-dev version of Envoy (which is the version used by Emissary Ingress 2.1.2), it still fails to start eventually. Since Emissary Ingress's Envoy is started based on the xDS approach, it is difficult to rule out Emissary Ingress as the root cause. Moreover, when I start Envoy in a static configuration mode on the same Kubernetes cluster, no errors occur.

@hushuai1559
Copy link
Author

hushuai1559 commented Jan 6, 2025

By using the following command to create a Minikube cluster on a Linux machine and then deploying Emissary Ingress 3.9.1 via the Helm method recommended by the official documentation, you will encounter the same error as described.

minikube start --force
--kubernetes-version=v1.28.9
--container-runtime=containerd
--extra-config=kubelet.kube-reserved='cpu=0.5,memory=512Mi'
--extra-config=kubelet.cpu-manager-policy='static'
--extra-config=kubelet.topology-manager-policy='single-numa-node'

@hushuai1559 hushuai1559 changed the title When deploying a Guaranteed QoS in k8s, Envoy occasionally restarts When deploying a Guaranteed QoS in k8s, Emissary Ingress fails to start. Jan 6, 2025
@hushuai1559 hushuai1559 changed the title When deploying a Guaranteed QoS in k8s, Emissary Ingress fails to start. When deploying a Guaranteed Pod in k8s, Emissary Ingress fails to start. Jan 6, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
t:bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants