Inconsistent and Prolonged Pod Restart Times in Kubernetes Cluster #82

JulieHoaglandSorensen · 2025-01-03T16:42:53Z

Description

Certain pods in Kubernetes clusters managed by Bottlenetes experience extended restart times. The issue does not affect every pod, but it occurs frequently, particularly during volume mounting or readiness probe execution. The long waits disrupt the user experience and may affect high-availability use cases.

Expected Behavior:
Pods should restart within a predictable and minimal time frame (e.g., ~15 seconds), ensuring seamless operation.

Actual Behavior:

Some pods restart within the expected time frame (~15 seconds).
Others experience extended delays, ranging from 40 seconds to over 2 minutes.
Logs indicate delays during volume mounting and readiness probe execution.
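
To see which of the two phases named above dominates a slow restart, the total time can be split using timestamps from the pod's status. The following is a minimal sketch, not part of Bottlenetes: the `RestartTimestamps` shape and the `phaseDurations` helper are hypothetical, and the three values are assumed to be copied by hand from the `PodScheduled` condition, the container's `state.running.startedAt`, and the `Ready` condition. The time before the container starts also includes image pulls, so it is only an upper bound on volume mounting time.

```typescript
// Hypothetical input shape: timestamps copied from a pod's status (ISO 8601 strings).
interface RestartTimestamps {
  scheduledAt: string; // lastTransitionTime of the PodScheduled condition
  startedAt: string;   // containerStatuses[].state.running.startedAt
  readyAt: string;     // lastTransitionTime of the Ready condition
}

interface PhaseDurations {
  startupSeconds: number;   // scheduled -> container running (volume mount, image pull)
  readinessSeconds: number; // container running -> pod Ready (readiness probe)
  totalSeconds: number;     // scheduled -> pod Ready
}

// Split one restart into a startup phase and a readiness phase.
function phaseDurations(t: RestartTimestamps): PhaseDurations {
  const scheduled = Date.parse(t.scheduledAt);
  const started = Date.parse(t.startedAt);
  const ready = Date.parse(t.readyAt);
  if ([scheduled, started, ready].some((v) => Number.isNaN(v))) {
    throw new Error("unparseable timestamp");
  }
  return {
    startupSeconds: (started - scheduled) / 1000,
    readinessSeconds: (ready - started) / 1000,
    totalSeconds: (ready - scheduled) / 1000,
  };
}

// Illustrative values only, not measurements from this issue.
const example = phaseDurations({
  scheduledAt: "2025-01-03T16:00:00Z",
  startedAt: "2025-01-03T16:00:35Z",
  readyAt: "2025-01-03T16:01:10Z",
});
console.log(example);
```

A large `startupSeconds` would point toward volume mounting or image pulls, while a large `readinessSeconds` would point toward the readiness probe configuration.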

Reproduction

Pod Restart Issue

Fork the Bottlenetes repository to your own GitHub account.

Clone your forked repository and follow the Quickstart instructions in the README to set up the application.

Deploy the Application: Deploy the Bottlenetes application on a Kubernetes cluster (e.g., Minikube or AWS EKS):
kubectl apply -f your-k8s-deployment.yaml

Restart Pods: From the Bottlenetes WebUI, select one pod (excluding the pods in the control plane) from the heatmap, and click the "restart pod" button.

Measure Restart Times: Observe and record the restart times for all pods in the deployment.

Identify pods that restart within the expected time frame (~30 seconds).
Note any pods that take significantly longer (~2-5 minutes) to restart.
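
Once restart times are recorded, the pods can be sorted into the two groups described above. This is a minimal sketch under stated assumptions: the `RestartSample` shape, the `splitByThreshold` helper, and the pod names are hypothetical, and the threshold is left as a parameter rather than hard-coded.

```typescript
// Hypothetical record of one measured restart.
interface RestartSample {
  pod: string;
  seconds: number;
}

// Separate samples that met the threshold from those that exceeded it.
// Slow samples are returned slowest first so the worst offenders come up top.
function splitByThreshold(
  samples: RestartSample[],
  thresholdSeconds: number
): { onTime: RestartSample[]; slow: RestartSample[] } {
  const onTime = samples.filter((s) => s.seconds <= thresholdSeconds);
  const slow = samples
    .filter((s) => s.seconds > thresholdSeconds)
    .sort((a, b) => b.seconds - a.seconds);
  return { onTime, slow };
}

// Illustrative pod names and values only, not measurements from this issue.
const measured: RestartSample[] = [
  { pod: "example-api", seconds: 14 },
  { pod: "example-db", seconds: 125 },
  { pod: "example-worker", seconds: 48 },
];
const result = splitByThreshold(measured, 30);
console.log(result.slow.map((s) => `${s.pod}: ${s.seconds}s`));
```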

Review Logs: Check pod logs for any delays related to volume mounting, readiness probe execution, or other potential bottlenecks.
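
Volume mounting and probe problems usually surface as Kubernetes events on the pod rather than in container logs, so it may help to filter events as well. The sketch below assumes the events were exported beforehand (for example with `kubectl get events -o json`) and that only the `reason`, `message`, and `involvedObject.name` fields are needed. The `PodEvent` shape and the `delayEvents` helper are hypothetical, and the reason list (`FailedMount`, `FailedAttachVolume`, `Unhealthy`) is a starting point, not an exhaustive set.

```typescript
// Hypothetical subset of the fields found on a Kubernetes event object.
interface PodEvent {
  reason: string;
  message: string;
  involvedObject: { name: string };
}

// Event reasons commonly associated with slow volume mounts and failing probes.
const DELAY_REASONS = new Set(["FailedMount", "FailedAttachVolume", "Unhealthy"]);

// Keep only the events for one pod whose reason suggests a mount or probe delay.
function delayEvents(events: PodEvent[], podName: string): PodEvent[] {
  return events.filter(
    (e) => e.involvedObject.name === podName && DELAY_REASONS.has(e.reason)
  );
}

// Illustrative events only; the pod names and messages are made up.
const sampleEvents: PodEvent[] = [
  { reason: "Scheduled", message: "assigned to node", involvedObject: { name: "example-db" } },
  { reason: "FailedMount", message: "timed out waiting for volume", involvedObject: { name: "example-db" } },
  { reason: "Unhealthy", message: "Readiness probe failed", involvedObject: { name: "example-db" } },
  { reason: "Unhealthy", message: "Readiness probe failed", involvedObject: { name: "example-api" } },
];
for (const e of delayEvents(sampleEvents, "example-db")) {
  console.log(`${e.reason}: ${e.message}`);
}
```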

System information

Environment: Kubernetes clusters on Minikube and AWS EKS.
OS: macOS.
Node.js Version: v18.17.0.
Cluster Tools: Kubernetes, Helm, Prometheus.

Dependencies

Primary Dependencies:
@emotion/react: v11.14.0
@emotion/styled: v11.14.0
@kubernetes/client-node: v0.22.3
@mui/material: v6.3.0
axios: v1.7.9
bcrypt: v5.1.1
body-parser: v1.20.3
chart.js: v4.4.7
concurrently: v9.1.0
cookie-parser: v1.4.7
cors: v2.8.5
dotenv: v16.4.7
express: v4.21.2
express-session: v1.18.1
express-validator: v7.2.0
jsonwebtoken: v9.0.2
lucide-react: v0.462.0
moment: v2.30.1
openai: v4.74.0
passport: v0.7.0
passport-github2: v0.1.12
passport-google-oauth20: v2.0.0
pg: v8.13.1
pg-format: v1.0.4
pg-hstore: v2.3.4
react: v18.3.1
react-chartjs-2: v5.2.0
react-dom: v18.3.1
react-draggable: v4.4.6
react-icons: v5.4.0
react-markdown: v9.0.1
react-router-dom: v7.0.2
sequelize: v6.37.5
wait-on: v8.0.1
zustand: v5.0.2

DevDependencies:
@eslint/js: v9.15.0
@tailwindcss/typography: v0.5.15
@types/bcrypt: v5.0.2
@types/chart.js: v2.9.41
@types/cookie-parser: v1.4.8
@types/cors: v2.8.17
@types/express: v5.0.0
@types/jsonwebtoken: v9.0.7
@types/passport: v1.0.17
@types/passport-github2: v1.2.9
@types/passport-google-oauth20: v2.0.16
@types/react: v18.3.12
@types/react-dom: v18.3.1
@vitejs/plugin-react: v4.3.4
autoprefixer: v10.4.20
eslint: v9.15.0
eslint-plugin-react: v7.37.2
eslint-plugin-react-hooks: v5.0.0
eslint-plugin-react-refresh: v0.4.14
globals: v15.12.0
nodemon: v3.1.9
postcss: v8.4.49
prettier: v3.4.1
prettier-plugin-tailwindcss: v0.6.9
tailwindcss: v3.4.17
ts-node: v10.9.2
typescript: v5.7.2
typescript-eslint: v8.19.0
vite: v6.0.6

Additional information

Priority: Medium-High
Extended pod restart times impact system reliability for high-availability use cases.

👨‍👧‍👦 Contributing

🙋‍♂️ Yes, I'd love to make a PR to fix this bug!

JulieHoaglandSorensen added the "bug" label on Jan 3, 2025

JulieHoaglandSorensen commented Jan 3, 2025 (edited by random-letter-generator)
