Introduce builder #559

Merged
merged 1 commit into from
Jan 9, 2025

@kvaps kvaps commented Jan 3, 2025

Signed-off-by: Andrei Kvapil [email protected]

Summary by CodeRabbit

  • New Features

    • Added configuration for Kubernetes builder environment
    • Introduced Talos imager configuration with version v1.8.4
    • Implemented garbage collection policies for OCI worker storage management
  • Chores

    • Updated Makefile to streamline image building process
    • Added Kubernetes deployment templates for builder sandbox
  • Infrastructure

    • Created new configuration files for builder package
    • Enhanced build and deployment workflows

Signed-off-by: Andrei Kvapil <[email protected]>
@dosubot dosubot bot added the size:L This PR changes 100-499 lines, ignoring generated files. label Jan 3, 2025
coderabbitai bot commented Jan 3, 2025

Walkthrough

This pull request adds a Kubernetes-based builder environment to Cozystack. It creates a Helm chart for the builder, a Makefile for managing the builder deployment, an OCI worker configuration with storage limits and garbage collection, and a Kubernetes sandbox that runs the Talos imager. The goal is to run image builds on Kubernetes infrastructure with configurable resource management.

Changes

File | Changes
packages/core/builder/Chart.yaml | New Helm chart file with apiVersion: v2, name: builder, and version: 0.0.0
packages/core/builder/Makefile | Added targets for managing builder deployment: help, show, apply, diff, delete, wait-for-builder
packages/core/builder/config.toml | Added OCI worker configuration with garbage collection policies and storage management settings
packages/core/builder/templates/sandbox.yaml | New Kubernetes namespace and deployment configuration for Talos imager
packages/core/builder/values.yaml | Added Talos imager image configuration
packages/core/installer/Makefile | Updated image build targets to use Kubernetes builder and added run-builder target

Sequence Diagram

sequenceDiagram
    participant Makefile as Installer Makefile
    participant Builder as Kubernetes Builder
    participant Imager as Talos Imager Pod

    Makefile->>Builder: run-builder
    Builder-->>Makefile: Builder ready
    Makefile->>Imager: Execute image build commands
    Imager-->>Makefile: Image build complete

Poem

🐰 In Kubernetes' embrace so bright,
A builder hops with all its might
Garbage collected, images spun
Talos dancing, oh what fun!
CodeRabbit's magic, smooth and clean
A builder's dream, a techno scene 🚀


@kvaps kvaps requested a review from klinch0 January 3, 2025 10:33
@dosubot dosubot bot added the enhancement New feature or request label Jan 3, 2025
@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Nitpick comments (4)
packages/core/builder/config.toml (1)

9-11: Consider adding keepDuration to the second GC policy

The second policy sets a larger storage quota (keepBytes = 53687091200, i.e. 50 GiB) but no keepDuration, so cleanup under this policy depends on the size quota alone. Consider adding a duration limit so that artifact age is also taken into account.

  [[worker.oci.gcpolicy]]
    all = true
    keepBytes = 53687091200
+   keepDuration = 2592000  # 30 days retention
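The two numbers in this policy are easier to review once converted to human units. The following is a small sketch, not part of the PR; the helper names `bytes_to_gib` and `seconds_to_days` are made up for illustration, and it assumes keepDuration is given in seconds, as the "30 days" comment in the suggestion implies.

```shell
#!/bin/sh
# Convert a byte count to whole GiB (1 GiB = 1024 * 1024 * 1024 bytes).
bytes_to_gib() {
    echo $(( $1 / 1024 / 1024 / 1024 ))
}

# Convert a duration in seconds to whole days (86400 seconds per day).
seconds_to_days() {
    echo $(( $1 / 86400 ))
}

# Values taken from the GC policy and the suggested keepDuration above.
echo "keepBytes    = $(bytes_to_gib 53687091200) GiB"
echo "keepDuration = $(seconds_to_days 2592000) days"
```

Run as-is, this reports 50 GiB for keepBytes and 30 days for keepDuration.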
packages/core/builder/Makefile (2)

4-4: Add error handling for TALOS_VERSION extraction

The version extraction could fail silently if the YAML file is missing or malformed. Consider adding validation:

 TALOS_VERSION=$(shell awk '/^version:/ {print $$2}' ../installer/images/talos/profiles/installer.yaml)
+$(if $(TALOS_VERSION),,$(error Failed to extract TALOS_VERSION))
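The same validation can be sketched in plain shell, which makes the two failure modes (missing file, missing key) explicit. This is an illustrative sketch only: `extract_version` is a hypothetical helper, not something defined in the PR, and unlike the Makefile expression it stops at the first matching line.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the top-level `version:` key,
# or fail with a message when the file is missing or has no such key.
extract_version() {
    file=$1
    if [ ! -f "$file" ]; then
        echo "error: $file not found" >&2
        return 1
    fi
    # Same awk pattern as the Makefile (without the $$ escaping),
    # stopping at the first match.
    version=$(awk '/^version:/ {print $2; exit}' "$file")
    if [ -z "$version" ]; then
        echo "error: no version key in $file" >&2
        return 1
    fi
    printf '%s\n' "$version"
}
```

A caller could then write `TALOS_VERSION=$(extract_version ../installer/images/talos/profiles/installer.yaml) || exit 1`.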

33-35: Add timeout and feedback to wait conditions

The wait commands could hang indefinitely. Add timeout and better error handling:

 wait-for-builder:
-	kubectl wait deploy --for=condition=Progressing -n $(NAMESPACE) $(NAME)-talos-imager
-	kubectl wait pod --for=condition=Ready -n $(NAMESPACE) -l app=$(NAME)-talos-imager
+	kubectl wait deploy --timeout=5m --for=condition=Progressing -n $(NAMESPACE) $(NAME)-talos-imager || (echo "Deployment failed to progress" && exit 1)
+	kubectl wait pod --timeout=5m --for=condition=Ready -n $(NAMESPACE) -l app=$(NAME)-talos-imager || (echo "Pods failed to become ready" && exit 1)
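An alternative worth considering is `kubectl rollout status`, which blocks until the Deployment has finished rolling out or the timeout expires; the Progressing condition, by contrast, may already be true while the rollout is still under way. The sketch below is an assumption-laden illustration: `wait_for_builder` is a hypothetical shell function, and the NAMESPACE and NAME defaults are inferred from the `cozy-builder` and `builder-talos-imager` names that appear elsewhere in this review.

```shell
#!/bin/sh
# Defaults inferred from names used elsewhere in this review (assumption).
NAMESPACE=${NAMESPACE:-cozy-builder}
NAME=${NAME:-builder}

# Wait for the imager Deployment to finish rolling out.
# $1: optional timeout accepted by kubectl (default 5m).
wait_for_builder() {
    timeout=${1:-5m}
    if ! kubectl rollout status "deploy/$NAME-talos-imager" -n "$NAMESPACE" --timeout="$timeout"; then
        echo "error: deployment $NAME-talos-imager not rolled out within $timeout" >&2
        return 1
    fi
}
```

The function returns non-zero on timeout, so a Makefile recipe calling it fails without needing an extra `|| exit 1`.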
packages/core/installer/Makefile (1)

22-23: Add error handling for configuration updates

The configuration update logic could be more robust:

  1. Add checks for file existence
  2. Validate YAML structure before updating
  3. Consider using yq for both reading and writing to ensure consistent YAML handling
-	IMAGE=$$(awk '/^version:/ {print $$2}' images/talos/profiles/installer.yaml) \
-		yq -i '.talos.imager.image = "ghcr.io/siderolabs/imager:" + strenv(IMAGE)' ../builder/values.yaml
+	@if [ ! -f images/talos/profiles/installer.yaml ]; then \
+		echo "Error: installer.yaml not found"; \
+		exit 1; \
+	fi
+	@if [ ! -f ../builder/values.yaml ]; then \
+		echo "Error: builder values.yaml not found"; \
+		exit 1; \
+	fi
+	IMAGE=$$(yq e '.version' images/talos/profiles/installer.yaml) \
+		yq -i '.talos.imager.image = "ghcr.io/siderolabs/imager:" + env(IMAGE)' ../builder/values.yaml
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 26388c7 and 5cbe9b5.

📒 Files selected for processing (6)
  • packages/core/builder/Chart.yaml (1 hunks)
  • packages/core/builder/Makefile (1 hunks)
  • packages/core/builder/config.toml (1 hunks)
  • packages/core/builder/templates/sandbox.yaml (1 hunks)
  • packages/core/builder/values.yaml (1 hunks)
  • packages/core/installer/Makefile (3 hunks)
✅ Files skipped from review due to trivial changes (2)
  • packages/core/builder/values.yaml
  • packages/core/builder/Chart.yaml
🧰 Additional context used
🪛 yamllint (1.35.1)
packages/core/builder/templates/sandbox.yaml

[error] 12-12: syntax error: expected , but found ''

(syntax)

🔇 Additional comments (2)
packages/core/builder/config.toml (1)

2-3: Specify units for gckeepstorage value

The gckeepstorage parameter value (50000) lacks unit specification. Please clarify if this represents bytes, megabytes, or another unit to avoid ambiguity.

packages/core/installer/Makefile (1)

27-27: Verify builder readiness in image targets

While adding run-builder as a prerequisite is good, consider adding explicit verification that the builder is operational before proceeding with image operations.

Consider adding a check like this to the image targets:

 image-talos: run-builder
+	@kubectl wait --for=condition=ready pod -l app=builder-talos-imager -n cozy-builder --timeout=60s
 	test -f ../../../_out/assets/installer-amd64.tar || make talos-installer
 	skopeo copy docker-archive:../../../_out/assets/installer-amd64.tar docker://$(REGISTRY)/talos:$(call settag,$(TALOS_VERSION))

Also applies to: 42-42, 44-44, 46-46

Comment on lines +31 to +33
securityContext:
privileged: true
command:
⚠️ Potential issue

Review security implications of privileged container

Running containers in privileged mode with host device access poses significant security risks. Consider:

  1. Implementing more granular capabilities instead of full privileged mode
  2. Adding SecurityContext restrictions at the pod level
  3. Restricting the deployment to specific nodes using nodeSelector

Comment on lines +26 to +28
automountServiceAccountToken: false
terminationGracePeriodSeconds: 1
containers:
🛠️ Refactor suggestion

Improve container health management

Current configuration has several operational concerns:

  1. Using sleep infinity makes it difficult to implement proper health checks
  2. No resource limits specified
  3. No readiness/liveness probes defined

Consider adding:

    spec:
      automountServiceAccountToken: false
      terminationGracePeriodSeconds: 1
      containers:
      - name: imagere
        image: "{{ .Values.talos.imager.image }}"
+       resources:
+         limits:
+           cpu: "1"
+           memory: "1Gi"
+         requests:
+           cpu: "100m"
+           memory: "128Mi"
+       livenessProbe:
+         exec:
+           command:
+           - sh
+           - -c
+           - "ps aux | grep -v grep | grep sleep"
+         initialDelaySeconds: 5
+         periodSeconds: 10

Also applies to: 33-35

Comment on lines +29 to +31
delete: ## Remove builder sandbox from existing Kubernetes cluster.
kubectl delete deploy -n $(NAMESPACE) $(NAME)-talos-imager
docker buildx rm buildkit-$(NAME)
🛠️ Refactor suggestion

Enhance cleanup process in delete target

The delete target could leave orphaned resources. Consider:

  1. Adding namespace cleanup if empty
  2. Adding error handling for non-existent resources
  3. Cleaning up any persistent volumes/claims
 delete: ## Remove builder sandbox from existing Kubernetes cluster.
-	kubectl delete deploy -n $(NAMESPACE) $(NAME)-talos-imager
-	docker buildx rm buildkit-$(NAME)
+	-kubectl delete deploy -n $(NAMESPACE) $(NAME)-talos-imager || true
+	-docker buildx rm buildkit-$(NAME) || true
+	@echo "Checking for orphaned resources in namespace..."
+	@if [ -z "$$(kubectl get all -n $(NAMESPACE) -o name)" ]; then \
+		kubectl delete namespace $(NAMESPACE) || true; \
+	fi
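The error-tolerant part of this cleanup can also be sketched as a shell function that relies on `kubectl delete --ignore-not-found` rather than on swallowing every error. This is a sketch under assumptions, not the PR's implementation: `delete_builder` is a hypothetical name, and the NAMESPACE and NAME defaults are inferred from the `cozy-builder` and `builder-talos-imager` names used elsewhere in this review.

```shell
#!/bin/sh
# Defaults inferred from names used elsewhere in this review (assumption).
NAMESPACE=${NAMESPACE:-cozy-builder}
NAME=${NAME:-builder}

# Remove the builder sandbox; safe to run when nothing is deployed.
delete_builder() {
    # --ignore-not-found turns a missing Deployment into a no-op,
    # while other failures (e.g. no cluster access) still surface.
    kubectl delete deploy -n "$NAMESPACE" "$NAME-talos-imager" --ignore-not-found || return 1
    # Tolerate a missing buildx builder explicitly.
    docker buildx rm "buildkit-$NAME" 2>/dev/null \
        || echo "buildx builder buildkit-$NAME not present, skipping"
}
```

Compared with `|| true` on every line, this keeps genuine kubectl failures visible while still making repeated `delete` runs succeed.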

Comment on lines +65 to 66
kubectl exec -i -n cozy-builder deploy/builder-talos-imager -- imager --tar-to-stdout - | \
tar -C ../../../_out/assets -xzf-
🛠️ Refactor suggestion

Enhance reliability of kubectl exec operations

The kubectl exec pipeline could be more robust with explicit error handling: without it, a failure of imager on the left side of the pipe is reported only through tar's exit status.

-		kubectl exec -i -n cozy-builder deploy/builder-talos-imager -- imager --tar-to-stdout - | \
-		tar -C ../../../_out/assets -xzf-
+		if ! kubectl exec -i -n cozy-builder deploy/builder-talos-imager -- \
+			imager --tar-to-stdout - > /tmp/imager.tar; then \
+			echo "Error: imager command failed"; \
+			exit 1; \
+		fi; \
+		if ! tar -C ../../../_out/assets -xzf /tmp/imager.tar; then \
+			echo "Error: failed to extract imager output"; \
+			rm -f /tmp/imager.tar; \
+			exit 1; \
+		fi; \
+		rm -f /tmp/imager.tar
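The capture-then-extract pattern in this suggestion can be factored into a small reusable function. The sketch below is illustrative only: `extract_from` is a hypothetical helper, and it uses `mktemp` for the intermediate file instead of a fixed /tmp path so that concurrent builds do not overwrite each other's output.

```shell
#!/bin/sh
# Hypothetical helper: run a producer command, capture its stdout in a
# temporary file, and extract it as a gzipped tar only if the producer
# succeeded.
# $1: destination directory; remaining arguments: the producer command.
extract_from() {
    dest=$1
    shift
    tmp=$(mktemp) || return 1
    # The producer inherits this function's stdin, so a profile can still
    # be piped in as with `kubectl exec -i`.
    if ! "$@" > "$tmp"; then
        echo "error: producer command failed: $*" >&2
        rm -f "$tmp"
        return 1
    fi
    tar -C "$dest" -xzf "$tmp"
    status=$?
    rm -f "$tmp"
    return $status
}
```

With this helper, the recipe line would read roughly `extract_from ../../../_out/assets kubectl exec -i -n cozy-builder deploy/builder-talos-imager -- imager --tar-to-stdout -`, with whatever is currently piped into kubectl exec piped into `extract_from` instead.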

Comment on lines +68 to +69
run-builder:
make -C ../builder/ apply wait-for-builder
💡 Codebase verification

Declare run-builder as .PHONY target and fix dependency

The run-builder target needs two improvements:

  • It should be declared as .PHONY since it doesn't produce a file output
  • The wait-for-builder target exists in the builder Makefile, but its ordering after apply is only implicit in the single make invocation
+.PHONY: run-builder
 run-builder:
-	make -C ../builder/ apply wait-for-builder
+	make -C ../builder/ apply
+	make -C ../builder/ wait-for-builder

The wait-for-builder target in the builder Makefile waits for Kubernetes resources to be ready, so it must run only after the apply target completes. GNU make processes command-line goals in the order given, so the single invocation already does this in a serial build; two separate invocations make the ordering explicit.

🔗 Analysis chain

Declare run-builder as .PHONY target

The run-builder target should be declared as .PHONY since it doesn't produce a file output.

+.PHONY: run-builder
 run-builder:
 	make -C ../builder/ apply wait-for-builder

Also, verify that the wait-for-builder target exists in the builder Makefile:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check if wait-for-builder target exists in builder Makefile
grep -A 2 "^wait-for-builder:" ../builder/Makefile

Length of output: 105


Script:

#!/bin/bash
# First, let's find the builder Makefile location
fd Makefile --type f

# Then check for wait-for-builder target in all Makefiles
rg "^wait-for-builder:" --type make

Length of output: 2836


Script:

#!/bin/bash
# Let's check the implementation of wait-for-builder target
rg -A 3 "^wait-for-builder:" packages/core/builder/Makefile

Length of output: 249

@kvaps kvaps merged commit 0e72887 into main Jan 9, 2025
2 checks passed
@kvaps kvaps deleted the builder branch January 9, 2025 14:03