Merge branch 'main' into fix-ip
JooyoungPark73 authored Apr 22, 2024
2 parents bb82821 + 28fbb7a commit ad0cf7a
Showing 18 changed files with 176 additions and 10 deletions.
4 changes: 3 additions & 1 deletion CHANGELOG.md
@@ -5,12 +5,14 @@
### Added

- Added support for [`OpenYurt`](https://openyurt.io/), an open platform that extends upstream Kubernetes to run on edge node pools. More details on the `Knative-atop-OpenYurt` mode are described [here](scripts/openyurt-deployer/README.md).
+ - Added support for arm64 ubuntu 18.04 in stock-only setup with [setup scripts](./scripts/setup.go).
+ - Added support for [`K8 Power Manager`](https://networkbuilders.intel.com/solutionslibrary/power-manager-a-kubernetes-power-operator-technology-guide), a Kubernetes operator designed to manage and optimize power consumption in a Kubernetes cluster. More details are described [here](docs/power_manager.md).


### Changed

- Removed the utils and examples from the vHive repo, moved to [vSwarm](https://github.com/vhive-serverless/vSwarm).
- - Bumped Go to 1.21, Kubernetes to v1.29, Knative to v1.13, Istio to 1.20.2, MetalLB to 0.14.3, Calico to 3.27.2.
+ - Bumped Go to 1.21, Kubernetes to v1.29, Knative to v1.13, Istio to 1.20.2, MetalLB to 0.14.3, Calico to 3.27.3.
- Automated the patching of the knative-serving and calico manifests instead of storing the patched manifests in the repo.

### Fixed
2 changes: 1 addition & 1 deletion configs/registry/repository-update-hosts.yaml
@@ -43,7 +43,7 @@ spec:
echo "Done."
containers:
- name: init-container-did-the-work
-    image: gcr.io/google_containers/pause-amd64:3.1@sha256:59eec8837a4d942cc19a52b8c09ea75121acc38114a2c68b98983ce9356b8610
+    image: registry.k8s.io/pause:3.6
terminationGracePeriodSeconds: 30
volumes:
- name: etchosts
2 changes: 1 addition & 1 deletion configs/setup/kube.json
@@ -6,5 +6,5 @@
"ApiserverPort": "6443",
"ApiserverToken": "",
"ApiserverTokenHash": "",
- "CalicoVersion": "3.27.2"
+ "CalicoVersion": "3.27.3"
}
2 changes: 1 addition & 1 deletion configs/setup/system.json
@@ -14,7 +14,7 @@
"KubeRepoUrl": "https://pkgs.k8s.io/core:/stable:/v1.29/deb/",
"PmuToolsRepoUrl": "https://github.com/vhive-serverless/pmu-tools",
"ProtocVersion": "3.19.4",
- "ProtocDownloadUrlTemplate": "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-x86_64.zip",
+ "ProtocDownloadUrlTemplate": "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-%s.zip",
"LogVerbosity": 0,
"YQDownloadUrlTemplate": "https://github.com/mikefarah/yq/releases/latest/download/yq_linux_%s"
}
26 changes: 26 additions & 0 deletions docs/power_manager.md
@@ -0,0 +1,26 @@
# K8s Power Manager

## Components
1. **Power Manager Controller**: ensures that the actual state of the cluster matches the desired state.
2. **Power Config Controller**: watches for the power config created by the user and deploys Power Node Agents onto each specified node using a DaemonSet.
    - power node selector: a key/value map that defines the node labels a node must have for the operator's node agent to be deployed on it.
    - power profiles: the list of power profiles that the user wants available on the nodes.
3. **Power Node Agent**: a containerized application that communicates with the node's Kubelet pod resources endpoint to discover the exact CPUs allocated to each container and tunes the frequency of those cores as requested.
4. **Power Profile**: a predefined configuration that specifies how the system should manage power consumption for components such as CPUs. It includes host-level settings such as CPU frequency and governor.
5. **Power Workload**: the object that defines the lists of CPUs configured with a particular Power Profile. A Power Workload is created for each Power Profile on each node that has the Power Node Agent deployed. A Power Workload is represented in the Intel Power Optimization Library by a Pool. A Pool holds the values of the Power Profile used, its frequencies, and the CPUs that need to be configured. Creating the Pool, or adding to it, then carries out the changes.

## Setup
Execute the following **as a non-root user with sudo rights** using **bash**:
1. Follow the [quickstart guide](quickstart_guide.md) to set up a Knative cluster.
2. On the master node, export `NODE_NAME` as the name of the target node and run the K8s Power Manager setup script:
```bash
export NODE_NAME="<node-name>"  # name of the node to apply the shared profile to
./scripts/power_manager/setup_power_manager.sh
```

This installs and configures the Kubernetes Power Manager, which manages power consumption in a Kubernetes cluster. The script clones the Power Manager repository, sets up the necessary namespace, service account, and role-based access control (RBAC) rules, generates and installs the custom resource definitions, and deploys the Power Manager controller. It also applies a PowerConfig to manage the Power Node Agents, a shared PowerProfile that specifies CPU frequencies, and a shared PowerWorkload that applies the CPU tuning settings.
2 changes: 1 addition & 1 deletion docs/quickstart_guide.md
@@ -7,7 +7,7 @@ To see how to setup a single node cluster with stock-only or gVisor, see [Develo
## I. Host platform requirements
### 1. Hardware
1. Two x64 servers in the same network.
-   - We have not tried vHive with Arm but it may not be hard to port because Firecracker supports Arm64 ISA.
+   - vHive is now compatible with arm64 Ubuntu 18.04 servers for multi-node clusters using the `stock-only` setting. Other configurations and OS versions have not been tested at this time.
2. Hardware support for virtualization and KVM.
- Nested virtualization is supported provided that KVM is available.
3. The root partition of the host filesystem should be mounted on an **SSD**. That is critical for snapshot-based cold-starts.
1 change: 1 addition & 0 deletions go.work
@@ -5,4 +5,5 @@ use (
./scripts
./scripts/github_runner
./scripts/openyurt-deployer
+ ./power_manager
)
1 change: 1 addition & 0 deletions go.work.sum
@@ -76,6 +76,7 @@ github.com/go-logr/logr v1.2.0/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbV
github.com/go-logr/logr v1.2.1/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/logr v1.2.2/go.mod h1:jdQByPbusPIv2/zmleS9BjJVeZ6kBagPoEUsqbVz/1A=
github.com/go-logr/stdr v1.2.0/go.mod h1:YkVgnZu1ZjjL7xTxrfm/LLZBfkhTqSR1ydtm6jTKKwI=
github.com/go-logr/stdr v1.2.2 h1:hSWxHoqTgW2S2qGc0LTAI563KZ5YKYRhT3MFKZMbjag=
github.com/go-logr/stdr v1.2.2/go.mod h1:mMo/vtBO5dYbehREoey6XUKy/eSumjCCveDpRre4VKE=
github.com/go-ole/go-ole v1.2.4/go.mod h1:XCwSNxSkXRo4vlyPy93sltvi/qJq0jqQhjqQNIwKuxM=
github.com/go-openapi/analysis v0.21.2/go.mod h1:HZwRk4RRisyG8vx2Oe6aqeSQcoxRp47Xkp3+K6q+LdY=
3 changes: 3 additions & 0 deletions power_manager/go.mod
@@ -0,0 +1,3 @@
module github.com/vhive-serverless/vhive/power_manager

go 1.21
35 changes: 35 additions & 0 deletions power_manager/util.go
@@ -0,0 +1,35 @@
package power_manager

import (
"fmt"
"os/exec"
)

// SetPowerProfileToNode applies a PowerConfig, a shared PowerProfile with the given
// frequency bounds, and a PowerWorkload that binds the profile to nodeName.
func SetPowerProfileToNode(powerprofileName string, nodeName string, minFreq int64, maxFreq int64) error {
	// powerConfig
	command := "kubectl apply -f - <<EOF\napiVersion: \"power.intel.com/v1\"\nkind: PowerConfig\nmetadata:\n  name: power-config\n  namespace: intel-power\nspec:\n  powerNodeSelector:\n    kubernetes.io/os: linux\n  powerProfiles:\n  - \"performance\"\nEOF"
	cmd := exec.Command("bash", "-c", command)
	_, err := cmd.CombinedOutput()
	if err != nil {
		return err
	}

	// performanceProfile w freq; the template lists max before min, so the arguments follow that order
	command = fmt.Sprintf("kubectl apply -f - <<EOF\napiVersion: \"power.intel.com/v1\"\nkind: PowerProfile\nmetadata:\n  name: %s\n  namespace: intel-power\nspec:\n  name: \"%s\"\n  max: %d\n  min: %d\n  shared: true\n  governor: \"performance\"\nEOF", powerprofileName, powerprofileName, maxFreq, minFreq)
	cmd = exec.Command("bash", "-c", command)

	_, err = cmd.CombinedOutput()
	if err != nil {
		return err
	}

	// apply to node
	command = fmt.Sprintf("kubectl apply -f - <<EOF\napiVersion: \"power.intel.com/v1\"\nkind: PowerWorkload\nmetadata:\n  name: %s-%s-workload\n  namespace: intel-power\nspec:\n  name: \"%s-%s-workload\"\n  allCores: true\n  powerNodeSelector:\n    kubernetes.io/hostname: %s\n  powerProfile: \"%s\"\nEOF", powerprofileName, nodeName, powerprofileName, nodeName, nodeName, powerprofileName)
	cmd = exec.Command("bash", "-c", command)

	_, err = cmd.CombinedOutput()
	if err != nil {
		return err
	}
	return nil
}
12 changes: 11 additions & 1 deletion scripts/configs/system.go
@@ -64,7 +64,15 @@ var System = SystemEnvironmentStruct{
}

func (system *SystemEnvironmentStruct) GetProtocDownloadUrl() string {
- return fmt.Sprintf(system.ProtocDownloadUrlTemplate, system.ProtocVersion, system.ProtocVersion)
+ unameArch := system.CurrentArch
+ switch unameArch {
+ case "amd64":
+ unameArch = "x86_64"
+ case "arm64":
+ unameArch = "aarch_64"
+ default:
+ }
+ return fmt.Sprintf(system.ProtocDownloadUrlTemplate, system.ProtocVersion, system.ProtocVersion, unameArch)
}
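
To make the effect of the new third `%s` concrete, here is a small standalone sketch of the same mapping. It assumes the URL template from `configs/setup/system.json` in this commit; `protocArch` and `protocDownloadURL` are hypothetical names that stand in for the method above and are not part of the repo.

```go
package main

import "fmt"

// protocURLTemplate is copied from configs/setup/system.json in this commit.
const protocURLTemplate = "https://github.com/protocolbuffers/protobuf/releases/download/v%s/protoc-%s-linux-%s.zip"

// protocArch maps a Go-style architecture name to the suffix used in the
// protoc archive name. Unknown values pass through unchanged, as in the diff.
func protocArch(goArch string) string {
	switch goArch {
	case "amd64":
		return "x86_64"
	case "arm64":
		return "aarch_64"
	default:
		return goArch
	}
}

// protocDownloadURL is a hypothetical standalone equivalent of GetProtocDownloadUrl.
func protocDownloadURL(version, goArch string) string {
	return fmt.Sprintf(protocURLTemplate, version, version, protocArch(goArch))
}

func main() {
	fmt.Println(protocDownloadURL("3.19.4", "arm64"))
}
```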

func (system *SystemEnvironmentStruct) GetContainerdDownloadUrl() string {
@@ -80,6 +88,8 @@ func (system *SystemEnvironmentStruct) GetRunscDownloadUrl() string {
switch unameArch {
case "amd64":
unameArch = "x86_64"
+ case "arm64":
+ unameArch = "aarch_64"
default:
}

3 changes: 3 additions & 0 deletions scripts/install_go.sh
@@ -32,6 +32,9 @@ case $arch in
'x86_64')
arch='amd64'
;;
+ 'aarch64')
+ arch='arm64'
+ ;;
*)
echo "Unsupported architecture $arch"
exit 1
12 changes: 12 additions & 0 deletions scripts/power_manager/power_config.yaml
@@ -0,0 +1,12 @@
apiVersion: "power.intel.com/v1"
kind: PowerConfig
metadata:
name: power-config
namespace: intel-power
spec:
# Add labels here for the Nodes you want the PowerNodeAgent to be applied to
powerNodeSelector:
kubernetes.io/os: linux
# Add wanted PowerProfiles here; valid entries are as follows: performance, balance-performance, balance-power
powerProfiles:
- "performance"
49 changes: 49 additions & 0 deletions scripts/power_manager/setup_power_manager.sh
@@ -0,0 +1,49 @@
#!/bin/bash

# MIT License
#
# Copyright (c) 2024 Eang Sokunthea
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.

# Install K8 Power Manager
git clone https://github.com/intel/kubernetes-power-manager $HOME/kubernetes-power-manager

# Set up the necessary Namespace, Service Account, and RBAC rules for the Kubernetes Power Manager
kubectl apply -f $HOME/kubernetes-power-manager/config/rbac/namespace.yaml
kubectl apply -f $HOME/kubernetes-power-manager/config/rbac/rbac.yaml

# Generate the CRD templates, install the Custom Resource Definitions, and build the Docker images locally
cd $HOME/kubernetes-power-manager
make

# Apply Power Manager Controller
kubectl apply -f $HOME/kubernetes-power-manager/config/manager/manager.yaml

# Apply PowerConfig -> create the power-node-agent DaemonSet that manages the Power Node Agent pods.
kubectl apply -f $HOME/vhive/scripts/power_manager/power_config.yaml

# Apply the shared PowerProfile. You can modify the spec in shared-profile.yaml
kubectl apply -f $HOME/vhive/scripts/power_manager/shared-profile.yaml

# Apply the shared PowerWorkload. All CPUs (except reservedCPUs specified in this yaml file) will be tuned to the specified frequency in shared-profile.yaml
envsubst < $HOME/vhive/scripts/power_manager/shared-workload.yaml | kubectl apply -f -

kubectl get powerprofiles -n intel-power
kubectl get powerworkloads -n intel-power
11 changes: 11 additions & 0 deletions scripts/power_manager/shared-profile.yaml
@@ -0,0 +1,11 @@
apiVersion: power.intel.com/v1
kind: PowerProfile
metadata:
name: performance
namespace: intel-power
spec:
name: "performance"
max: 1200
min: 1200
shared: true
governor: "performance"
13 changes: 13 additions & 0 deletions scripts/power_manager/shared-workload.yaml
@@ -0,0 +1,13 @@
apiVersion: "power.intel.com/v1"
kind: PowerWorkload
metadata:
name: performance-$NODE_NAME-workload
namespace: intel-power
spec:
name: "performance-$NODE_NAME-workload"
allCores: true
powerNodeSelector:
# The label must be as below, as this workload will be specific to the Node
kubernetes.io/hostname: $NODE_NAME
# Replace this value with the intended shared PowerProfile
powerProfile: "performance"
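
The setup script fills in `$NODE_NAME` with `envsubst` before piping this manifest to `kubectl`. For callers that prefer to stay in Go, the substitution can be approximated with `os.Expand` from the standard library. This is only a sketch: `expandNodeName` is a hypothetical helper, the template is an excerpt of the file above, and unlike `envsubst` it leaves variables other than `NODE_NAME` in place rather than replacing them from the environment.

```go
package main

import (
	"fmt"
	"os"
)

// workloadTemplate is an excerpt of scripts/power_manager/shared-workload.yaml.
const workloadTemplate = `metadata:
  name: performance-$NODE_NAME-workload
spec:
  powerNodeSelector:
    kubernetes.io/hostname: $NODE_NAME
`

// expandNodeName replaces $NODE_NAME (or ${NODE_NAME}) with nodeName.
// Any other variable reference is re-emitted as $KEY instead of being expanded.
func expandNodeName(tmpl, nodeName string) string {
	return os.Expand(tmpl, func(key string) string {
		if key == "NODE_NAME" {
			return nodeName
		}
		return "$" + key
	})
}

func main() {
	fmt.Print(expandNodeName(workloadTemplate, "node-1"))
}
```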
4 changes: 2 additions & 2 deletions scripts/utils/system.go
@@ -92,9 +92,9 @@ func ExecShellCmd(cmd string, pars ...any) (string, error) {
// Detect current architecture
func DetectArch() error {
switch configs.System.CurrentArch {
- case "amd64":
+ case "amd64", "arm64":
default:
- // Only amd64(x86_64) are supported at present
+ // amd64(x86_64) and arm64(aarch64) are supported at present
FatalPrintf("Unsupported architecture: %s\n", configs.System.CurrentArch)
return &ShellError{"Unsupported architecture", 1}
}
4 changes: 2 additions & 2 deletions scripts/utils/utils.go
@@ -166,10 +166,10 @@ func TurnOffAutomaticUpgrade() error {
}

func InstallYQ() {
- InfoPrintf("Downloading yq for yaml parsing of template")
+ WaitPrintf("Downloading yq for yaml parsing of template")
yqUrl := fmt.Sprintf(configs.System.YqDownloadUrlTemplate, configs.System.CurrentArch)
_, err := ExecShellCmd(`sudo wget %s -O /usr/bin/yq && sudo chmod +x /usr/bin/yq`, yqUrl)
- CheckErrorWithMsg(err, "Failed to add yq!\n")
+ CheckErrorWithTagAndMsg(err, "Failed to add yq!\n")
}

func GetNodeIP() (string, error) {