✨ Don't assume KubeadmControlPlane for nodeReuse #1311

Merged 1 commit on Dec 12, 2023
43 changes: 20 additions & 23 deletions baremetal/metal3machine_manager.go

@@ -578,7 +578,7 @@ func (m *MachineManager) Delete(ctx context.Context) error {
 	if m.Cluster.DeletionTimestamp.IsZero() {
 		// Fetch corresponding Metal3MachineTemplate, to see if nodeReuse
 		// feature is enabled. If set to true, check the machine role. In case
-		// machine role is ControlPlane, set nodeReuseLabelName to KubeadmControlPlane
+		// machine role is ControlPlane, set nodeReuseLabelName to ControlPlane
 		// name, otherwise to MachineDeployment name.
 		m.Log.Info("Getting Metal3MachineTemplate")
 		m3mt := &infrav1.Metal3MachineTemplate{}
@@ -609,15 +609,15 @@ func (m *MachineManager) Delete(ctx context.Context) error {
 		}
 		// Check if machine is ControlPlane
 		if m.isControlPlane() {
-			// Fetch KubeadmControlPlane name for controlplane machine
-			m.Log.Info("Fetch KubeadmControlPlane name while deprovisioning host", "host", host.Name)
-			kcpName, err := m.getKubeadmControlPlaneName(ctx)
+			// Fetch ControlPlane name for controlplane machine
+			m.Log.Info("Fetch ControlPlane name while deprovisioning host", "host", host.Name)
+			cpName, err := m.getControlPlaneName(ctx)
 			if err != nil {
 				return err
 			}
-			// Set nodeReuseLabelName on the host to KubeadmControlPlane name
-			m.Log.Info("Setting nodeReuseLabelName in host to fetched KubeadmControlPlane", "host", host.Name, "kubeadmControlPlane", kcpName)
-			host.Labels[nodeReuseLabelName] = kcpName
+			// Set nodeReuseLabelName on the host to ControlPlane name
+			m.Log.Info("Setting nodeReuseLabelName in host to fetched ControlPlane", "host", host.Name, "controlPlane", cpName)
+			host.Labels[nodeReuseLabelName] = cpName
 		} else {
 			// Fetch MachineDeployment name for worker machine
 			m.Log.Info("Fetch MachineDeployment name while deprovisioning host", "host", host.Name)
@@ -949,7 +949,7 @@ func consumerRefMatches(consumer *corev1.ObjectReference, m3machine *infrav1.Met
 	return true
 }

-// nodeReuseLabelMatches returns true if nodeReuseLabelName matches KubeadmControlPlane or MachineDeployment name on the host.
+// nodeReuseLabelMatches returns true if nodeReuseLabelName matches ControlPlane or MachineDeployment name on the host.
 func (m *MachineManager) nodeReuseLabelMatches(ctx context.Context, host *bmov1alpha1.BareMetalHost) bool {
 	if host == nil {
 		return false
@@ -958,17 +958,17 @@ func (m *MachineManager) nodeReuseLabelMatches(ctx context.Context, host *bmov1a
 		return false
 	}
 	if m.isControlPlane() {
-		kcp, err := m.getKubeadmControlPlaneName(ctx)
+		cp, err := m.getControlPlaneName(ctx)
 		if err != nil {
 			return false
 		}
 		if host.Labels[nodeReuseLabelName] == "" {
 			return false
 		}
-		if host.Labels[nodeReuseLabelName] != kcp {
+		if host.Labels[nodeReuseLabelName] != cp {
 			return false
 		}
-		m.Log.Info("nodeReuseLabelName on the host matches KubeadmControlPlane name", "host", host.Name, "kubeadmControlPlane", kcp)
+		m.Log.Info("nodeReuseLabelName on the host matches ControlPlane name", "host", host.Name, "controlPlane", cp)
 		return true
 	}
 	md, err := m.getMachineDeploymentName(ctx)
@@ -1650,32 +1650,29 @@ func (m *MachineManager) DissociateM3Metadata(ctx context.Context) error {
 	return deleteObject(ctx, m.client, metal3DataClaim)
 }

-// getKubeadmControlPlaneName retrieves the KubeadmControlPlane object corresponding to the CAPI machine.
-func (m *MachineManager) getKubeadmControlPlaneName(_ context.Context) (string, error) {
-	m.Log.Info("Fetching KubeadmControlPlane name")
+// getControlPlaneName retrieves the ControlPlane object corresponding to the CAPI machine.
+func (m *MachineManager) getControlPlaneName(_ context.Context) (string, error) {
+	m.Log.Info("Fetching ControlPlane name")
 	if m.Machine == nil {
 		return "", errors.New("Could not find corresponding machine object")
 	}
 	if m.Machine.ObjectMeta.OwnerReferences == nil {
 		return "", errors.New("Machine owner reference is not populated")
 	}
 	for _, mOwnerRef := range m.Machine.ObjectMeta.OwnerReferences {
-		if mOwnerRef.Kind != "KubeadmControlPlane" {
-			continue
-		}
 		aGV, err := schema.ParseGroupVersion(mOwnerRef.APIVersion)
 		if err != nil {
 			return "", errors.New("Failed to parse the group and version")
 		}
 		if aGV.Group != controlplanev1.GroupVersion.Group {
 			continue
 		}
-		// adding prefix to KubeadmControlPlane name in order to be able to differentiate
-		// KubeadmControlPlane and MachineDeployment in case they have the same names set in the cluster.
-		m.Log.Info("Fetched KubeadmControlPlane name", "kubeadmControlPlane", "kcp-"+mOwnerRef.Name)
-		return "kcp-" + mOwnerRef.Name, nil
+		// adding prefix to ControlPlane name in order to be able to differentiate
+		// ControlPlane and MachineDeployment in case they have the same names set in the cluster.
+		m.Log.Info("Fetched ControlPlane name", "controlPlane", "cp-"+mOwnerRef.Name)
+		return "cp-" + mOwnerRef.Name, nil
 	}
-	return "", errors.New("KubeadmControlPlane name is not found")
+	return "", errors.New("ControlPlane name is not found")
 }
@@ -1704,7 +1701,7 @@ func (m *MachineManager) getMachineDeploymentName(ctx context.Context) (string,
 			continue
 		}
 		// adding prefix to MachineDeployment name in order to be able to differentiate
-		// MachineDeployment and KubeadmControlPlane in case they have the same names set in the cluster.
+		// MachineDeployment and ControlPlane in case they have the same names set in the cluster.
 		m.Log.Info("Fetched MachineDeployment name", "machinedeployment", "md-"+msOwnerRef.Name)
 		return "md-" + msOwnerRef.Name, nil
 	}
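The essence of the change above is that the owner-reference scan no longer requires `Kind == "KubeadmControlPlane"`; it only checks that the owner's API group is `controlplane.cluster.x-k8s.io`, so any control plane provider qualifies. A minimal standalone sketch of that matching logic (group parsing done with plain string handling instead of the real `schema.ParseGroupVersion`, and the owner list is hypothetical):

```go
package main

import (
	"fmt"
	"strings"
)

// ownerRef mirrors the fields of metav1.OwnerReference that the matching logic uses.
type ownerRef struct {
	APIVersion string
	Kind       string
	Name       string
}

// controlPlaneName returns the prefixed control-plane name for the first owner
// whose API group is controlplane.cluster.x-k8s.io, regardless of its Kind.
func controlPlaneName(owners []ownerRef) (string, bool) {
	for _, o := range owners {
		group := o.APIVersion
		if i := strings.Index(group, "/"); i >= 0 {
			group = group[:i] // "group/version" -> "group"
		} else {
			group = "" // the core API group has no prefix
		}
		if group != "controlplane.cluster.x-k8s.io" {
			continue
		}
		// The cp- prefix distinguishes control planes from MachineDeployments
		// (which get an md- prefix) when both share a name in the cluster.
		return "cp-" + o.Name, true
	}
	return "", false
}

func main() {
	owners := []ownerRef{
		{APIVersion: "cluster.x-k8s.io/v1beta1", Kind: "MachineSet", Name: "workers"},
		{APIVersion: "controlplane.cluster.x-k8s.io/v1beta1", Kind: "AnyControlPlane", Name: "test1"},
	}
	name, ok := controlPlaneName(owners)
	fmt.Println(name, ok) // cp-test1 true
}
```

This mirrors why the "any Kind/Provider" test entry above now succeeds: the `Kind: "AnyControlPlane"` owner is accepted because only the group is compared.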
76 changes: 39 additions & 37 deletions baremetal/metal3machine_manager_test.go

@@ -52,7 +52,7 @@ const (
 	testUserDataSecretName = "worker-user-data"
 	testMetaDataSecretName = "worker-metadata"
 	testNetworkDataSecretName = "worker-network-data"
-	kcpName = "kcp-pool1"
+	cpName = "cp-pool1"
 )

 var Bmhuid = types.UID("4d25a2c2-46e4-11ec-81d3-0242ac130003")
@@ -561,11 +561,11 @@ var _ = Describe("Metal3Machine manager", func() {
 			},
 		},
 	}
-	hostWithNodeReuseLabelSetToKCP := bmov1alpha1.BareMetalHost{
+	hostWithNodeReuseLabelSetToCP := bmov1alpha1.BareMetalHost{
 		ObjectMeta: metav1.ObjectMeta{
-			Name:      "hostWithNodeReuseLabelSetToKCP",
+			Name:      "hostWithNodeReuseLabelSetToCP",
 			Namespace: namespaceName,
-			Labels:    map[string]string{nodeReuseLabelName: "kcp-test1"},
+			Labels:    map[string]string{nodeReuseLabelName: "cp-test1"},
 		},
 		Spec: bmov1alpha1.BareMetalHostSpec{},
 		Status: bmov1alpha1.BareMetalHostStatus{
@@ -578,7 +578,7 @@ var _ = Describe("Metal3Machine manager", func() {
 		ObjectMeta: metav1.ObjectMeta{
 			Name:      "hostWithNodeReuseLabelStateNone",
 			Namespace: namespaceName,
-			Labels:    map[string]string{nodeReuseLabelName: "kcp-test1"},
+			Labels:    map[string]string{nodeReuseLabelName: "cp-test1"},
 		},
 		Spec: bmov1alpha1.BareMetalHostSpec{},
 		Status: bmov1alpha1.BareMetalHostStatus{
@@ -654,7 +654,7 @@ var _ = Describe("Metal3Machine manager", func() {
 			Expect(result.Name).To(Equal(tc.ExpectedHostName))
 		}
 	},
-	Entry("Pick hostWithNodeReuseLabelSetToKCP, which has a matching nodeReuseLabelName", testCaseChooseHost{
+	Entry("Pick hostWithNodeReuseLabelSetToCP, which has a matching nodeReuseLabelName", testCaseChooseHost{
 		Machine: &clusterv1.Machine{
 			ObjectMeta: metav1.ObjectMeta{
 				Name: machineName,
@@ -675,9 +675,9 @@ var _ = Describe("Metal3Machine manager", func() {
 				InfrastructureRef: *infrastructureRef,
 			},
 		},
-		Hosts:            &bmov1alpha1.BareMetalHostList{Items: []bmov1alpha1.BareMetalHost{hostWithOtherConsRef, *availableHost, hostWithNodeReuseLabelSetToKCP}},
+		Hosts:            &bmov1alpha1.BareMetalHostList{Items: []bmov1alpha1.BareMetalHost{hostWithOtherConsRef, *availableHost, hostWithNodeReuseLabelSetToCP}},
 		M3Machine:        m3mconfig,
-		ExpectedHostName: hostWithNodeReuseLabelSetToKCP.Name,
+		ExpectedHostName: hostWithNodeReuseLabelSetToCP.Name,
 	}),
 	Entry("Requeue hostWithNodeReuseLabelStateNone, which has a matching nodeReuseLabelName", testCaseChooseHost{
 		Machine: &clusterv1.Machine{
@@ -757,7 +757,7 @@ var _ = Describe("Metal3Machine manager", func() {
 			ExpectedHostName: hostWithLabel.Name,
 		},
 	),
-	Entry("Choose hosts with a nodeReuseLabelName set to KCP, even without a label selector",
+	Entry("Choose hosts with a nodeReuseLabelName set to CP, even without a label selector",
 		testCaseChooseHost{
 			Machine: &clusterv1.Machine{
 				ObjectMeta: metav1.ObjectMeta{
@@ -779,9 +779,9 @@ var _ = Describe("Metal3Machine manager", func() {
 					InfrastructureRef: *infrastructureRef,
 				},
 			},
-			Hosts:            &bmov1alpha1.BareMetalHostList{Items: []bmov1alpha1.BareMetalHost{hostWithNodeReuseLabelSetToKCP}},
+			Hosts:            &bmov1alpha1.BareMetalHostList{Items: []bmov1alpha1.BareMetalHost{hostWithNodeReuseLabelSetToCP}},
 			M3Machine:        m3mconfig,
-			ExpectedHostName: hostWithNodeReuseLabelSetToKCP.Name,
+			ExpectedHostName: hostWithNodeReuseLabelSetToCP.Name,
 		},
 	),
 	Entry("Choose the host with the right label", testCaseChooseHost{
@@ -3981,15 +3981,15 @@ var _ = Describe("Metal3Machine manager", func() {
 		},
 		Host: &bmov1alpha1.BareMetalHost{
 			ObjectMeta: metav1.ObjectMeta{
-				Name: "kcp-test1",
+				Name: "cp-test1",
 				Labels: map[string]string{
-					nodeReuseLabelName: "kcp-test1",
+					nodeReuseLabelName: "cp-test1",
 					"foo":              "bar",
 				},
 			},
 		},
 		expectNodeReuseLabel:     true,
-		expectNodeReuseLabelName: "kcp-test1",
+		expectNodeReuseLabelName: "cp-test1",
 		expectMatch:              true,
 	}),
 	Entry("Should return false if nodeReuseLabelName is empty for host", testCaseNodeReuseLabelMatches{
@@ -4010,7 +4010,7 @@ var _ = Describe("Metal3Machine manager", func() {
 		},
 		Host: &bmov1alpha1.BareMetalHost{
 			ObjectMeta: metav1.ObjectMeta{
-				Name: "kcp-test1",
+				Name: "cp-test1",
 				Labels: map[string]string{
 					nodeReuseLabelName: "",
 				},
@@ -4037,7 +4037,7 @@ var _ = Describe("Metal3Machine manager", func() {
 		},
 		Host: &bmov1alpha1.BareMetalHost{
 			ObjectMeta: metav1.ObjectMeta{
-				Name: "kcp-test1",
+				Name: "cp-test1",
 				Labels: map[string]string{
 					nodeReuseLabelName: "abc",
 				},
@@ -4117,7 +4117,7 @@ var _ = Describe("Metal3Machine manager", func() {
 		Host: &bmov1alpha1.BareMetalHost{
 			ObjectMeta: metav1.ObjectMeta{
 				Labels: map[string]string{
-					nodeReuseLabelName: kcpName,
+					nodeReuseLabelName: cpName,
 					"foo":              "bar",
 				},
 			},
@@ -4146,15 +4146,15 @@ var _ = Describe("Metal3Machine manager", func() {
 	}),
 )

-type testCaseGetKubeadmControlPlaneName struct {
-	Machine         *clusterv1.Machine
-	expectedKcp     bool
-	expectedKcpName string
-	expectError     bool
+type testCaseGetControlPlaneName struct {
+	Machine        *clusterv1.Machine
+	expectedCp     bool
+	expectedCpName string
+	expectError    bool
 }

-DescribeTable("Test getKubeadmControlPlaneName",
-	func(tc testCaseGetKubeadmControlPlaneName) {
+DescribeTable("Test getControlPlaneName",
+	func(tc testCaseGetControlPlaneName) {
 		objects := []client.Object{}
 		if tc.Machine != nil {
 			objects = append(objects, tc.Machine)
@@ -4165,19 +4165,19 @@ var _ = Describe("Metal3Machine manager", func() {
 		)
 		Expect(err).NotTo(HaveOccurred())

-		result, err := machineMgr.getKubeadmControlPlaneName(context.TODO())
+		result, err := machineMgr.getControlPlaneName(context.TODO())
 		if tc.expectError {
 			Expect(err).To(HaveOccurred())
 		} else {
 			Expect(err).NotTo(HaveOccurred())
 		}

-		if tc.expectedKcp {
-			Expect(result).To(Equal(tc.expectedKcpName))
+		if tc.expectedCp {
+			Expect(result).To(Equal(tc.expectedCpName))
 		}
 	},
-	Entry("Should find the expected kcp", testCaseGetKubeadmControlPlaneName{
+	Entry("Should find the expected cp", testCaseGetControlPlaneName{
 		Machine: &clusterv1.Machine{
 			ObjectMeta: metav1.ObjectMeta{
 				OwnerReferences: []metav1.OwnerReference{
@@ -4189,25 +4189,27 @@ var _ = Describe("Metal3Machine manager", func() {
 					},
 				},
 			},
 		},
-		expectError:     false,
-		expectedKcp:     true,
-		expectedKcpName: "kcp-test1",
+		expectError:    false,
+		expectedCp:     true,
+		expectedCpName: "cp-test1",
 	}),
-	Entry("Should not find the expected kcp, kind is not correct", testCaseGetKubeadmControlPlaneName{
+	Entry("Should find the expected ControlPlane for any Kind/Provider", testCaseGetControlPlaneName{
 		Machine: &clusterv1.Machine{
 			ObjectMeta: metav1.ObjectMeta{
 				OwnerReferences: []metav1.OwnerReference{
 					{
 						APIVersion: controlplanev1.GroupVersion.String(),
-						Kind:       "kcp",
+						Kind:       "AnyControlPlane",
 						Name:       "test1",
 					},
 				},
 			},
 		},
-		expectError: true,
+		expectError:    false,
+		expectedCp:     true,
+		expectedCpName: "cp-test1",
 	}),
-	Entry("Should not find the expected kcp, API version is not correct", testCaseGetKubeadmControlPlaneName{
+	Entry("Should not find the expected cp, API version is not correct", testCaseGetControlPlaneName{
 		Machine: &clusterv1.Machine{
 			ObjectMeta: metav1.ObjectMeta{
 				OwnerReferences: []metav1.OwnerReference{
@@ -4221,11 +4223,11 @@ var _ = Describe("Metal3Machine manager", func() {
 			},
 		},
 		expectError: true,
 	}),
-	Entry("Should not find the expected kcp and error when machine is nil", testCaseGetKubeadmControlPlaneName{
+	Entry("Should not find the expected cp and error when machine is nil", testCaseGetControlPlaneName{
 		Machine:     nil,
 		expectError: true,
 	}),
-	Entry("Should give an error if Machine.ObjectMeta.OwnerReferences is nil", testCaseGetKubeadmControlPlaneName{
+	Entry("Should give an error if Machine.ObjectMeta.OwnerReferences is nil", testCaseGetControlPlaneName{
 		Machine: &clusterv1.Machine{
 			ObjectMeta: metav1.ObjectMeta{
 				OwnerReferences: nil,
2 changes: 1 addition & 1 deletion docs/api.md

@@ -535,7 +535,7 @@ When `spec.nodeReuse` field of metal3MachineTemplate is set to `True`, CAPM3
 Machine controller:

 - Sets `infrastructure.cluster.x-k8s.io/node-reuse` label to the corresponding
-  CAPI object name (`KubeadmControlPlane` or `MachineDeployment`) on the
+  CAPI object name (a `controlplane.cluster.x-k8s.io` object such as `KubeadmControlPlane` or a `MachineDeployment`) on the
   BareMetalHost during deprovisioning;
 - Selects the BareMetalHost that contains
   `infrastructure.cluster.x-k8s.io/node-reuse` label and matches exact same CAPI

Review thread on lines 537 to +538:

@tmmorin (Contributor) Nov 10, 2023 — suggested change:

-  CAPI object name (a `controlplane.cluster.x-k8s.io` object such as `KubeadmControlPlane` or a `MachineDeployment`) on the
+  type of owner object (`ControlPlane` for a `controlplane.cluster.x-k8s.io` object such as `KubeadmControlPlane` or a `MachineDeployment`) on the

@furkatgofurov7 (Member) Nov 10, 2023:

@tmmorin with ^, we would be changing the explanation of the fundamental logic behind the feature: we are not setting the `infrastructure.cluster.x-k8s.io/node-reuse` label to the corresponding type of owner object on the BMH, but to the name of that object. For example, if the ControlPlane is called cp-test1, the machine controller will set the following label on the BMH: `infrastructure.cluster.x-k8s.io/node-reuse: cp-test1`.

@tmmorin (Contributor):

Ah yes, indeed. The text could capture the cp- or md- prefix, though. Suggested change:

-  CAPI object name (a `controlplane.cluster.x-k8s.io` object such as `KubeadmControlPlane` or a `MachineDeployment`) on the
+  CAPI object name (a `controlplane.cluster.x-k8s.io` object such as `KubeadmControlPlane` or a `MachineDeployment`), prefixed by `cp-` or `md-` (to distinguish control planes and machine deployments), on the

(Member):

That is a good point. It seems we have no protection from name collisions currently. I would suggest a separate PR for this though, since it would be more of a change in behavior. We would also need to consider upgrade scenarios if we change it (probably just document that CAPM3 should not be upgraded at the same time as the MD or CP is rolling out).

(Member, Author):

I think the behaviour in the event of a ControlPlane/MachineDeployment name collision is well defined: the control plane is checked first, so it will take precedence. We could perhaps just document that to avoid any change in behaviour?
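The collision discussion above can be made concrete with a small sketch: the cp-/md- prefixes mean a ControlPlane and a MachineDeployment that share a name still produce distinct label values, and for a given machine role only one of the two forms is compared. Helper and parameter names here are illustrative, not actual CAPM3 functions:

```go
package main

import "fmt"

// nodeReuseMatch mimics the documented precedence: for a control-plane machine
// the host label is compared against the cp- prefixed owner name; for a worker
// machine, against the md- prefixed name. An empty label never matches.
func nodeReuseMatch(label string, isControlPlane bool, ownerName string) bool {
	if label == "" {
		return false
	}
	if isControlPlane {
		return label == "cp-"+ownerName
	}
	return label == "md-"+ownerName
}

func main() {
	// A ControlPlane and a MachineDeployment may both be named "pool1";
	// the prefixes keep their node-reuse labels distinct.
	fmt.Println(nodeReuseMatch("cp-pool1", true, "pool1"))  // true
	fmt.Println(nodeReuseMatch("cp-pool1", false, "pool1")) // false: a worker expects md-pool1
}
```

This is only a sketch of the invariant the reviewers describe; the real comparison lives in `nodeReuseLabelMatches` in the diff above.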
1 change: 1 addition & 0 deletions test/e2e/pivoting_based_feature_test.go

@@ -56,6 +56,7 @@ var (
  *
  * Node Reuse:
  * This test verifies the feature of reusing the same node after upgrading Kubernetes version in KubeadmControlPlane (KCP) and MachineDeployment (MD) nodes.
+ * Note that while other control plane providers are expected to work, only KubeadmControlPlane is currently tested.
  * - The test starts with a cluster containing 3 KCP (Kubernetes control plane) nodes and 1 MD (MachineDeployment) node.
  * - The control plane nodes are untainted to allow scheduling new pods on them.
  * - The MachineDeployment is scaled down to 0 replicas, ensuring that all worker nodes will be deprovisioned. This provides 1 BMH (BareMetalHost) available for reuse during the upgrade.