
Karmada propagates resources very slowly when karmada-controller-manager is just restarted #5978

Open
wenhuwang opened this issue Dec 26, 2024 · 6 comments
Labels
kind/question Indicates an issue that is a support question.

Comments

wenhuwang commented Dec 26, 2024

Please provide an in-depth description of the question you have:
Karmada propagates new NetworkPolicy resources to the member clusters very slowly when karmada-controller-manager has just restarted; the slow period lasts about 20 minutes.

Workqueue depth metrics: [screenshot]

The resources managed by Karmada are as follows:

  • cluster: 7
  • namespace: 1100
  • workspace: 105
  • ciliumnetworkpolicy: 2500
  • networkpolicy: 3800

What do you think about this question?:
From the logs, the cause can be roughly located: during controller startup, while the informer cache has not yet synchronized, all pre-existing resources are delivered as Create events, so the controller receives a large burst of Create events. I am working on this problem.
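
For reference, a minimal client-go sketch of why this happens (illustrative, not Karmada's code; a NetworkPolicy informer is used as the example): the informer's initial List is replayed through the same event handler as live watch events, so every pre-existing object arrives as an Add/Create.

package main

import (
    "k8s.io/client-go/informers"
    "k8s.io/client-go/kubernetes"
    "k8s.io/client-go/tools/cache"
    "k8s.io/client-go/tools/clientcmd"
    "k8s.io/klog/v2"
)

func main() {
    cfg, err := clientcmd.BuildConfigFromFlags("", clientcmd.RecommendedHomeFile)
    if err != nil {
        klog.Fatal(err)
    }
    client := kubernetes.NewForConfigOrDie(cfg)

    factory := informers.NewSharedInformerFactory(client, 0)
    informer := factory.Networking().V1().NetworkPolicies().Informer()

    informer.AddEventHandler(cache.ResourceEventHandlerFuncs{
        AddFunc: func(obj interface{}) {
            // Fires once per NetworkPolicy returned by the initial List,
            // indistinguishable from a genuinely new object.
            klog.Info("Add event (replayed or new)")
        },
    })

    stopCh := make(chan struct{})
    defer close(stopCh)
    factory.Start(stopCh) // non-blocking: the cache fills in the background
    cache.WaitForCacheSync(stopCh, informer.HasSynced)
    // By this point, one Add per pre-existing object has already been delivered.
}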

Environment:

  • Karmada version: v1.7.2
  • Kubernetes version: v1.22.14
@wenhuwang wenhuwang added the kind/question Indicates an issue that is a support question. label Dec 26, 2024
wenhuwang (Author) commented

/assign

liangyuanpeng (Contributor) commented

Can this be replicated in a higher version of Karmada? Version 1.7 may be too old.

wenhuwang (Author) commented Dec 30, 2024

This problem arises because the controller-runtime library is designed this way by default: informer startup and cache synchronization are asynchronous, and every object already present at startup is replayed to the controller as a Create event.
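
A minimal controller-runtime sketch of that startup sequence (illustrative wiring, not Karmada's actual controller): the manager starts the shared cache and the controller concurrently, and the controller only waits for the cache to sync before starting its workers, so by then one Create per pre-existing object is already sitting in the workqueue.

package main

import (
    "context"

    networkingv1 "k8s.io/api/networking/v1"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/client"
)

type reconciler struct {
    client.Client
}

func (r *reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // On a fresh start this runs once per pre-existing object,
    // which is where the workqueue-depth spike comes from.
    return ctrl.Result{}, nil
}

func main() {
    mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
    if err != nil {
        panic(err)
    }
    if err := ctrl.NewControllerManagedBy(mgr).
        For(&networkingv1.NetworkPolicy{}).
        Complete(&reconciler{Client: mgr.GetClient()}); err != nil {
        panic(err)
    }
    // Start blocks: informer startup, cache sync, and processing of the
    // replayed Create events all happen inside this call, asynchronously
    // per component.
    if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
        panic(err)
    }
}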

I will also try to reproduce this issue with the latest version.

wenhuwang (Author) commented Jan 8, 2025

I reproduced it locally using version v1.12.2 and added a predicateFunc to the resourcebinding controller to determine the event type (see the sketch after the logs below).
The number of resources managed by Karmada is 175:

# kubectl get networkpolicies.networking.k8s.io | wc -l
176

When karmada-controller-manager restarted, the resourcebinding controller reconciled 175 times, and every event it received was of type Create:

# kubectl -n karmada-system-demo logs  karmada-controller-manager-bfcb477d6-dgm79  | grep " Sync work of resourceBinding(" | wc -l
175
# kubectl -n karmada-system-demo logs  karmada-controller-manager-bfcb477d6-dgm79  | grep "Create event for ResourceBinding " | wc -l
175
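
The predicateFunc looks roughly like this (a sketch with illustrative names, not the exact patch; the log messages match the ones grepped above). It lets every event through unchanged but logs its type, so the startup replay shows up in the logs as an uninterrupted run of Creates.

package main

import (
    "k8s.io/klog/v2"
    "sigs.k8s.io/controller-runtime/pkg/event"
    "sigs.k8s.io/controller-runtime/pkg/predicate"
)

// newEventTypePredicate logs the type of each event before letting it pass.
func newEventTypePredicate() predicate.Funcs {
    return predicate.Funcs{
        CreateFunc: func(e event.CreateEvent) bool {
            klog.Infof("Create event for ResourceBinding %s/%s",
                e.Object.GetNamespace(), e.Object.GetName())
            return true
        },
        UpdateFunc: func(e event.UpdateEvent) bool {
            klog.Infof("Update event for ResourceBinding %s/%s",
                e.ObjectNew.GetNamespace(), e.ObjectNew.GetName())
            return true
        },
        DeleteFunc: func(e event.DeleteEvent) bool {
            klog.Infof("Delete event for ResourceBinding %s/%s",
                e.Object.GetNamespace(), e.Object.GetName())
            return true
        },
    }
}

It is attached to the controller via the builder's WithEventFilter(newEventTypePredicate()), so the log line is emitted before the request is enqueued.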

@wenhuwang wenhuwang removed their assignment Jan 8, 2025
wenhuwang (Author) commented

I compared the performance of Karmada v1.7.2 and v1.12.2. The test environment is as follows:

  • cluster : 2
  • workspace: 7
  • namespace: 10
  • networkpolicy: 6000

Karmada v1.7.2
When the Karmada controller has just restarted, a new NetworkPolicy takes about 8 minutes to be synchronized to the member cluster.
Workqueue depth metrics: [screenshot]

Karmada v1.12.2
When the Karmada controller has just restarted, a new NetworkPolicy takes about 4 minutes to be synchronized to the member cluster.
Workqueue depth metrics: [screenshot]

From the test results, the new version roughly halves the propagation delay compared with v1.7.2 (from about 8 minutes down to about 4).

RainbowMango (Member) commented

Thanks for sharing the report with us.

Just sharing some info here: we are building a team dedicated to addressing performance issues, and #6031 tracks the efforts planned for release-1.13. You are welcome to join us!
