Get total variation (Elbow method) #31

Open
bdelespierre opened this issue Sep 11, 2021 · 4 comments

@bdelespierre
Owner

bdelespierre commented Sep 11, 2021

To help find the best value for K (the number of clusters), it would be nice to expose the variance of the distances between clustered points and their cluster's centroid.

Inspired by https://www.youtube.com/watch?v=4b5d3muPQmA
Also see https://en.wikipedia.org/wiki/Elbow_method_(clustering)

I also believe the current v3 implementation of RandomInitialization is wrong 🤷‍♂️

Proposed change

$result = (new Kmeans\Algorithm($init))->clusterize($points, $K);
echo $result->getTotalVariance();
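
As a rough sketch of what such a method could compute, here is one common reading of "total variation" for the elbow method: the sum of squared distances from each point to its cluster's centroid. The function name `totalVariation()` and the plain nested-array input are hypothetical, chosen for illustration only; they are not the library's API, and the real `getTotalVariance()` may well be defined differently.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library.
 * Sums the squared Euclidean distance of every point to the centroid
 * (coordinate-wise mean) of the cluster it belongs to.
 *
 * @param array $clusters list of clusters, each a list of points,
 *                        each point a list of numeric coordinates
 */
function totalVariation(array $clusters): float
{
    $total = 0.0;

    foreach ($clusters as $points) {
        if ($points === []) {
            continue; // an empty cluster contributes nothing
        }

        $size = count($points);
        $dimensions = count($points[0]);

        // Centroid assumed to be the coordinate-wise mean of the cluster.
        $centroid = array_fill(0, $dimensions, 0.0);
        foreach ($points as $point) {
            foreach ($point as $i => $coordinate) {
                $centroid[$i] += $coordinate / $size;
            }
        }

        // Accumulate squared distances to that centroid.
        foreach ($points as $point) {
            foreach ($point as $i => $coordinate) {
                $total += ($coordinate - $centroid[$i]) ** 2;
            }
        }
    }

    return $total;
}
```

Plotting this value against K should give the curve whose bend the elbow method looks for.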
@bdelespierre bdelespierre changed the title Get total variation Get total variation to figure out elbow plot Sep 11, 2021
@bdelespierre bdelespierre changed the title Get total variation to figure out elbow plot Get total variation (Elbow method) Sep 11, 2021

@battlecook
Contributor

The elbow method is quite expensive to implement.
getTotalVariance() is the right building block for it.
But the elbow method needs more than that.
As you can see, k-means gives different results depending on the initial centroid positions.
This means the elbow position can differ from run to run.
We would also need a policy for averaging the elbow across runs.
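
One possible policy, sketched here under assumptions, is to average the total variation over several runs for each K and only then look for the elbow, rather than averaging the elbow positions themselves. The function `averageVariationPerK()` and the `$runOnce` callable are hypothetical: `$runOnce` is assumed to perform one clustering with a fresh random initialization and return that run's total variation.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper, not part of the library.
 *
 * @param callable $runOnce fn(int $k): float, one clustering run returning its total variation
 * @param int[]    $ks      candidate numbers of clusters
 * @param int      $runs    number of runs to average per K
 * @return array<int, float> mean total variation indexed by K
 */
function averageVariationPerK(callable $runOnce, array $ks, int $runs): array
{
    if ($runs < 1) {
        throw new InvalidArgumentException('At least one run is required.');
    }

    $averages = [];
    foreach ($ks as $k) {
        $sum = 0.0;
        for ($i = 0; $i < $runs; $i++) {
            $sum += $runOnce($k); // each call is an independent run
        }
        $averages[$k] = $sum / $runs;
    }

    return $averages;
}
```

The cost grows with the number of candidate K values times the number of runs, which is the expense mentioned above.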

@bdelespierre
Owner Author

Within this ticket's scope, calculating the elbow point is someone else's problem. We're just providing the variance here 😉

@battlecook
Contributor

Oh, that's right. I understand now. Great.

@battlecook battlecook mentioned this issue Sep 22, 2021
Draft
@bdelespierre bdelespierre added this to the v3.0 milestone Mar 26, 2022