Multidimensional arrays and diversity clustering #35
Comments
Hey @LarryBarker, thanks for submitting an issue.
You may use a callback function to tap into the algorithm execution 👍
I haven't implemented weighted k-means yet. Development effort is now focused on a new v3, designed to be easier to extend or override with your own custom algorithms. Have a look at V3.
You can achieve this by assigning arbitrary data to points 👍 Thank you for using PHP Kmeans, and don't hesitate to let us know if you have any feature requests or if you encounter any bugs. Have a nice day!
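The idea of attaching data to points, so that each user can be found again after clustering, can be sketched in a library-agnostic way. The snippet below is plain Python and does not use the PHP Kmeans API; the user ids, feature values, and centroids are all hypothetical and stand in for whatever a real clustering run would produce.

```python
from math import dist  # Euclidean distance, available in Python 3.8+

# Hypothetical data: user id -> numeric feature vector (illustrative values only).
users = {
    "user_17": [1.0, 0.0, 34.0],
    "user_42": [2.0, 1.0, 51.0],
    "user_99": [1.0, 1.0, 29.0],
}

# Hypothetical centroids, as if a k-means run had already produced them.
centroids = [[1.0, 0.5, 30.0], [2.0, 1.0, 50.0]]


def nearest(point, centroids):
    """Return the index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda i: dist(point, centroids[i]))


# Group user ids by cluster index; the id travels alongside its point.
clusters = {}
for user_id, features in users.items():
    clusters.setdefault(nearest(features, centroids), []).append(user_id)
```

Because the id is stored next to the coordinates rather than inferred from array position, the grouping stays valid even if the points are reordered during clustering.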
@bdelespierre Thanks for the quick reply! I realized after I posted that I could attach data to points, so thank you for confirming that. It's good to know that weighted k-means is something you have thought about. I assume it is doable? Do you have any resources that might help me implement my own?
There is unfortunately very little literature on the topic, so I just assumed this is not what users wanted. That being said, finding the centroid of a group of weighted points is a piece of cake. But I'm not quite sure how to interpret the results...
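For reference, the centroid of weighted points is the weighted mean of their coordinates: each coordinate is the sum of weight times value, divided by the sum of the weights. A minimal sketch in plain Python follows; the function name `weighted_centroid` is hypothetical and is not part of PHP Kmeans.

```python
def weighted_centroid(points, weights):
    """Weighted mean of `points`, one weight per point.

    Hypothetical helper for illustration; assumes all points share
    the same number of dimensions and the weights sum to a positive value.
    """
    if len(points) != len(weights) or not points:
        raise ValueError("need one weight per point and at least one point")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    dims = len(points[0])
    return [
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dims)
    ]


# A point with weight 3 pulls the centroid three times as hard as one with weight 1.
center = weighted_centroid([[0.0, 0.0], [4.0, 2.0]], [1.0, 3.0])
```

With equal weights this reduces to the ordinary centroid used by standard k-means.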
Hello, thank you for sharing this package. I'm hoping to use it to help group users into diverse groups based on socioeconomic factors like race, gender, age, etc. Our dataset contains 20 factors that need to be taken into consideration. Have you used this to solve such a problem?
I've started some preliminary testing, and seem to be getting results but I can't tell what is happening behind the scenes. Furthermore, I would like to be able to weight each factor. For example, race may be the most important factor in some cases, while gender may be in others.
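Weighting factors (dimensions) is a different thing from weighting points. One common way to approximate it, without any library support, is to scale each column by the square root of its weight before clustering, so that plain squared Euclidean distance behaves like a weighted one. The sketch below is plain Python under that assumption; `scale_features` and the weight values are hypothetical, and it does not use the PHP Kmeans API.

```python
from math import dist, sqrt


def scale_features(point, weights):
    """Scale each coordinate by sqrt(weight) for its dimension.

    After scaling, the squared Euclidean distance between two points equals
    the sum over dimensions of weight * (difference squared).
    """
    if len(point) != len(weights):
        raise ValueError("need one weight per dimension")
    return [sqrt(w) * x for x, w in zip(point, weights)]


# Hypothetical weights: the first factor counts four times as much as the second.
weights = [4.0, 1.0]
a = scale_features([0.0, 0.0], weights)
b = scale_features([1.0, 1.0], weights)
squared_distance = dist(a, b) ** 2  # 4 * 1^2 + 1 * 1^2
```

The scaled points can then be fed to any standard k-means implementation; changing the weights per use case only requires rescaling the input.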
Here is what the data looks like:
The numerical representation for each possible value is what we store:
I'm also curious: after the clustering is performed, is there any way to retrieve the original key for the data? I need this to know which users are in each cluster.
If this is not the appropriate channel for this type of question, or beyond the scope of the repo, please let me know. I certainly appreciate any feedback you may have. Thank you :)