Resume algorithm execution #28
Comments
It would be good to provide this function as an option.
Yes. I would propose something like:
$algo = new Kmeans\Algorithm(new Kmeans\RandomInitialization());
$result = $algo->clusterize($points, $nbClusters);
$serialized = serialize($result);
// later...
$previousRun = unserialize($serialized);
$result = $previousRun->resume($newPoints);
Looks good 👍
I've been thinking about a result object for
<?php
namespace Bdelespierre\Kmeans\Interfaces;

interface ClusterizationResultInterface extends \Serializable
{
    public function hasReachedConvergence(): bool;

    /**
     * @return int<0, max>
     */
    public function iterationsCount(): int;

    public function getClusters(): ClusterCollectionInterface;

    public function resume(PointCollectionInterface $newPoints): self;
}
Sorry for the late reply. (I confirmed that it was committed to the PR.) I think it's fine, but we'll need to do some more work before we can be confident in the interface design.
It's not implemented in #27. I plan to implement that later.
I believe it would be nice to be able to resume the algorithm's execution after it has completed. This would be useful when new points are added, so that previous iterations don't need to be re-run.
Example: I have clustered my 100 000 users into 5 clusters. Since the last clustering, 100 new users have been added. Most of them are probably already very close to the existing clusters' centroids. Hence, I should be able to resume clustering the same dataset PLUS the new users to save time.
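To make the idea concrete, here is a minimal standalone sketch of a "warm start", assuming that resuming means restarting Lloyd's iterations from the previous run's centroids with the new points appended. The thread does not specify how `resume()` would work internally, so this is only one possible reading. The function `kmeansFromCentroids` and the plain-array data layout are hypothetical and are not part of the library's API.

```php
<?php
/**
 * Hypothetical helper (not part of the library): runs Lloyd's k-means
 * iterations starting from the given centroids instead of a fresh
 * initialization.
 *
 * @param list<list<float>> $points
 * @param list<list<float>> $centroids
 */
function kmeansFromCentroids(array $points, array $centroids, int $maxIterations = 100): array
{
    $assignments = [];
    $iterations = 0;
    $converged = false;

    while (!$converged && $iterations < $maxIterations) {
        $iterations++;

        // Assignment step: attach each point to its nearest centroid
        // (squared Euclidean distance).
        $newAssignments = [];
        foreach ($points as $i => $point) {
            $bestCluster = 0;
            $bestDistance = INF;
            foreach ($centroids as $k => $centroid) {
                $distance = 0.0;
                foreach ($point as $d => $value) {
                    $distance += ($value - $centroid[$d]) ** 2;
                }
                if ($distance < $bestDistance) {
                    $bestDistance = $distance;
                    $bestCluster = $k;
                }
            }
            $newAssignments[$i] = $bestCluster;
        }

        // Converged once no point changes cluster between two iterations.
        $converged = ($newAssignments === $assignments);
        $assignments = $newAssignments;

        // Update step: move each centroid to the mean of its members.
        foreach ($centroids as $k => $centroid) {
            $members = array_keys($assignments, $k, true);
            if ($members === []) {
                continue; // an empty cluster keeps its previous centroid
            }
            foreach (array_keys($centroid) as $d) {
                $sum = 0.0;
                foreach ($members as $i) {
                    $sum += $points[$i][$d];
                }
                $centroids[$k][$d] = $sum / count($members);
            }
        }
    }

    return [
        'centroids' => $centroids,
        'assignments' => $assignments,
        'iterations' => $iterations,
        'converged' => $converged,
    ];
}

// Initial clustering of the existing points (toy data, arbitrary start).
$points = [[1.0, 1.0], [1.0, 2.0], [2.0, 1.0], [8.0, 8.0], [9.0, 8.0], [8.0, 9.0]];
$firstRun = kmeansFromCentroids($points, [[1.0, 1.0], [1.0, 2.0]]);

// Later: new points arrive, so restart from the previous centroids.
$newPoints = [[1.5, 1.5], [8.5, 8.5]];
$resumed = kmeansFromCentroids(array_merge($points, $newPoints), $firstRun['centroids']);

printf("first run: %d iterations, resumed run: %d iterations\n", $firstRun['iterations'], $resumed['iterations']);
```

Because the new points are close to the existing centroids, the resumed run only needs to confirm the assignments and nudge the centroids, which is where the time saving described above would come from. A real implementation would also need to decide whether the stored result keeps the original points or only the centroids.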