[WIP] Feluda Clustering Spec
Denny George edited this page Aug 13, 2024 · 1 revision
sequenceDiagram
Client->>EmbeddingOperator: file_1
EmbeddingOperator->>Client: embedding_1
Client->>EmbeddingOperator: file_2
EmbeddingOperator->>Client: embedding_2
Client->>EmbeddingOperator: file_3
EmbeddingOperator->>Client: embedding_3
Client->>ClusteringOperator: embeddings
ClusteringOperator->>Client: clusters
The Client here could be a Feluda worker or a custom application we build.
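The flow in the diagram can be sketched as a small client loop. This is only an illustration under stated assumptions: `embedding_operator`, `clustering_operator`, and `run_client` are hypothetical names, not Feluda's actual operator API, and both the hash-based "embedding" and the greedy distance threshold are toy stand-ins for a real model and a real clustering algorithm.

```python
import hashlib
from typing import Dict, List


def embedding_operator(data: bytes, dim: int = 8) -> List[float]:
    """Hypothetical stand-in for an embedding operator.

    Derives a deterministic toy vector from the file bytes. A real
    operator would run a model here instead.
    """
    digest = hashlib.sha256(data).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def clustering_operator(
    embeddings: Dict[str, List[float]], threshold: float = 0.5
) -> List[List[str]]:
    """Hypothetical stand-in for a clustering operator.

    Greedy toy clustering: an item joins the first cluster whose first
    member is within `threshold` (Euclidean distance), otherwise it
    starts a new cluster.
    """
    clusters: List[List[str]] = []
    for name, vector in embeddings.items():
        for cluster in clusters:
            reference = embeddings[cluster[0]]
            distance = sum((a - b) ** 2 for a, b in zip(vector, reference)) ** 0.5
            if distance <= threshold:
                cluster.append(name)
                break
        else:
            clusters.append([name])
    return clusters


def run_client(files: Dict[str, bytes], threshold: float = 0.5) -> List[List[str]]:
    """Client role from the diagram: embed each file, then cluster once."""
    embeddings: Dict[str, List[float]] = {}
    for name, data in files.items():
        # Client ->> EmbeddingOperator: file_n, then embedding_n comes back
        embeddings[name] = embedding_operator(data)
    # Client ->> ClusteringOperator: embeddings, then clusters come back
    return clustering_operator(embeddings, threshold)


if __name__ == "__main__":
    sample = {"file_1": b"hello", "file_2": b"hello", "file_3": b"world"}
    print(run_client(sample))
```

Files with identical bytes get identical toy embeddings, so they always land in the same cluster; that is the only property this sketch guarantees.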
- Run locally for experimentation and debugging
- Run using S3 when in the cloud
- Let's separate embedding generation and storage from clustering
- Embeddings are reusable and can be generated sequentially
- This might reduce the memory consumption and operational requirements of a clustering operator
- All our current clustering is embedding-based, so let's namespace it as cluster_embedding_*
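One way to picture the separation proposed above is a store that holds embeddings independently of the clustering step. The sketch below is an assumption-laden illustration, not Feluda code: `LocalEmbeddingStore` is a hypothetical local-disk backend (an S3-backed store could expose the same `save` and `load_all` methods), and `cluster_embedding_kmeans` is a hypothetical function name that follows the proposed `cluster_embedding_*` namespace, implemented here as a minimal pure-Python k-means with deterministic initialization.

```python
import json
from pathlib import Path
from typing import Dict, List

Vector = List[float]


class LocalEmbeddingStore:
    """Hypothetical local store: one JSON file per embedding."""

    def __init__(self, root: Path) -> None:
        self.root = root
        self.root.mkdir(parents=True, exist_ok=True)

    def save(self, key: str, embedding: Vector) -> None:
        # Embeddings can be written one at a time, as they are generated.
        (self.root / f"{key}.json").write_text(json.dumps(embedding))

    def load_all(self) -> Dict[str, Vector]:
        # The clustering step only needs the stored vectors, not the files.
        paths = sorted(self.root.glob("*.json"))
        return {path.stem: json.loads(path.read_text()) for path in paths}


def _squared_distance(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def cluster_embedding_kmeans(
    embeddings: Dict[str, Vector], k: int, iterations: int = 10
) -> Dict[str, int]:
    """Minimal k-means: returns a cluster label for every key.

    Centroids start at the first k embeddings in sorted key order, so
    the result is deterministic. Assumes k <= len(embeddings).
    """
    keys = sorted(embeddings)
    centroids = [list(embeddings[key]) for key in keys[:k]]
    labels: Dict[str, int] = {}
    for _ in range(iterations):
        # Assignment step: nearest centroid for each embedding.
        for key in keys:
            vector = embeddings[key]
            labels[key] = min(
                range(len(centroids)),
                key=lambda i: _squared_distance(vector, centroids[i]),
            )
        # Update step: move each centroid to the mean of its members.
        for i in range(len(centroids)):
            members = [embeddings[key] for key in keys if labels[key] == i]
            if members:
                centroids[i] = [sum(col) / len(members) for col in zip(*members)]
    return labels
```

Because the clustering function only reads from the store, embeddings generated earlier can be reused across clustering runs without reloading the original media files.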