Online real-time semantic search using Transformers and HNSW nearest-neighbour search.
Uses Sentence Transformers (https://github.com/UKPLab/sentence-transformers) as the embedding model and HNSWlib (https://github.com/nmslib/hnswlib) as the approximate nearest-neighbour search index, with cosine similarity as the metric.
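A minimal sketch of how the two libraries fit together, not the code in this repository. The helper names (`embed`, `build_index`, `search`), the model name `all-MiniLM-L6-v2` and the index parameters are assumptions chosen for illustration.

```python
def embed(texts, model_name="all-MiniLM-L6-v2"):
    """Encode texts into dense vectors. The model name is an illustrative choice."""
    # Imported lazily so that defining the helpers does not require the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)
    return model.encode(texts)  # numpy array of shape (len(texts), dim)


def build_index(vectors, max_elements=10_000, ef_construction=200, M=16):
    """Build an HNSW index over the vectors using cosine distance."""
    import hnswlib

    index = hnswlib.Index(space="cosine", dim=vectors.shape[1])
    index.init_index(max_elements=max_elements,
                     ef_construction=ef_construction, M=M)
    # Use the row position of each vector as its document id.
    index.add_items(vectors, list(range(len(vectors))))
    index.set_ef(50)  # query-time accuracy/speed trade-off; must be >= k
    return index


def search(index, query_vectors, k=3):
    """Return (ids, similarities) of the k nearest documents per query."""
    labels, distances = index.knn_query(query_vectors, k=k)
    # hnswlib's cosine space reports distance = 1 - cosine similarity.
    return labels, 1.0 - distances


# Usage (requires both packages and a model download):
#   docs = ["HNSW is a graph-based index.", "Transformers encode text."]
#   index = build_index(embed(docs))
#   ids, sims = search(index, embed(["graph index for vectors"]), k=1)
```

`k` must not exceed the number of items in the index, otherwise `knn_query` raises an error.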
The aim is to make it "online", i.e. to be able to add new documents while queries are being served.
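One simple way to allow adds and queries from different threads is to guard the index with a lock. The sketch below is an assumption about how this could be done, not the project's implementation. `OnlineIndex` and `BruteForceIndex` are hypothetical names, and the brute-force backend is an exact cosine search standing in for HNSWlib so that the example runs with the standard library only. The lock matters for the real backend too: the hnswlib README indicates that `add_items` and `knn_query` are not thread-safe with each other.

```python
import math
import threading


class BruteForceIndex:
    """Stand-in backend: exact cosine search over non-zero vectors."""

    def __init__(self):
        self._vectors = {}

    @staticmethod
    def _normalise(vector):
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector]

    def add(self, doc_id, vector):
        self._vectors[doc_id] = self._normalise(vector)

    def query(self, vector, k):
        q = self._normalise(vector)
        scored = [(sum(a * b for a, b in zip(q, v)), doc_id)
                  for doc_id, v in self._vectors.items()]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(doc_id, score) for score, doc_id in scored[:k]]


class OnlineIndex:
    """Serialises adds and queries with one lock so any thread may call either."""

    def __init__(self, backend):
        self._backend = backend
        self._lock = threading.Lock()

    def add(self, doc_id, vector):
        with self._lock:
            self._backend.add(doc_id, vector)

    def query(self, vector, k=5):
        with self._lock:
            return self._backend.query(vector, k)


index = OnlineIndex(BruteForceIndex())
index.add("seed", [1.0, 0.0])


def ingest():
    # Background writer: keeps adding documents while queries run.
    for i in range(100):
        index.add(f"doc-{i}", [1.0, float(i + 1)])


writer = threading.Thread(target=ingest)
writer.start()
for _ in range(100):
    index.query([1.0, 0.0], k=1)  # interleaves with the writer thread
writer.join()

top = index.query([1.0, 0.0], k=1)
```

A single lock is the simplest option, but it makes queries wait while documents are being added. Building a fresh index in the background and swapping it in would reduce that contention.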
Check out notebooks/demo.ipynb for usage!
TODO:
- Add a data ingestor class that handles any document type
- Make arrangements for fine-tuning
- Make it "online": add index management (cycling backups, switching to a fresh index)
- Improve the demo notebook
- Add an arXiv-specific ranking scheme: embed documents at the sentence level
- Write a better README