faiss_hnsw support INT8 #991

cydrain
Collaborator

@cydrain cydrain commented Dec 17, 2024

Issue: #977

Signed-off-by: Cai Yudong <[email protected]>
@sre-ci-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cydrain

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


mergify bot commented Dec 17, 2024

@cydrain 🔍 Important: PR Classification Needed!

For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:

  1. If you're fixing a bug, label it as kind/bug.
  2. For small tweaks (less than 20 lines without altering any functionality), please use kind/improvement.
  3. Significant changes that don't modify existing functionalities should be tagged as kind/enhancement.
  4. Adjusting APIs or changing functionality? Go with kind/feature.

For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”.

Thanks for your efforts and contribution to the community!

@cydrain
Collaborator Author

cydrain commented Dec 17, 2024

/kind improvement


codecov bot commented Dec 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.00%. Comparing base (3c46f4c) to head (9023778).
Report is 272 commits behind head on main.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff            @@
##           main     #991       +/-   ##
=========================================
+ Coverage      0   74.00%   +74.00%     
=========================================
  Files         0       82       +82     
  Lines         0     6948     +6948     
=========================================
+ Hits          0     5142     +5142     
- Misses        0     1806     +1806     

see 82 files with indirect coverage changes

Collaborator

@alexanderguzhva alexanderguzhva left a comment

lgtm overall, but please confirm about QT_8bit_direct_signed. Thanks.

} else if (dst_data_format == DataFormatEnum::int8) {
    knowhere::int8* const dst = reinterpret_cast<knowhere::int8*>(dst_in);
    for (size_t i = 0; i < nrows * dim; i++) {
        KNOWHERE_THROW_IF_NOT_MSG(src[i] >= std::numeric_limits<knowhere::int8>::min() &&

Collaborator

it is better to use std::numeric_limits<knowhere::int8>::lowest() here

Collaborator Author

Hi Alex, what's the difference between min() and lowest() here?

Collaborator

There's no difference for this particular use case. But lowest() is better to use, because of the contrast between std::numeric_limits<float>::lowest() (about -3.4e+38, the most negative finite value) and std::numeric_limits<float>::min() (about 1.18e-38, the smallest positive normalized value).
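
The distinction can be checked at compile time. This is a minimal sketch that uses std::int8_t as a stand-in for knowhere::int8 (an assumption; the alias itself is not shown in this thread):

```cpp
#include <cstdint>
#include <limits>

// For integer types, min() and lowest() return the same value.
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "int8 min");
static_assert(std::numeric_limits<std::int8_t>::lowest() == -128, "int8 lowest");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "int8 max");

// For floating-point types they differ: min() is the smallest positive
// normalized value, while lowest() is the most negative finite value.
static_assert(std::numeric_limits<float>::min() > 0.0f, "float min is positive");
static_assert(std::numeric_limits<float>::lowest() ==
                  -std::numeric_limits<float>::max(),
              "float lowest is -max");
```

So a lower-bound check written with min() is correct for int8 but would silently reject every negative value if the same pattern were copied to a float type.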

@@ -1327,6 +1350,9 @@ class BaseFaissRegularIndexHNSWFlatNode : public BaseFaissRegularIndexHNSWNode {
} else if (data_format == DataFormatEnum::bf16) {
    hnsw_index = std::make_unique<faiss::IndexHNSWSQCosine>(dim, faiss::ScalarQuantizer::QT_bf16,
                                                            hnsw_cfg.M.value());
} else if (data_format == DataFormatEnum::int8) {
    hnsw_index = std::make_unique<faiss::IndexHNSWSQCosine>(
        dim, faiss::ScalarQuantizer::QT_8bit_direct_signed, hnsw_cfg.M.value());

Collaborator

please DO confirm that you want to use QT_8bit_direct_signed here, because the use case is not clear to me. Basically, I can imagine a use case that works with the input data of [0..255] range (QT_8bit_direct), or the traditional QT_8bit that remaps input float values into [0..255] range, but what is the use case for the input data of [-128..127] range? Or is it just the requirement from Milvus?

Collaborator Author

It's a requirement from Milvus, since Vespa and Qdrant already support Vector_Int8.
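
The two input ranges contrasted in the question, [0..255] and [-128..127], differ only by an offset of 128. A minimal sketch of that relationship, assuming a signed value is stored as an unsigned one-byte code by adding 128; the helper names are hypothetical, and this is not claimed to be the Faiss implementation:

```cpp
#include <cstdint>

// Assumption for illustration: shift a signed value in [-128, 127]
// into an unsigned byte in [0, 255].
inline std::uint8_t encode_signed(std::int8_t v) {
    return static_cast<std::uint8_t>(static_cast<int>(v) + 128);
}

// Inverse shift: recover the signed value from the stored byte.
inline std::int8_t decode_signed(std::uint8_t code) {
    return static_cast<std::int8_t>(static_cast<int>(code) - 128);
}
```

Under this assumption every int8 input round-trips exactly, so native int8 vectors need no training step and lose no precision.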

Collaborator Author

Hi Alex, I see no obvious difference between min() and lowest(), I prefer to use min() and max() in pair.
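
For reference, the paired min()/max() range check under discussion can be sketched as a standalone function. This is a hypothetical helper, not the Knowhere code: it assumes float input, uses std::int8_t in place of knowhere::int8, and throws a standard exception instead of using KNOWHERE_THROW_IF_NOT_MSG:

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <vector>

// Hypothetical helper: converts floats to int8, throwing if any value
// falls outside [min(), max()] of int8, i.e. [-128, 127].
inline std::vector<std::int8_t> to_int8_checked(const std::vector<float>& src) {
    constexpr float lo = std::numeric_limits<std::int8_t>::min();
    constexpr float hi = std::numeric_limits<std::int8_t>::max();
    std::vector<std::int8_t> dst(src.size());
    for (std::size_t i = 0; i < src.size(); i++) {
        // Written as a negated conjunction so that NaN is rejected too.
        if (!(src[i] >= lo && src[i] <= hi)) {
            throw std::out_of_range("value does not fit in int8");
        }
        dst[i] = static_cast<std::int8_t>(src[i]);
    }
    return dst;
}
```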

@alexanderguzhva
Collaborator

@cydrain lgtm
please let me know if you'd like to change min() to lowest() and I'll lgtm this diff in either case

@alexanderguzhva
Collaborator

/lgtm

@sre-ci-robot sre-ci-robot merged commit ca4ba32 into zilliztech:main Dec 19, 2024
14 checks passed
@cydrain cydrain deleted the caiyd_977_faiss_native_support_multi_datatype branch December 19, 2024 01:28
cqy123456 pushed a commit to cqy123456/zilliztech-knowhere that referenced this pull request Dec 31, 2024