faiss_hnsw support INT8 #991
Signed-off-by: Cai Yudong <[email protected]>
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: cydrain
The full list of commands accepted by this bot can be found here. The pull request process is described here.
@cydrain 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:
For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”. Thanks for your efforts and contribution to the community!
/kind improvement
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##           main    #991      +/-   ##
=========================================
+ Coverage      0   74.00%   +74.00%
=========================================
  Files         0       82      +82
  Lines         0     6948    +6948
=========================================
+ Hits          0     5142    +5142
- Misses        0     1806    +1806
lgtm overall, but please confirm about QT_8bit_direct_signed. Thanks.
} else if (dst_data_format == DataFormatEnum::int8) {
    knowhere::int8* const dst = reinterpret_cast<knowhere::int8*>(dst_in);
    for (size_t i = 0; i < nrows * dim; i++) {
        KNOWHERE_THROW_IF_NOT_MSG(src[i] >= std::numeric_limits<knowhere::int8>::min() &&
it is better to use std::numeric_limits<knowhere::int8>::lowest() here
Hi Alex, what's the difference between min() and lowest() here?
there's no difference for this particular use case. But lowest() is better to use, because of the floating-point case: std::numeric_limits<float>::lowest() is about -3.4e+38 (the least value), while std::numeric_limits<float>::min() is about 1.18e-38 (the least positive normalized value)
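The distinction discussed above can be checked at compile time. The following is a minimal sketch, not code from this PR: `fits_in` and `to_int8_checked` are hypothetical helpers, and `std::int8_t` stands in for `knowhere::int8` on the assumption that the latter is an 8-bit signed integer type.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// For integer types, min() and lowest() return the same value.
static_assert(std::numeric_limits<std::int8_t>::min() ==
                  std::numeric_limits<std::int8_t>::lowest(),
              "int8: min() equals lowest()");
static_assert(std::numeric_limits<std::int8_t>::lowest() == -128,
              "int8: lowest() is -128");

// For floating-point types they differ: min() is the smallest positive
// normalized value, while lowest() is the most negative finite value.
static_assert(std::numeric_limits<float>::min() > 0.0f,
              "float: min() is positive");
static_assert(std::numeric_limits<float>::lowest() ==
                  -std::numeric_limits<float>::max(),
              "float: lowest() is -max()");

// Hypothetical helper: true if v lies within the representable range of T.
// Using lowest() keeps the check correct even if T is a floating-point type.
template <typename T>
constexpr bool fits_in(float v) {
    return v >= static_cast<float>(std::numeric_limits<T>::lowest()) &&
           v <= static_cast<float>(std::numeric_limits<T>::max());
}

// Hypothetical helper: range-checked float -> int8 conversion.
// The assert is a debug-only precondition; the PR itself throws instead.
inline std::int8_t to_int8_checked(float v) {
    assert(fits_in<std::int8_t>(v));
    return static_cast<std::int8_t>(v);
}
```

With an integer destination type, either spelling gives the same bound, which is why both reviewers' positions are compatible here; the `lowest()` spelling only matters if the same check is later reused for a floating-point destination.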
@@ -1327,6 +1350,9 @@ class BaseFaissRegularIndexHNSWFlatNode : public BaseFaissRegularIndexHNSWNode {
    } else if (data_format == DataFormatEnum::bf16) {
        hnsw_index = std::make_unique<faiss::IndexHNSWSQCosine>(dim, faiss::ScalarQuantizer::QT_bf16,
                                                                hnsw_cfg.M.value());
    } else if (data_format == DataFormatEnum::int8) {
        hnsw_index = std::make_unique<faiss::IndexHNSWSQCosine>(
            dim, faiss::ScalarQuantizer::QT_8bit_direct_signed, hnsw_cfg.M.value());
please DO confirm that you want to use QT_8bit_direct_signed here, because the use case is not clear to me. Basically, I can imagine a use case that works with the input data of [0..255] range (QT_8bit_direct), or the traditional QT_8bit that remaps input float values into [0..255] range, but what is the use case for the input data of [-128..127] range? Or is it just the requirement from Milvus?
it's the requirement from Milvus, since vespa and qdrant already support Vector_Int8 now
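To make the three quantizer options in this thread concrete, here is a rough sketch of how each one could map an input value to a one-byte code. These helpers are hypothetical illustrations and not the faiss API; the shift by 128 for the signed variant reflects my understanding of faiss's behaviour and should be treated as an assumption, as should the truncating conversion in the remapped variant.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helpers for illustration only; none of these are faiss functions.

// QT_8bit style: remap a float from a trained range [vmin, vmax] onto [0, 255].
inline std::uint8_t encode_remapped(float x, float vmin, float vmax) {
    assert(vmax > vmin);
    float t = (x - vmin) / (vmax - vmin);
    t = std::min(1.0f, std::max(0.0f, t));  // clamp to [0, 1]
    return static_cast<std::uint8_t>(t * 255.0f);  // truncates toward zero
}

// QT_8bit_direct style: the input is already an integer value in [0, 255].
inline std::uint8_t encode_direct(float x) {
    assert(x >= 0.0f && x <= 255.0f);
    return static_cast<std::uint8_t>(x);
}

// QT_8bit_direct_signed style: the input is an integer value in [-128, 127],
// stored as an unsigned byte after shifting by 128 (assumed behaviour).
inline std::uint8_t encode_direct_signed(float x) {
    assert(x >= -128.0f && x <= 127.0f);
    return static_cast<std::uint8_t>(x + 128.0f);
}

// Inverse of encode_direct_signed: recover the signed value from the code.
inline float decode_direct_signed(std::uint8_t code) {
    return static_cast<float>(code) - 128.0f;
}
```

Under these assumptions, int8 vectors supplied by Milvus round-trip exactly through the signed variant with no training step, which is the property the int8 data format needs.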
Hi Alex, I see no obvious difference between min() and lowest(), I prefer to use min() and max() in pair.
@cydrain lgtm
/lgtm
Signed-off-by: Cai Yudong <[email protected]>
Issue: #977