sparse: add max score ratio downscaling for approximate searching #1018
@sparknack 🔍 Important: PR Classification Needed! For efficient project management and a seamless review process, it's essential to classify your PR correctly. Here's how:
For any PR outside the kind/improvement category, ensure you link to the associated issue using the format: “issue: #”. Thanks for your efforts and contribution to the community!
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff            @@
##           main    #1018       +/-   ##
=========================================
+ Coverage      0   73.91%   +73.91%
=========================================
  Files         0       82       +82
  Lines         0     6981     +6981
=========================================
+ Hits          0     5160     +5160
- Misses        0     1821     +1821
* make the max score larger than the actual max score, it makes the
* filtering less aggressive, but guarantees the correctness.
* The larger the ratio, the less aggressive the filtering is.
* wand_bm25_max_score_ratio is assigned two functions:
The term frequency part of the BM25 score is: ... . But if avgdl changes, the max score changes. This can be used to adjust the max score used to compute the max score.
- if set to a value greater than 1, ...
- if set to a value less than 1, ...
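The point about avgdl can be made concrete with a small sketch. It uses the textbook BM25 term-frequency component, which is not necessarily the exact expression elided in the comment above; the function name `bm25_tf`, the defaults `k1=1.2` and `b=0.75`, and the sample numbers are all assumptions chosen for illustration, not values taken from this PR.

```python
def bm25_tf(tf, dl, avgdl, k1=1.2, b=0.75):
    """Textbook BM25 term-frequency component (IDF factor omitted)."""
    return tf * (k1 + 1) / (tf + k1 * (1 - b + b * dl / avgdl))


# Hypothetical scenario: a max score was precomputed while avgdl was 100,
# then more documents arrived and avgdl grew to 200.
stale_max = bm25_tf(tf=3, dl=100, avgdl=100)
fresh = bm25_tf(tf=3, dl=100, avgdl=200)

# The same document now scores higher than the stored "max score",
# so the stored value is no longer a safe upper bound on its own.
print(fresh > stale_max)

# Scaling the stored max score by a ratio above 1 restores a safe bound
# in this example (1.2 is an illustrative value only).
ratio = 1.2
print(stale_max * ratio >= fresh)
```

In this example the document is exactly as long as the old average, so once avgdl grows it counts as relatively short and its score rises past the stored bound. Inflating the bound trades some pruning power for safety.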
Signed-off-by: Shawn Wang <[email protected]>
1a462da to ab9cb61
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: sparknack, zhengbuqian
/kind improvement
Reuse wand_bm25_max_score_ratio for approximate searching.
Test Result
MSMARCO BM25
HotpotQA BM25
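The trade-off behind reusing wand_bm25_max_score_ratio for approximate searching can be sketched with a toy top-k loop. This is a simplified upper-bound pruning loop in the spirit of WAND, not the implementation in this PR; the function `search`, its parameter `max_score_ratio`, and the sample postings are hypothetical names and data chosen for illustration.

```python
import heapq


def search(postings, query_terms, k, max_score_ratio=1.0):
    """Toy top-k search with upper-bound pruning.

    postings maps term -> {doc_id: score}. Returns (results, docs_scored).
    """
    # Per-term upper bounds, as a WAND-style index would store them.
    max_scores = {t: max(postings[t].values()) for t in query_terms}
    candidates = sorted({d for t in query_terms for d in postings[t]})

    heap = []    # min-heap of (score, doc_id) holding the current top k
    scored = 0   # number of documents that were fully scored
    for doc in candidates:
        bound = sum(max_scores[t] for t in query_terms if doc in postings[t])
        bound *= max_score_ratio  # scale the upper bound by the ratio
        threshold = heap[0][0] if len(heap) == k else float("-inf")
        if bound <= threshold:
            continue  # pruned: the scaled bound cannot beat the current top k
        scored += 1
        score = sum(postings[t].get(doc, 0.0) for t in query_terms)
        if len(heap) < k:
            heapq.heappush(heap, (score, doc))
        elif score > heap[0][0]:
            heapq.heapreplace(heap, (score, doc))
    return sorted(heap, reverse=True), scored


postings = {
    "a": {1: 3.0, 2: 2.75, 3: 0.5},
    "b": {2: 2.5, 3: 0.25},
}

exact, n_exact = search(postings, ["a", "b"], k=1, max_score_ratio=1.0)
approx, n_approx = search(postings, ["a", "b"], k=1, max_score_ratio=0.5)
print(exact, n_exact)    # [(5.25, 2)] 3
print(approx, n_approx)  # [(3.0, 1)] 1
```

With the ratio at 1.0 the loop scores all three candidates and returns the true best document. With the ratio at 0.5 it scores only one and returns a different, lower-scoring document, which shows how a ratio below 1 buys speed at the cost of recall.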