Integrating the library #14

Open
Wulfheart opened this issue Mar 25, 2022 · 22 comments
Comments

@Wulfheart

I would like to integrate the algorithm into an existing Web API. Therefore I thought I could send the data which is currently in a subfolder of the cache directory directly as JSON. Is there any way to integrate it quite seamlessly as a library?

@EbTech
Owner

EbTech commented Mar 25, 2022

Sure, you can make an API that accepts JSON files. What exactly do you want this library to do?

If you want to implement it in Rust, the worldrank-api directory might give you ideas. It's still under construction, but it should work. It doesn't directly use the raw data under cache/, but rather, loads the ratings history under data/. These data files are not in the repository, but must be generated by running multi-skill. The serde library automatically handles the conversion between in-memory structs and JSON files.

@Wulfheart
Author

I currently have a PHP application that generates the player rankings for some contests as defined in the cache directory. Now I would like to integrate the ranking calculation into the current system as easily as possible. I think the simplest way to achieve this is to build a release, deploy it to the system and run it from there.
You don't happen to have a reference implementation of MMR in Go, Python or PHP (or generally any language other than Rust)? I am still having a hard time writing code productively in Rust 😢

@EbTech
Owner

EbTech commented Mar 25, 2022

Oh if you just want MMR itself, it's not a lot of code: https://github.com/EbTech/Elo-MMR/blob/master/multi-skill/src/systems/simple_elo_mmr.rs

Would you be able to translate this to your preferred language? Feel free to ask questions.

@Wulfheart
Author

I think I should be able to do this. However, is there a difference between EloMMR and SimpleEloMMR? How much do they differ?

@EbTech
Owner

EbTech commented Mar 26, 2022

They're the same. The bigger Elo-MMR just has a bunch of (for most purposes) unnecessary features that can be turned on, like approximations that would make it run even faster.

@Wulfheart
Author

I decided to use it as a web API for now, since that is simpler.

Are there any specific values needed for drift_per_second? Above a certain threshold (~ 0.5/(24*60*60)), inactive players are preferred.
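
To make the units in this exchange concrete, here is a minimal sketch of the conversion between a per-day drift and a per-second drift. The helper names are hypothetical and not part of the library; only the factor 24*60*60 comes from the thread.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 seconds in a day


def per_day_to_per_sec(drift_per_day: float) -> float:
    """Convert a drift rate expressed per day into a per-second rate."""
    return drift_per_day / SECONDS_PER_DAY


def per_sec_to_per_day(drift_per_sec: float) -> float:
    """Convert a per-second drift rate back into a per-day rate."""
    return drift_per_sec * SECONDS_PER_DAY


if __name__ == "__main__":
    # Per-day values of the kind tried in the experiment reported in this thread.
    for per_day in (0.0, 0.25, 0.5, 1.0):
        print(per_day, per_day_to_per_sec(per_day))
```

With this helper, the threshold quoted above corresponds to `per_day_to_per_sec(0.5)`.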

@EbTech
Owner

EbTech commented Mar 27, 2022

Inactive players get a boost when the drift is above 0.5 per day? That's unexpected.

@Wulfheart
Author

I made a repo to reproduce it (including steps) with the dataset I am using here.

Below you can see the top-ranked player, the display ranking, the date of that player's last contest and the drift per day. For drifts between 0 and 0.25 (the threshold may be higher) the result seems reasonable, but for 0.5 and above it seems off, as a player who hasn't been active for some years shouldn't be rated the top player.

{'player': 'Sheath', 'display_ranking': 1910} 2018-02-04 14:40:59 0
======
{'player': 'Sheath', 'display_ranking': 1909} 2018-02-04 14:40:59 0.01
======
{'player': 'Sheath', 'display_ranking': 1907} 2018-02-04 14:40:59 0.05
======
{'player': 'Sheath', 'display_ranking': 1903} 2018-02-04 14:40:59 0.1
======
{'player': 'Sheath', 'display_ranking': 1888} 2018-02-04 14:40:59 0.25
======
{'player': 'johnny_low', 'display_ranking': 1929} 2012-07-03 10:36:01 0.5
======
{'player': 'johnny_low', 'display_ranking': 2024} 2012-07-03 10:36:01 1
======
{'player': 'johnny_low', 'display_ranking': 2088} 2012-07-03 10:36:01 1.4285714285714286
======

Do you have any idea? Did I do something wrong in the rust api or is my methodology wrong?

Thank you in advance.

@EbTech
Owner

EbTech commented Mar 28, 2022

Oh hey, I just remembered something that should explain your situation (sorry I've been a bit ill). The decay feature, which I think was pioneered by Glicko, comes with a couple caveats:

  • In the current implementation, rating updates are postponed until the user competes again, so you won't see the decay right away.

  • Since the decay works by increasing a player's sigma, it actually gives future contests (after the period of inactivity) higher weight. The rationale for this is that a player who takes a break from the system, much like a newcomer, has unknown skill. Their decayed rating is the result of the lower bound on their skill having decreased, but behind the scenes the upper bound will also have increased. In practice, you'll probably want to cap sigma to the starting value of 350. You might also decide to decay a player's rating back to the default of 1500, so that a long-inactive player would asymptotically revert to newcomer status. A sample implementation of this is provided in https://github.com/EbTech/Elo-MMR/blob/master/multi-skill/src/systems/common/mod.rs#L26 which might eventually make it into the core update code.

If you disagree with this uncertainty-based approach altogether, you can hack in a rating penalty instead of messing with sigma.
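
The two adjustments suggested above (capping sigma at 350 and letting a long-inactive player's rating revert towards 1500) could be sketched as follows. This is not the code at the linked mod.rs line; the function names, the exponential form of the reversion and the `half_life_sec` parameter are assumptions made for illustration. Only the values 350 and 1500 come from the comment.

```python
SIGMA_CAP = 350.0    # starting sigma mentioned in the comment above
DEFAULT_MU = 1500.0  # default rating mentioned in the comment above


def cap_sigma(sigma: float, cap: float = SIGMA_CAP) -> float:
    """Never let a player's uncertainty exceed that of a newcomer."""
    return min(sigma, cap)


def revert_mu(mu: float, seconds_inactive: float, half_life_sec: float,
              default: float = DEFAULT_MU) -> float:
    """Hypothetical exponential reversion of mu towards the default rating.

    After one half-life of inactivity, half of the gap to the default is gone;
    as inactivity grows, mu approaches the default asymptotically.
    """
    keep = 0.5 ** (seconds_inactive / half_life_sec)
    return default + (mu - default) * keep


if __name__ == "__main__":
    one_year = 365 * 24 * 60 * 60
    print(cap_sigma(420.0))                       # clipped to the cap
    print(revert_mu(1900.0, one_year, one_year))  # halfway back to the default
```

An exponential form is only one way to get the asymptotic behaviour described; a linear pull with a floor at the default would be another.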

@Wulfheart
Author

Thanks for your thorough explanation.
I don't disagree completely with the uncertainty-based approach, but I don't think it is viable in my situation, because I want the rating to decay automatically after a time t even if the player hasn't participated in a contest.

Currently I use experiment.eval() to get the ranking. Where can I hack in the rating penalty?

@EbTech
Owner

EbTech commented Mar 29, 2022

You can use the uncertainty-based decay too, using the same formula to update the rating for any given time. It's just that the current implementation doesn't have a way to ask for an updated rating at a particular time (I may change the design to support that later). If you want to add that yourself, I could imagine several places and haven't thought carefully about which is best. Whenever you retrieve the ratings, if you also retrieve the player's last update time, you can compare that to the current time and compute an adjusted rating that way.
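
A minimal sketch of that idea, computing an adjusted rating at lookup time from the stored rating and the last update time. Everything here is an assumption for illustration: the `StoredRating` record, the function name, the rule that drift adds variance linearly with elapsed time, and the display formula mu - 3 * sigma (the one described elsewhere in this thread). None of it is the library's API.

```python
import math
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class StoredRating:
    """Hypothetical record kept by the calling application."""
    mu: float              # rating mean
    sigma: float           # rating uncertainty (standard deviation)
    last_update: datetime  # time of the player's last rated contest


def display_rating_at(stored: StoredRating, now: datetime,
                      drift_per_sec: float, sigma_cap: float = 350.0) -> float:
    """Display rating as it would look at `now`, without touching stored data.

    Assumes drift adds variance linearly in elapsed time, capped at sigma_cap.
    """
    elapsed = max((now - stored.last_update).total_seconds(), 0.0)
    sigma_now = math.sqrt(stored.sigma ** 2 + drift_per_sec * elapsed)
    return stored.mu - 3.0 * min(sigma_now, sigma_cap)


if __name__ == "__main__":
    last = datetime(2012, 7, 3, 10, 36, 1, tzinfo=timezone.utc)
    player = StoredRating(mu=2000.0, sigma=80.0, last_update=last)
    for days in (0, 365, 3650):
        print(days, display_rating_at(player, last + timedelta(days=days), 0.5))
```

Because the adjustment happens at read time, the stored mu and sigma stay exactly as the rating system left them.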

@stephankokkas

> Oh hey, I just remembered something that should explain your situation (sorry I've been a bit ill). The decay feature, which I think was pioneered by Glicko, comes with a couple caveats:
>
>   • In the current implementation, rating updates are postponed until the user competes again, so you won't see the decay right away.
>   • Since the decay works by increasing a player's sigma, it actually gives future contests (after the period of inactivity) higher weight. The rationale for this is that a player who takes a break from the system, much like a newcomer, has unknown skill. Their decayed rating is the result of the lower bound on their skill having decreased, but behind the scenes the upper bound will also have increased. In practice, you'll probably want to cap sigma to the starting value of 350. You might also decide to decay a player's rating back to the default of 1500, so that a long-inactive player would asymptotically revert to newcomer status. A sample implementation of this is provided in https://github.com/EbTech/Elo-MMR/blob/master/multi-skill/src/systems/common/mod.rs#L26 which might eventually make it into the core update code.
>
> If you disagree with this uncertainty-based approach altogether, you can hack in a rating penalty instead of messing with sigma.

I have a tendency to dislike this approach - only because if a player is absent from competition for a while it is unfair to assume that their skill will decrease. A possible reason for why a competitor is not competing may be to practise a particular skill or improve in an aspect of the game / task. I am still thinking of a solution to this problem.

@EbTech
Owner

EbTech commented Apr 3, 2022

@stephankokkas in that case, wouldn't the uncertainty-based approach be more suitable? That way, the display rating is temporarily lowered, but is quickly raised when the player returns. Maybe you specifically dislike mu returning to 1500? In that case, you might leave mu untouched, but gradually increase sigma towards an asymptotic cap such as 350. If the community gravitates towards a particular approach, I might make that the default; for now, there appears to be room for competing philosophies.

@Wulfheart
Author

> I have a tendency to dislike this approach - only because if a player is absent from competition for a while it is unfair to assume that their skill will decrease. A possible reason for why a competitor is not competing may be to practise a particular skill or improve in an aspect of the game / task. I am still thinking of a solution to this problem.

@stephankokkas I am unable to grasp the concept @EbTech suggests. However, lowering the score manually incentivizes users to play more often.

@Wulfheart
Author

@stephankokkas have you come up with a better solution?

@stephankokkas

Without changing the code, yes. I was able to acquire training data of my competitors and incorporated it that way into the rating of each player.

@Wulfheart
Author

I don’t understand it completely. How can I integrate it? Is there a rating decay?

@where-is-paul
Collaborator

To expand more on Aram's solution:

The "display rating" in our system is calculated as true_rating - 3 * uncertainty -- in statistical terms the true rating is the average of some distribution and the uncertainty is the standard deviation.

Aram is suggesting steadily increasing the "uncertainty" over time. On the front-end, this looks like the rating is gradually going down over time.

Increasing the uncertainty means that the "true rating" in our system stays the same, but the system is less certain about the spread around the true rating. When users participate in a contest, their performance uncertainty is decreased by the system because we get more information about them. This means that if their skill has not decayed, then they will return to their original rating quickly.

Hope this helps,
-- Paul
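
The effect described above can be illustrated with a small numeric sketch. The display formula true_rating - 3 * uncertainty is the one from this comment. The update step is a plain Gaussian (conjugate) update chosen for simplicity, not Elo-MMR's actual update rule, and all numbers (`idle_sigma`, `perf_sigma`, the ratings) are made up for illustration.

```python
import math


def display_rating(mu: float, sigma: float) -> float:
    """Display rating as described above: true rating minus 3x uncertainty."""
    return mu - 3.0 * sigma


def gaussian_update(mu: float, sigma: float,
                    performance: float, perf_sigma: float) -> tuple[float, float]:
    """Conjugate Gaussian update (a simplification, NOT the Elo-MMR update).

    Combines the prior N(mu, sigma^2) with one noisy performance observation.
    """
    precision = 1.0 / sigma ** 2 + 1.0 / perf_sigma ** 2
    new_mu = (mu / sigma ** 2 + performance / perf_sigma ** 2) / precision
    return new_mu, math.sqrt(1.0 / precision)


mu, sigma = 1900.0, 80.0
before = display_rating(mu, sigma)     # 1660.0 while the player is active

idle_sigma = 200.0                     # hypothetical sigma after a long break
idle = display_rating(mu, idle_sigma)  # 1300.0: looks like a rating drop

# The player returns and performs at their old level in one contest.
mu2, sigma2 = gaussian_update(mu, idle_sigma, performance=1900.0, perf_sigma=100.0)
after = display_rating(mu2, sigma2)    # most of the drop is recovered

if __name__ == "__main__":
    print(before, idle, round(after, 1))
```

In this toy model the mean never moves, so the whole dip and recovery in the display rating comes from the uncertainty term alone.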

@stephankokkas

@where-is-paul would you be able to explain how to enable this feature when rating? Is it a tag that can be used in the command line? Or perhaps something that needs adjusting in the source code?

Thanks

@Wulfheart
Author

Some commits landed recently in this repository, but I don't know exactly what they do. At least some APIs have changed, for example in SimpleEloMMR.
@where-is-paul were you referring to these changes?

@EbTech
Owner

EbTech commented Jul 8, 2022

@stephankokkas conservative display ratings are computed by https://github.com/EbTech/Elo-MMR/blob/b513efe/multi-skill/src/systems/common/player.rs#L15, and appear in the all_players.csv file that's produced if you follow the README instructions. Keeping in mind the caveats I mentioned above, if you still want to enable time-based decay, the relevant parameter is EloMMR::drift_per_sec.

@Wulfheart Unless I'm forgetting something, the old APIs should still work. A bit of backward-compatibility was lost when minor changes were made to the file format. We did add a new way to run the rating system using config files (briefly mentioned in README), but that's not quite ready for public primetime yet. We'll document it better when we intend for more people to use it.

@Wulfheart
Author

@EbTech just to be sure: the drift per second is only applied when the player participates in another game, isn't it?
So the easiest solution would be to add a user-facing decay and let the library itself handle the calculation without this decay?
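
One possible shape for such a user-facing decay, kept entirely outside the library, is sketched below. The function name, the grace period, the linear penalty and its default values are all hypothetical choices; the library's stored ratings are never modified.

```python
from datetime import datetime, timedelta


def decayed_display(rating: float, last_contest: datetime, now: datetime,
                    grace_days: float = 30.0, points_per_day: float = 1.0,
                    floor: float = 0.0) -> float:
    """Apply a linear inactivity penalty to a rating for display purposes only.

    No penalty is applied during the grace period; afterwards the displayed
    rating loses `points_per_day` per idle day, never dropping below `floor`.
    """
    idle_days = (now - last_contest).total_seconds() / 86400.0
    penalty = max(idle_days - grace_days, 0.0) * points_per_day
    return max(rating - penalty, floor)


if __name__ == "__main__":
    last = datetime(2018, 2, 4, 14, 40, 59)
    for days in (10, 130, 5000):
        print(days, decayed_display(1910.0, last, last + timedelta(days=days)))
```

Since the penalty is computed from the last contest date on every read, a player's displayed rating returns to the library's value as soon as they compete again.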


4 participants