-
Notifications
You must be signed in to change notification settings - Fork 19
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored reward calculation methods #268
base: refactor-validator
Are you sure you want to change the base?
Conversation
folding/base/reward.py
Outdated
extra_info=batch_rewards_output.extra_info, | ||
) | ||
|
||
async def calculate_final_reward( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see your flow here, but I think it would make more sense to have class OrganicReward(BaseReward)
so you don't have to have any of this baked in logic for checking if a set of data comes from a specific source. It would give us more flexibility, but this is a good start
folding/base/reward.py
Outdated
if job.event["is_organic"]: | ||
organic_multiplier = 10.0 | ||
|
||
return rewards * priority_multiplier * organic_multiplier |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels like different reward pipelines should come with formulaic constructions of what their reward should be based on priority and if it's organic or not. Not sure, but this is what my gut is saying. In the case of different challenges, this delineation is much more clear, but when it's just is/is_not organics, I guess it's more difficult.
Is there any merit in doing:
class FoldingReward(BaseReward): ...
class SyntheticFoldingReward(FoldingReward)
class OrganicFoldingReward(FoldingReward)
This way, you can easily call what you need and you don't need to be restricted to looking for tags in the job
event, in case things change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and you can just construct the correct pipeline based on the entry point of the query
folding/base/reward.py
Outdated
async def calculate_final_reward( | ||
self, rewards: torch.Tensor, job: Job | ||
) -> torch.Tensor: | ||
# priority_multiplier = 1 + (job.priority - 1) * 0.1 TODO: Implement priority |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotta implement this once the global job pool is on the horizon
folding/base/reward.py
Outdated
organic_multiplier = 1.0 | ||
if "is_organic" in job.event.keys(): | ||
if job.event["is_organic"]: | ||
organic_multiplier = 10.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should make this a parameter set in the config, same for the priority multiplier
Refactored rewards so that we have a base reward class so that we can implement the rewarding for different tasks in the future