Refactored reward calculation methods #268

Sarkosos · 2024-11-24T14:46:57Z

Refactored the reward calculation around a base reward class, so that we can implement rewards for different tasks in the future.

mccrindlebrian · 2024-11-24T15:11:57Z

folding/base/reward.py

+            extra_info=batch_rewards_output.extra_info,
+        )
+
+    async def calculate_final_reward(


I see your flow here, but I think it would make more sense to have class OrganicReward(BaseReward), so you don't need any of this baked-in logic for checking whether a set of data comes from a specific source. That would give us more flexibility, but this is a good start.

mccrindlebrian · 2024-11-24T15:19:48Z

folding/base/reward.py

+            if job.event["is_organic"]:
+                organic_multiplier = 10.0
+
+        return rewards * priority_multiplier * organic_multiplier


It feels like different reward pipelines should come with formulaic constructions of what their reward should be, based on priority and whether the job is organic. Not sure, but this is what my gut is saying. For different challenges this delineation is much clearer, but when the only distinction is organic vs. non-organic, I guess it's more difficult.

Is there any merit in doing:

class FoldingReward(BaseReward): ...
class SyntheticFoldingReward(FoldingReward)
class OrganicFoldingReward(FoldingReward)

This way you can easily call what you need, and you aren't restricted to looking for tags in the job event in case things change. You can then just construct the correct pipeline based on the entry point of the query.
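A minimal sketch of the hierarchy suggested above, to make the idea concrete. This is not the code in folding/base/reward.py: the `multiplier` attribute, the `reward_pipeline_for` helper, and the entry-point names are hypothetical, and plain lists of floats stand in for `torch.Tensor` to keep the example dependency-free. Only the class names and the 10.0 organic multiplier come from this thread.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import List


class BaseReward(ABC):
    """Hypothetical stand-in for the base class; the real one may differ."""

    @abstractmethod
    async def calculate_final_reward(self, rewards: List[float]) -> List[float]:
        ...


class FoldingReward(BaseReward):
    # Each subclass declares its own multiplier instead of inspecting job.event.
    multiplier: float = 1.0

    async def calculate_final_reward(self, rewards: List[float]) -> List[float]:
        return [r * self.multiplier for r in rewards]


class SyntheticFoldingReward(FoldingReward):
    multiplier = 1.0


class OrganicFoldingReward(FoldingReward):
    multiplier = 10.0  # value taken from the diff under review


def reward_pipeline_for(entry_point: str) -> FoldingReward:
    """Pick the pipeline from the query's entry point (names are illustrative)."""
    pipelines = {
        "organic": OrganicFoldingReward,
        "synthetic": SyntheticFoldingReward,
    }
    return pipelines[entry_point]()
```

The caller would then resolve the pipeline once, for example `asyncio.run(reward_pipeline_for("organic").calculate_final_reward(rewards))`, and no reward code would need to read `is_organic` from the job event.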

Sarkosos · 2024-11-24T15:55:04Z

folding/base/reward.py

+    async def calculate_final_reward(
+        self, rewards: torch.Tensor, job: Job
+    ) -> torch.Tensor:
+        # priority_multiplier = 1 + (job.priority - 1) * 0.1 TODO: Implement priority


Gotta implement this once the global job pool is on the horizon
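For reference, the commented-out formula could look roughly like this once it is enabled. This is only a sketch: the function name and the `step` parameter are hypothetical, and it assumes `job.priority` is an integer starting at 1, which the diff does not confirm.

```python
def priority_multiplier(priority: int, step: float = 0.1) -> float:
    """Formula from the commented-out TODO: 1 + (priority - 1) * step."""
    if priority < 1:
        # Assumption: priorities start at 1, so the lowest priority is neutral.
        raise ValueError("priority is assumed to start at 1")
    return 1 + (priority - 1) * step
```

Under that assumption a priority-1 job is left unscaled, and each additional priority level adds 10% to the reward.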

Sarkosos · 2024-11-24T15:55:15Z

folding/base/reward.py

+        organic_multiplier = 1.0
+        if "is_organic" in job.event.keys():
+            if job.event["is_organic"]:
+                organic_multiplier = 10.0


We should make this a parameter set in the config; same for the priority multiplier.
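One possible shape for that, as a sketch only. `RewardConfig` and its field names are hypothetical and not part of the existing config; the defaults mirror the hardcoded values in the diff.

```python
from dataclasses import dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class RewardConfig:
    # Hypothetical config fields; defaults mirror the values in the diff.
    organic_multiplier: float = 10.0
    priority_step: float = 0.1


def organic_multiplier(event: Dict[str, Any], config: RewardConfig) -> float:
    # dict.get replaces the nested "key in dict" and truthiness checks.
    return config.organic_multiplier if event.get("is_organic", False) else 1.0
```

With this, changing the organic boost would be a config change, for example `RewardConfig(organic_multiplier=2.5)`, rather than a code edit.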

Sarkosos added 2 commits November 24, 2024 14:44

Refactored reward calculation methods

56e2111

black

ef48dee

mccrindlebrian reviewed Nov 24, 2024

Sarkosos commented Nov 24, 2024

Made it so that the reward does not rely on tags

20ce396

mccrindlebrian changed the base branch from staging to refactor-validator January 27, 2025 09:21
