Replies: 3 comments 2 replies
-
I think we definitely would like to support local models via API on either vLLM or Ollama in the future. From a technical standpoint, the LLM needs to:

From my experiments, all frontier models from GPT-4o to Gemini are capable of 1 & 2 out of the box, and even cost-optimized models like GPT-4o-mini can be fine-tuned to understand 1 & 2 as well. I think it's an open question whether the 72b version would work well in this format, but it would definitely be worth a shot. The barrier to running this experiment would be implementing a
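As a rough sketch of what the local-model path could look like: both vLLM and Ollama can expose an OpenAI-compatible chat completions endpoint, so the client side mostly reduces to pointing the request at a local URL. The endpoint, model name, and helper below are illustrative assumptions, not project code:

```python
import json

# vLLM (`vllm serve ...`) and Ollama both expose an OpenAI-compatible
# chat completions API; this URL is the vLLM default and is an assumption.
LOCAL_ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "local-model") -> dict:
    """Build an OpenAI-compatible chat request body for a locally served model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.0,  # deterministic output helps when parsing actions
    }

# Sending it would look something like:
#   requests.post(LOCAL_ENDPOINT, json=build_chat_request("Click the login button"))
body = build_chat_request("Click the login button")
print(json.dumps(body, indent=2))
```

Because the payload shape matches the hosted-API case, swapping between a frontier model and a local one would mostly be a matter of configuration rather than new client logic.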
-
@wbste Molmo definitely does look promising.
-
We're starting to gather data for a fine-tune of Molmo! https://github.com/theredsix/cerebellum/tree/mind2web/training |
-
Any chance that a model like Molmo (that can "point") could be used as a self-host solution? I believe vLLM could be used as an endpoint?
https://molmo.allenai.org/blog
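If Molmo's "pointing" output is the XML-ish tag shown in the demo, e.g. `<point x="61.5" y="40.4" alt="login button">login button</point>` with coordinates as percentages of the image size, then mapping its answers to clickable pixel coordinates could be a small parser. The tag format and percentage scaling here are assumptions based on the demo output, not a documented API:

```python
import re

# Assumes Molmo emits points as tags with percentage coordinates, e.g.
# '<point x="61.5" y="40.4" alt="login button">login button</point>'.
# This format is an assumption from the Molmo demo, not a stable API.
POINT_RE = re.compile(r'<point\s+x="([\d.]+)"\s+y="([\d.]+)"[^>]*>')

def points_to_pixels(text: str, width: int, height: int) -> list[tuple[int, int]]:
    """Convert percentage-based point tags into (x, y) pixel coordinates."""
    return [
        (round(float(x) / 100 * width), round(float(y) / 100 * height))
        for x, y in POINT_RE.findall(text)
    ]

print(points_to_pixels('<point x="50.0" y="25.0" alt="btn">btn</point>', 1280, 720))
# → [(640, 180)]
```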