docs(models): clarify open-source model guidance and system prompt usage #4276

aworki wants to merge 1 commit into browser-use:main
Conversation
Greptile Summary

This PR adds a short documentation subsection in `docs/supported-models.mdx`. The content is useful and directly addresses the concern raised in issue #4225. However, there are two structural concerns worth addressing before merging (detailed in the review comments below).
Confidence Score: 4/5
Important Files Changed
Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[User selects an open-source Browser Use model] --> B{Is it a fine-tuned\nBrowser Use model?}
B -- Yes --> C[Keep Browser Use\nsystem prompt enabled]
B -- No --> D[Keep Browser Use system prompt\n+ run smoke task]
D --> E{Action formatting\nstable?}
E -- Yes --> F[Proceed with model]
E -- No --> G[Add explicit action-schema\nexamples to prompt]
C --> F
Last reviewed commit: 01a5fa7
#### Open-source Browser Use models (e.g. `browser-use/bu-30b-a3b-preview`)

<Info>
If you are using an open-source Browser Use fine-tuned model, keep the Browser Use system prompt enabled. The fine-tuning and prompt are designed to work together for stable action formatting and tool use.
</Info>

For non-fine-tuned/general models, use the same Browser Use system prompt and validate behavior with a short smoke task first. If action formatting is unstable, add explicit action-schema examples in your prompt (as noted in the Qwen section below).
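The smoke-task validation step above can be made concrete with a quick shape check on a model's emitted action batch. This is a pure-Python illustration; `check_action_batch` and `is_well_formed_action` are hypothetical helpers, not part of the browser-use library:

```python
import json

def is_well_formed_action(action: object) -> bool:
    """A well-formed action is a single-key dict whose value is a dict of parameters."""
    return (
        isinstance(action, dict)
        and len(action) == 1
        and isinstance(next(iter(action.values())), dict)
    )

def check_action_batch(raw: str) -> bool:
    """Parse a model's JSON action list and verify every action's shape."""
    try:
        actions = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(actions, list) and all(is_well_formed_action(a) for a in actions)

# Well-formed: parameters are a nested object.
assert check_action_batch('[{"navigate": {"url": "google.com"}}]')
# Malformed: parameter is a bare string (the failure mode flagged for some models).
assert not check_action_batch('[{"navigate": "google.com"}]')
```

Running a check like this against the first step of a trivial task (e.g. "open google.com") is a cheap way to catch unstable action formatting before committing to a model.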
Section placement creates misleading association
This new subsection is nested as #### under the ### Browser Use section, which is specifically about the commercial ChatBrowserUse() cloud API. Open-source models like browser-use/bu-30b-a3b-preview are self-hosted (e.g. via Ollama or vLLM) and accessed through a different client class (e.g. ChatOllama or ChatOpenAI with a custom base_url), not through ChatBrowserUse().
Placing this here could cause users to incorrectly assume they can load the open-source model via ChatBrowserUse(). Consider moving this guidance to a standalone ## section (similar to the Ollama and Groq sections), or adding it as a subsection of the Ollama section, since that's the most common local inference backend for these models.
Additionally, no code example is provided, while every other section in this file includes one. Without an example, users have no reference for which client class to use or how to point it at a local inference server:
from browser_use import Agent, ChatOllama # or ChatOpenAI with base_url
# After: ollama pull browser-use/bu-30b-a3b-preview
llm = ChatOllama(model="browser-use/bu-30b-a3b-preview")
agent = Agent(task="...", llm=llm)
Cross-reference does not point to actionable guidance
The text says "add explicit action-schema examples in your prompt (as noted in the Qwen section below)", but the Qwen section only describes the problem (incorrect action schema format) — it doesn't actually show how to add action-schema examples to a prompt. A user following this reference will not find the concrete guidance they need.
Either:
- Make this section self-contained by including an example of what an action-schema prompt addition looks like, or
- Reword to avoid implying the Qwen section provides a how-to, e.g. "add concrete examples of the correct action format to your prompt (see the Qwen section for examples of the malformed schema to guard against)."
Suggested change:

- For non-fine-tuned/general models, use the same Browser Use system prompt and validate behavior with a short smoke task first. If action formatting is unstable, add explicit action-schema examples in your prompt (as noted in the Qwen section below).
+ For non-fine-tuned/general models, use the same Browser Use system prompt and validate behavior with a short smoke task first. If action formatting is unstable, add concrete examples of the correct action format to your prompt (e.g., `[{"navigate": {"url": "google.com"}}]` rather than `[{"navigate": "google.com"}]`).
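Making the section self-contained, as the reviewer suggests, could look like appending explicit action-format examples to the system prompt. The sketch below just composes strings; `with_schema_examples` is a hypothetical helper, and the exact prompt-extension hook depends on your setup:

```python
# Correct-vs-incorrect examples drawn from the suggested wording above.
ACTION_SCHEMA_EXAMPLES = (
    "Actions are a JSON list of single-key objects whose value is a parameter object.\n"
    'Correct:   [{"navigate": {"url": "google.com"}}]\n'
    'Incorrect: [{"navigate": "google.com"}]\n'
)

def with_schema_examples(system_prompt: str) -> str:
    """Return the base prompt with the action-format examples appended."""
    return system_prompt.rstrip() + "\n\n" + ACTION_SCHEMA_EXAMPLES
```

The composed string can then be passed wherever your client accepts a system prompt, keeping the Browser Use prompt itself unchanged.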
Hi maintainers — gentle follow-up. I currently see all workflow checks green except license/cla (still pending).

Could you advise whether there is any additional contributor-side step I should complete to satisfy CLA for this PR? I'll do it immediately. Thank you!
Summary
Adds explicit guidance for open-source Browser Use models in the Supported Models docs.
What changed
- `docs/supported-models.mdx`: added a subsection on open-source Browser Use models (e.g. `browser-use/bu-30b-a3b-preview`)

Why
Issue #4225 asks whether open-source fine-tuned BU models should be used with the Browser Use system prompt, and how that differs from using general cloud models. This adds direct guidance where users choose models.
Closes #4225
Summary by cubic
Adds a new subsection to Supported Models with guidance for open-source Browser Use models, addressing #4225. Keep the Browser Use system prompt for fine-tuned models; for general models, run a quick smoke test and add action-schema examples if action formatting is unstable.
Written for commit 01a5fa7. Summary will update on new commits.