Chat UI model dropdown is ignored by the server #5

Closed
opened 2026-04-23 10:24:36 +00:00 by rawan · 3 comments
Member

Summary

The model selector dropdown in the chat UI (#chatModel) has no effect on which LLM is used. The selection is sent to the server but silently discarded.

Repro

  1. Open the Hero Agent UI (Chat tab).
  2. Pick any model from the dropdown (e.g. google/gemini-3-flash).
  3. Send a message.
  4. Observe: the agent always uses the server's default_model regardless of selection.

Root cause

The frontend correctly sends model in the request body:

// crates/hero_agent_ui/static/js/dashboard.js:274-275
var model = document.getElementById('chatModel').value;
var body = { message: msg, model: model, stream: true };

The server deserializes it into ChatRequest.model:

// crates/hero_agent_server/src/routes.rs:582
#[serde(default)]
model: Option<String>,

…but then never reads request.model. The chat handler only forwards user_ai_config (BYOK triple) into agent.handle_message:

// crates/hero_agent_server/src/routes.rs:649-651
agent
    .handle_message(&message, &user_sid, &context, conv_id, true, user_mcp_key.as_deref(), user_ai_config)
    .await

So when no BYOK config is supplied, the agent falls through to LlmClient::default_model (set from aibroker_models[0] → openrouter_models[0] → groq_models[0] in crates/hero_agent/src/llm_client.rs:156-162), and the dropdown choice is dropped on the floor.

Suggested fix

Thread request.model into handle_message (or the LlmOptions it builds) so that when no BYOK config is active, the selected model overrides default_model. BYOK should still take precedence when present.
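
A minimal sketch of the intended precedence (illustrative only: user_ai_config and request.model come from the existing handler, and existing_default_selection is a hypothetical stand-in for today's default_model / select_model logic; the spec comment below works out the real plumbing):

// Sketch of the precedence only; existing_default_selection() is a
// hypothetical stand-in for today's default_model / select_model path.
let model: String = user_ai_config
    .as_ref()
    .map(|cfg| cfg.model.clone())      // 1) BYOK wins when present
    .or_else(|| request.model.clone()) // 2) otherwise honor the UI dropdown
    .unwrap_or_else(|| existing_default_selection()); // 3) otherwise today's behavior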

Affected files

  • crates/hero_agent_server/src/routes.rs (chat handler)
  • crates/hero_agent/src/agent.rs (handle_message signature + LlmOptions::model)
rawan self-assigned this 2026-04-23 13:05:30 +00:00
Author
Member

Implementation Spec: Wire ChatRequest.model through to the agent

Objective

Make the Chat UI's model dropdown (#chatModel) actually select the LLM used for a request. Thread ChatRequest.model from the HTTP handler through Agent::handle_message down to the LlmOptions::model used by both the quick-response and tool-loop code paths, while preserving BYOK precedence and all existing behavior when no model is sent.

Requirements

  • ChatRequest.model, when non-empty and non-"auto", MUST override the server-side LlmClient::default_model for the Chat UI code path (POST /api/chat).
  • BYOK (user_ai_config) MUST continue to take precedence over a UI-selected model. If both are present, BYOK wins.
  • If ChatRequest.model is None, empty string, or the literal "auto" sentinel (the dropdown ships auto as its default option), behavior MUST be identical to today (complexity-based select_model in the tool path, Lightweight in the quick-response path).
  • The new behavior MUST flow into both branches of handle_message: the quick_response path (triage != Tools) AND the agent_loop path (triage == Tools).
  • Agent::handle_message gets a new selected_model: Option<&str> parameter. All 5 non-chat call sites (whatsapp.rs, telegram.rs, cli.rs, routes.rs:279 rpc_chat, routes.rs:984 voice) MUST be updated to pass None — they have no UI dropdown and MUST keep today's behavior.
  • The model field on ChatRequest MUST remain optional and backward-compatible (clients that omit it still work).
  • No new tests are required (the crate has none for this handler), but the change must not break the existing 8 unit tests in routes.rs or any llm_client.rs tests.

Files to modify

  • crates/hero_agent/src/agent.rs — add selected_model: Option<&str> param to handle_message; thread into quick_response and into the model selection block before agent_loop; update quick_response signature to accept it. BYOK still wins.
  • crates/hero_agent_server/src/routes.rs — in chat(), normalize request.model (trim, drop empty and "auto") and pass it into handle_message. Update the 2 other call sites in this file (rpc_chat ~L279, voice handler ~L984) to pass None.
  • crates/hero_agent/src/channels/whatsapp.rs — add trailing None arg.
  • crates/hero_agent/src/channels/telegram.rs — add trailing None arg.
  • crates/hero_agent/src/channels/cli.rs — add trailing None arg.

No files to create. No schema/DB changes. No frontend changes (the JS already sends model).

Step-by-step implementation plan

Step 1 — Extend Agent::handle_message signature and thread model through (crates/hero_agent/src/agent.rs)

Dependencies: none.

  1. At crates/hero_agent/src/agent.rs line 121, add a new parameter at the end of handle_message (a sketch of the full resulting signature follows this step):
    selected_model: Option<&str>,
    
  2. At line 186, change the quick_response call to also pass selected_model:
    let response = self.quick_response(&input, &conv_id, user_sid, stream, user_ai_config.as_ref(), selected_model).await?;
    
  3. Replace the model selection block at lines 273–281 with BYOK-first, then UI-selected, then complexity-based:
    let model = if let Some(ref user_config) = user_ai_config {
        tracing::info!("[MODEL] Using user AI model from BYOK config: {}", user_config.model);
        user_config.model.clone()
    } else if let Some(m) = selected_model {
        tracing::info!("[MODEL] Using UI-selected model: {}", m);
        m.to_string()
    } else {
        tracing::debug!("[MODEL] No BYOK or UI selection, selecting model by complexity");
        let tier = LlmClient::classify_complexity(&input);
        self.llm.select_model(tier)
    };
    
    The existing call to agent_loop at line 284 already passes &model and does not need a signature change — agent_loop uses model as-is in LlmOptions { model: Some(model.to_string()), ... } at lines 555–560.
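
For orientation, the full signature after this step would look roughly as follows. The parameter names and types ahead of user_mcp_key are inferred from the routes.rs call site, not copied from agent.rs, so treat them as assumptions:

    // crates/hero_agent/src/agent.rs -- illustrative signature; only selected_model is actually new
    pub async fn handle_message(
        &self,
        input: &str,
        user_sid: &str,
        context: &RequestContext,             // assumed type name
        conv_id: Option<String>,              // assumed type
        stream: bool,
        user_mcp_key: Option<&str>,
        user_ai_config: Option<UserAiConfig>, // assumed type name
        selected_model: Option<&str>,         // new: UI-selected model; None for non-chat channels
    ) -> anyhow::Result<String> {             // assumed return type
        // placeholder for the existing body, unchanged apart from items 2 and 3 above
        todo!()
    }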

Step 2 — Extend quick_response signature and thread model through (crates/hero_agent/src/agent.rs)

Dependencies: Step 1.

  1. At line 325, add selected_model: Option<&str> as the last parameter of quick_response.
  2. Replace the model-selection block at lines 370–377 with:
    let model_name = if let Some(config) = user_ai_config {
        tracing::info!("[QUICK_RESPONSE_MODEL] Using user AI model from BYOK config: {}", config.model);
        config.model.clone()
    } else if let Some(m) = selected_model {
        tracing::info!("[QUICK_RESPONSE_MODEL] Using UI-selected model: {}", m);
        m.to_string()
    } else {
        tracing::debug!("[QUICK_RESPONSE_MODEL] No BYOK or UI selection, using Lightweight model");
        self.llm.select_model(ModelTier::Lightweight)
    };
    
    The rest of quick_response (building LlmOptions { model: Some(model_name.clone()), ... }) stays unchanged.

Step 3 — Update chat handler to pass the UI model (crates/hero_agent_server/src/routes.rs)

Dependencies: Step 1.

  1. In the chat() function, after line 637 (after the user_ai_config match) and before the tokio::spawn at line 643, add:
    // Normalize UI-selected model: treat empty string and the "auto" sentinel as "no selection".
    let selected_model = request.model.as_ref().and_then(|m| {
        let t = m.trim();
        if t.is_empty() || t.eq_ignore_ascii_case("auto") { None } else { Some(t.to_string()) }
    });
    
  2. Update the handle_message call at lines 649–651 to pass selected_model.as_deref() as the new trailing argument.

Step 4 — Update the other call sites in routes.rs to pass None (crates/hero_agent_server/src/routes.rs)

Dependencies: Step 1.

  1. Line 279 (rpc_chat): change to
    agent.handle_message(&message, &user_sid, &context, conv_id, false, None, None, None).await
    
  2. Line 984 (voice handler): change to
    .handle_message(&transcript, user_sid, context, conv_id, false, None, None, None)
    

Step 5 — Update channel call sites to pass None (crates/hero_agent/src/channels/*.rs)

Dependencies: Step 1.

  1. crates/hero_agent/src/channels/whatsapp.rs line 267: add a trailing None.
  2. crates/hero_agent/src/channels/telegram.rs line 204: add a trailing None.
  3. crates/hero_agent/src/channels/cli.rs line 63: add a trailing None.

Step 6 — Build and verify

Dependencies: Steps 1–5.

  1. From /home/rawan/codescalers/hero_agent, run cargo build -p hero_agent -p hero_agent_server to confirm all 6 call sites compile.
  2. Run cargo test -p hero_agent_server to confirm the 8 existing unit tests still pass.
  3. Manual smoke test (optional but recommended): start the server, open the Chat tab, pick a non-default model from the dropdown, send a message, and confirm the logs show [MODEL] Using UI-selected model: .... Then pick auto (or use a config whose models list is empty) and confirm the request falls back to the existing complexity-based path, logging [MODEL] No BYOK or UI selection, selecting model by complexity.

Acceptance criteria

  • Agent::handle_message has a new selected_model: Option<&str> parameter at the end of its signature.
  • Agent::quick_response has a new selected_model: Option<&str> parameter at the end of its signature.
  • Chat route (POST /api/chat) passes the request's model (trimmed, with empty and "auto" treated as None) into handle_message.
  • BYOK user_ai_config.model takes precedence over selected_model in both handle_message (tool path) and quick_response (chitchat path).
  • When neither BYOK nor a UI selection is present, model selection falls back to the existing complexity-based behavior (tool path) or Lightweight (quick-response path).
  • All 5 non-chat call sites (rpc_chat, voice handler, whatsapp, telegram, cli) pass None for selected_model and their observable behavior is unchanged.
  • cargo build -p hero_agent -p hero_agent_server succeeds.
  • cargo test -p hero_agent_server passes (8/8 existing tests).
  • No frontend changes; dashboard.js still sends { message, model, stream } as it does today.

Notes / Edge cases

  • The frontend sends the literal string auto when the models list is empty (see dashboard.js:521 — models = [cfg.defaultModel || 'auto']). That string is NOT a real model name on any provider, so step 3's normalization MUST strip it. Failing to strip "auto" would result in all providers rejecting the request.
  • ChatRequest.model already has #[serde(default)] on line 579–582 of routes.rs — no schema change is needed, clients omitting model continue to work.
  • agent_loop already accepts model: &str and builds LlmOptions { model: Some(model.to_string()), ... } — no change needed there. The one-line options.model lookup inside LlmClient::chat_completion* already uses the passed model and falls back to default_model only when options.model is None.
  • LlmClient::default_model is derived in the constructor (llm_client.rs:156–162) as aibroker_models[0] → openrouter_models[0] → groq_models[0] → "claude-sonnet-4.5". This remains the ultimate fallback; we only bypass it when a UI selection is actively provided.
  • Do NOT validate the UI-selected model against the server's known models list. The provider chain (aibroker → openrouter → groq) already handles unknown model names by failing over. Validating in the handler would couple the handler to config and break the existing loose coupling between UI dropdown population and request handling.
  • Logging: the [MODEL] / [QUICK_RESPONSE_MODEL] tracing prefixes already exist in today's code — keep the same style for the new branch so existing log-watching dashboards keep working.
  • BYOK precedence rule is preserved exactly as it is today: if the frontend proxy injects user_ai_provider + user_ai_api_key + user_ai_model, the user_ai_config is built (routes.rs:622–637) and wins over any selected_model. This matches the issue's requirement: "BYOK should still take precedence when present."
Author
Member

Test Results

cargo build -p hero_agent -p hero_agent_server succeeded (no errors, only pre-existing warnings).
cargo test -p hero_agent_server executed the unit test suite:

  • Total: 8
  • Passed: 8
  • Failed: 0
  • Ignored: 0
test routes::tests::test_orpheus_voice_mapping ... ok
test routes::tests::test_split_into_sentences_exclamation_question ... ok
test routes::tests::test_split_into_sentences_no_punctuation ... ok
test routes::tests::test_split_into_sentences_markdown_stripped ... ok
test routes::tests::test_split_into_sentences_empty ... ok
test routes::tests::test_split_into_sentences_min_length ... ok
test routes::tests::test_split_into_sentences_short_text ... ok
test routes::tests::test_split_into_sentences_basic ... ok

test result: ok. 8 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out; finished in 0.00s

All pre-existing tests continue to pass. The change is a pure parameter addition and does not alter any of the behaviors exercised by the current test suite.

Author
Member

Implementation Summary

The Chat UI #chatModel dropdown is now honored server-side. ChatRequest.model is threaded from the /api/chat handler through Agent::handle_message into both the tool-loop (agent_loop) and chitchat (quick_response) model-selection paths. BYOK (user_ai_config) still takes precedence; when no BYOK is active and a non-empty, non-auto model is provided by the client, that model overrides LlmClient::default_model.

Files modified

  • crates/hero_agent/src/agent.rs — added selected_model: Option<&str> parameter to Agent::handle_message and Agent::quick_response; inserted a UI-selected branch between the BYOK and complexity/lightweight fallbacks in both.
  • crates/hero_agent_server/src/routes.rs — chat() normalizes request.model (trims whitespace, treats empty and the auto sentinel as None) and forwards it into handle_message. rpc_chat and the voice handler were updated to pass None — their observable behavior is unchanged.
  • crates/hero_agent/src/channels/whatsapp.rs — pass None for the new parameter.
  • crates/hero_agent/src/channels/telegram.rs — pass None for the new parameter.
  • crates/hero_agent/src/channels/cli.rs — pass None for the new parameter.

No frontend changes were needed (dashboard.js was already sending model). No tests were added; the existing unit-test surface does not cover handle_message or the chat route. A follow-up PR could add a small unit test for the model normalization helper ("" | "auto" | " Auto " -> None).
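
If that normalization is extracted into a helper, the test could look like the following sketch (normalize_selected_model is a hypothetical name; the merged change inlines this logic as a closure in chat()):

// Hypothetical extraction of the inline normalization from chat().
fn normalize_selected_model(model: Option<&str>) -> Option<String> {
    model.and_then(|m| {
        let t = m.trim();
        if t.is_empty() || t.eq_ignore_ascii_case("auto") {
            None
        } else {
            Some(t.to_string())
        }
    })
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn normalizes_empty_and_auto_to_none() {
        assert_eq!(normalize_selected_model(None), None);
        assert_eq!(normalize_selected_model(Some("")), None);
        assert_eq!(normalize_selected_model(Some("  ")), None);
        assert_eq!(normalize_selected_model(Some("auto")), None);
        assert_eq!(normalize_selected_model(Some(" Auto ")), None);
        assert_eq!(
            normalize_selected_model(Some("google/gemini-3-flash")),
            Some("google/gemini-3-flash".to_string())
        );
    }
}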

Behavior matrix

BYOK present | request.model                   | Model used
yes          | anything                        | user_ai_config.model (BYOK, unchanged)
no           | non-empty and not auto          | UI-selected model (new behavior)
no           | empty string, auto, or omitted  | complexity-based (tool path) or Lightweight (chitchat path) — unchanged from today

Verification

  • cargo build -p hero_agent -p hero_agent_server — success.
  • cargo test -p hero_agent_server — 8 passed, 0 failed.

Diff size: 5 files changed, 23 insertions(+), 9 deletions(-).

Notes

  • The normalization deliberately does not validate the provided model against the configured provider lists. The aibroker -> openrouter -> groq provider chain in LlmClient already handles unknown model names by failing over, so validating in the handler would couple the HTTP layer to LLM configuration for no additional safety.
  • New tracing log lines use the existing [MODEL] / [QUICK_RESPONSE_MODEL] prefixes so existing log dashboards continue to work.
rawan closed this issue 2026-04-23 15:13:16 +00:00