Add support for alternative transcription models #16

Closed
opened 2026-04-14 10:01:35 +00:00 by casper-stevens · 5 comments
Member

Context

Currently transcription is locked to a single model. Users working in different environments — with access to OpenRouter or a locally running Whisper instance — have no way to switch providers. Supporting multiple backends makes the tool usable in air-gapped or cost-sensitive setups.

Goals

  • Add a transcription model selector in the settings UI
  • Support OpenRouter as a transcription backend (configurable API key and model name)
  • Support a local Whisper model as a transcription backend (configurable endpoint URL)
  • Persist the selected backend and its configuration alongside other user settings
  • Fall back gracefully with a clear error message when the selected backend is unreachable
Author
Member

Implementation Spec for Issue #16

Objective

Extend hero_slides so users can select, configure, and persist the transcription backend used by voice.transcribe. Currently the server always calls AiClient::from_env() and hardcodes TranscriptionModel::WhisperLargeV3Turbo via the Groq provider. This spec adds:

  1. A user.settings.get / user.settings.save JSON-RPC pair backed by ~/.config/hero_slides/settings.json on the server.
  2. A transcription backend selector in the Settings UI (a new card in the Admin tab).
  3. Support for OpenRouter as a transcription backend (API key + model name).
  4. Support for a local Whisper backend (OpenAI-compatible endpoint URL).
  5. Graceful fallback with a clear error message when the selected backend is unreachable.

Requirements

  • The server persists user settings in ~/.config/hero_slides/settings.json (created on first save).
  • The settings object includes a transcription sub-object with:
    • backend: one of "groq" | "openrouter" | "local_whisper" (default "groq").
    • openrouter_api_key: optional string.
    • openrouter_model: optional string (defaults to "openai/whisper-1").
    • local_whisper_url: optional string (base URL, e.g. "http://localhost:9000").
  • Two new RPC methods: user.settings.get (returns settings object) and user.settings.save (partial update, returns {saved: true}).
  • voice.transcribe reads persisted settings at call time and constructs the correct AiClient.
  • If the selected backend is unreachable, the error message names the backend and gives actionable guidance.
  • The Admin tab gains a "Transcription Settings" card to read and write these settings.
  • No new external Rust crates required (dirs is already present).
  • openrpc.json is updated with the two new methods.
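For illustration, a `user.settings.save` call and its reply might look like this (parameter shape is a sketch; the authoritative schemas live in openrpc.json):

```json
{"jsonrpc": "2.0", "id": 7, "method": "user.settings.save",
 "params": {"transcription": {"backend": "local_whisper",
                              "local_whisper_url": "http://localhost:9000"}}}
```

with the server replying `{"jsonrpc": "2.0", "id": 7, "result": {"saved": true}}`, and `user.settings.get` returning the full settings object with defaults filled in.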

Files to Modify

| File | Action | Description |
|---|---|---|
| crates/hero_slides_lib/src/voice.rs | Modify | Add TranscriptionBackend enum; update voice_transcribe to accept optional backend config |
| crates/hero_slides_server/src/rpc.rs | Modify | Add settings structs, persistence helpers, two RPC handlers, update handle_voice_transcribe |
| crates/hero_slides_server/openrpc.json | Modify | Add user.settings.get and user.settings.save entries |
| crates/hero_slides_ui/templates/index.html | Modify | Add Transcription Settings card in Admin tab |
| crates/hero_slides_ui/static/js/dashboard.js | Modify | Add loadSettings(), saveTranscriptionSettings(), onTranscriptionBackendChange() |

Implementation Plan

Step 1: Add TranscriptionBackend enum to voice.rs

Files: crates/hero_slides_lib/src/voice.rs, crates/hero_slides_lib/src/lib.rs

  • Add pub enum TranscriptionBackend { Groq, OpenRouter { api_key, model }, LocalWhisper { base_url } }
  • Update voice_transcribe to accept backend: Option<TranscriptionBackend>
  • For Groq/None: keep current behaviour
  • For OpenRouter and LocalWhisper: build the HTTP multipart POST directly (bypass AiClient::transcribe_bytes) to avoid TranscriptionModel enum limitations
  • Re-export TranscriptionBackend from lib.rs
    Dependencies: none

Step 2: Add settings persistence and RPC handlers in rpc.rs

Files: crates/hero_slides_server/src/rpc.rs

  • Add UserSettings and TranscriptionSettings structs (serde)
  • Add settings_path(), load_user_settings(), save_user_settings() helpers
  • Add handle_user_settings_get() and handle_user_settings_save() async handlers
  • Wire into handle_request dispatch
    Dependencies: none

Step 3: Update handle_voice_transcribe to use settings

Files: crates/hero_slides_server/src/rpc.rs

  • Load settings at start of handler
  • Resolve TranscriptionBackend from settings
  • Pass backend into updated hero_slides_lib::voice_transcribe
  • Map errors to include backend label
    Dependencies: Steps 1 and 2

Step 4: Update openrpc.json

Files: crates/hero_slides_server/openrpc.json

  • Append user.settings.get and user.settings.save method entries with full schemas
    Dependencies: none

Step 5: Add Settings card to index.html

Files: crates/hero_slides_ui/templates/index.html

  • Add a "Transcription Settings" <div class="admin-section"> inside #tab-admin
  • Backend <select> with options: Groq, OpenRouter, Local Whisper
  • Conditional fields for OpenRouter (API key, model) and Local Whisper (URL)
  • Save button calling saveTranscriptionSettings()
    Dependencies: none

Step 6: Add JavaScript handlers in dashboard.js

Files: crates/hero_slides_ui/static/js/dashboard.js

  • loadSettings(): calls user.settings.get RPC, populates form fields
  • onTranscriptionBackendChange(): shows/hides conditional fields
  • saveTranscriptionSettings(): calls user.settings.save RPC, shows toast
  • Call loadSettings() on DOMContentLoaded
    Dependencies: Steps 2 and 5

Acceptance Criteria

  • user.settings.get returns { transcription: { backend, openrouter_api_key, openrouter_model, local_whisper_url } }
  • user.settings.save persists to ~/.config/hero_slides/settings.json and returns { saved: true }
  • Settings survive a server restart
  • Groq backend works identically to current implementation
  • OpenRouter backend successfully transcribes with a valid API key
  • Local Whisper backend works against a running local server
  • Unreachable backend returns a named, actionable error message
  • Admin tab shows the Transcription Settings card with conditional fields
  • Page load pre-populates form with persisted settings
  • openrpc.json includes both new method entries

Notes

  • TranscriptionModel enum in herolib_ai only has Groq-backed variants. For OpenRouter and local Whisper, build the multipart POST directly rather than going through AiClient::transcribe_bytes, which cannot express arbitrary model names.
  • The Mercury2 cleanup step (chat completion after transcription) is not affected — it continues to use AiClient::from_env().
  • The OpenRouter API key is stored in plaintext in settings.json — acceptable for a single-user local server, same as env vars.
  • dirs crate is already in both Cargo.toml files; no new dependency needed.
Author
Member

Implementation Spec for Issue #16 (revised)

Objective

Extend hero_slides so users can select, configure, and persist the transcription backend used by voice.transcribe. This spec adds:

  1. A user.settings.get / user.settings.save JSON-RPC pair backed by ~/.config/hero_slides/settings.json on the server.
  2. A transcription backend selector in the Settings UI (a new card in the Admin tab).
  3. Support for OpenRouter as a transcription backend (API key + model name).
  4. Support for any local model with an OpenAI-compatible /audio/transcriptions endpoint (URL + model name). This covers Whisper, Voxtral, and any other compatible server.
  5. Graceful fallback with a clear error message when the selected backend is unreachable.

Requirements

  • The server persists user settings in ~/.config/hero_slides/settings.json (created on first save).
  • The settings object includes a transcription sub-object with:
    • backend: one of "groq" | "openrouter" | "local_model" (default "groq").
    • openrouter_api_key: string (API key for OpenRouter).
    • openrouter_model: string (model ID, e.g. "openai/whisper-1"; default "openai/whisper-1").
    • local_model_url: string (base URL, e.g. "http://localhost:9000").
    • local_model_name: string (model name sent in the multipart request, e.g. "whisper-1", "voxtral-1"; default "whisper-1").
  • Two new RPC methods: user.settings.get and user.settings.save.
  • voice.transcribe reads persisted settings at call time and builds the correct HTTP request.
  • Unreachable backend returns an error naming the backend with actionable guidance.
  • The Admin tab gains a "Transcription Settings" card.
  • No new external Rust crates required (dirs is already present).
  • openrpc.json is updated with the two new methods.
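For reference, a populated settings.json under this scheme could look like the following (all values illustrative):

```json
{
  "transcription": {
    "backend": "local_model",
    "openrouter_api_key": "",
    "openrouter_model": "openai/whisper-1",
    "local_model_url": "http://localhost:9000",
    "local_model_name": "whisper-1"
  }
}
```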

Files to Modify

| File | Action | Description |
|---|---|---|
| crates/hero_slides_lib/src/voice.rs | Modify | Add TranscriptionBackend enum; update voice_transcribe to accept optional backend config |
| crates/hero_slides_server/src/rpc.rs | Modify | Add settings structs, persistence helpers, two RPC handlers, update handle_voice_transcribe |
| crates/hero_slides_server/openrpc.json | Modify | Add user.settings.get and user.settings.save entries |
| crates/hero_slides_ui/templates/index.html | Modify | Add Transcription Settings card in Admin tab |
| crates/hero_slides_ui/static/js/dashboard.js | Modify | Add loadSettings(), saveTranscriptionSettings(), onTranscriptionBackendChange() |

Implementation Plan

Step 1: Add TranscriptionBackend enum to voice.rs

Files: crates/hero_slides_lib/src/voice.rs, crates/hero_slides_lib/src/lib.rs

  • Add:
    pub enum TranscriptionBackend {
        Groq,
        OpenRouter { api_key: String, model: String },
        LocalModel { base_url: String, model_name: String },
    }
    
  • Update voice_transcribe to accept backend: Option<TranscriptionBackend>
  • For Groq/None: keep current behaviour (use AiClient::from_env())
  • For OpenRouter and LocalModel: build the multipart POST directly to avoid TranscriptionModel enum limitations — POST to {base_url}/audio/transcriptions with model field set to the user-supplied model name
  • Re-export TranscriptionBackend from lib.rs
    Dependencies: none
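A minimal sketch of the enum and the endpoint resolution it implies. The `transcription_url` helper is hypothetical, shown only to pin down the intended URL-joining behaviour:

```rust
/// Selected transcription backend, resolved from persisted settings.
pub enum TranscriptionBackend {
    /// Current behaviour: AiClient::from_env() with WhisperLargeV3Turbo.
    Groq,
    /// OpenRouter with a user-supplied API key and model ID.
    OpenRouter { api_key: String, model: String },
    /// Any OpenAI-compatible server reachable at `base_url`.
    LocalModel { base_url: String, model_name: String },
}

/// Hypothetical helper: join the base URL with the OpenAI-compatible
/// transcription path, tolerating a trailing slash in user input.
pub fn transcription_url(base_url: &str) -> String {
    format!("{}/audio/transcriptions", base_url.trim_end_matches('/'))
}
```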

Step 2: Add settings persistence and RPC handlers in rpc.rs

Files: crates/hero_slides_server/src/rpc.rs

  • Add TranscriptionSettings struct:
    pub struct TranscriptionSettings {
        pub backend: String,           // "groq" | "openrouter" | "local_model"
        pub openrouter_api_key: String,
        pub openrouter_model: String,  // default "openai/whisper-1"
        pub local_model_url: String,   // e.g. "http://localhost:9000"
        pub local_model_name: String,  // e.g. "whisper-1", "voxtral-1"
    }
    
  • Add UserSettings { transcription: TranscriptionSettings } struct
  • Add settings_path(), load_user_settings(), save_user_settings() helpers
  • Add handle_user_settings_get() and handle_user_settings_save() handlers
  • Wire into handle_request dispatch
    Dependencies: none
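A sketch of the path helper, using a std-only $HOME fallback for illustration; the real implementation would use the dirs crate (already a dependency) and serde for (de)serialization:

```rust
use std::fs;
use std::path::PathBuf;

/// Location of the settings file; falls back to "." if HOME is unset.
/// (Illustrative only — the implementation would use dirs::config_dir().)
fn settings_path() -> PathBuf {
    let home = std::env::var("HOME").unwrap_or_else(|_| ".".to_string());
    PathBuf::from(home).join(".config/hero_slides/settings.json")
}

/// Load the raw settings JSON, returning None when the file does not
/// exist yet so callers can fall back to defaults (backend = "groq").
fn load_raw_settings() -> Option<String> {
    fs::read_to_string(settings_path()).ok()
}
```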

Step 3: Update handle_voice_transcribe to use settings

Files: crates/hero_slides_server/src/rpc.rs

  • Load settings at start of handler
  • Resolve TranscriptionBackend from settings (backend field)
  • Pass into updated hero_slides_lib::voice_transcribe
  • Map errors to include backend label
    Dependencies: Steps 1 and 2
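The error mapping can be as simple as wrapping the underlying transport error with the backend's display name and a recovery hint; a sketch with hypothetical names:

```rust
/// Human-readable backend label used in error messages.
fn backend_label(backend: &str) -> &str {
    match backend {
        "openrouter" => "OpenRouter",
        "local_model" => "Local Model",
        _ => "Groq",
    }
}

/// Wrap a transport error with the backend name and actionable guidance.
fn unreachable_error(backend: &str, detail: &str) -> String {
    format!(
        "{} backend unreachable: {}. Check the configured URL/API key in \
         Transcription Settings, or switch back to Groq.",
        backend_label(backend),
        detail
    )
}
```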

Step 4: Update openrpc.json

Files: crates/hero_slides_server/openrpc.json

  • Append user.settings.get and user.settings.save method entries including local_model_url and local_model_name fields
    Dependencies: none

Step 5: Add Settings card to index.html

Files: crates/hero_slides_ui/templates/index.html

  • Add "Transcription Settings" <div class="admin-section"> inside #tab-admin
  • Backend <select>: Groq (default), OpenRouter, Local Model (OpenAI-compatible)
  • OpenRouter fields: API key input, model input (placeholder: openai/whisper-1)
  • Local Model fields: URL input (placeholder: http://localhost:9000), model name input (placeholder: whisper-1, voxtral-1, ...)
  • Save button calling saveTranscriptionSettings()
    Dependencies: none

Step 6: Add JavaScript handlers in dashboard.js

Files: crates/hero_slides_ui/static/js/dashboard.js

  • loadSettings(): calls user.settings.get, populates all form fields
  • onTranscriptionBackendChange(): shows/hides OpenRouter or Local Model fields
  • saveTranscriptionSettings(): calls user.settings.save with all fields including local_model_url and local_model_name
  • Call loadSettings() on DOMContentLoaded
    Dependencies: Steps 2 and 5

Acceptance Criteria

  • user.settings.get returns { transcription: { backend, openrouter_api_key, openrouter_model, local_model_url, local_model_name } }
  • user.settings.save persists to ~/.config/hero_slides/settings.json and returns { saved: true }
  • Settings survive a server restart
  • Groq backend works identically to current implementation
  • OpenRouter backend transcribes successfully with a valid API key and model
  • Local Model backend works against any OpenAI-compatible transcription server (tested with Whisper and/or Voxtral)
  • Unreachable backend returns a named, actionable error
  • Admin tab shows the Transcription Settings card with correct conditional fields
  • Local Model fields show URL and model name inputs with descriptive placeholders
  • Page load pre-populates form with persisted settings
  • openrpc.json includes both new method entries with local_model_url and local_model_name

Notes

  • The local_model_name field is sent as the model parameter in the multipart POST to /audio/transcriptions. Any OpenAI-compatible server (Whisper.cpp, Voxtral, Faster-Whisper, etc.) reads this field — setting it correctly is the user's responsibility.
  • TranscriptionModel enum in herolib_ai only covers Groq variants. OpenRouter and local model backends bypass AiClient::transcribe_bytes and build the multipart POST directly.
  • Mercury2 cleanup (chat completion after transcription) is not affected by this change.
  • The OpenRouter API key is stored in plaintext — acceptable for a single-user local server.
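To make the `model` field in the first note concrete, here is a std-only sketch of the multipart/form-data body the POST would carry. Production code would use reqwest's multipart support; the boundary and field layout follow RFC 7578:

```rust
/// Assemble a minimal multipart/form-data body with a `model` field and a
/// `file` part, as an OpenAI-compatible /audio/transcriptions endpoint
/// expects. Illustration only — real code would use reqwest::multipart.
fn multipart_body(boundary: &str, model: &str, audio: &[u8]) -> Vec<u8> {
    let mut body = Vec::new();
    // Text field carrying the user-configured model name.
    body.extend_from_slice(
        format!(
            "--{b}\r\nContent-Disposition: form-data; name=\"model\"\r\n\r\n{m}\r\n",
            b = boundary,
            m = model
        )
        .as_bytes(),
    );
    // File part with the recorded audio bytes.
    body.extend_from_slice(
        format!(
            "--{b}\r\nContent-Disposition: form-data; name=\"file\"; \
             filename=\"audio.wav\"\r\nContent-Type: audio/wav\r\n\r\n",
            b = boundary
        )
        .as_bytes(),
    );
    body.extend_from_slice(audio);
    body.extend_from_slice(format!("\r\n--{b}--\r\n", b = boundary).as_bytes());
    body
}
```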
Author
Member

Test Results

  • Result: PASS
  • Total: 75
  • Passed: 74
  • Failed: 0
  • Ignored: 1 (test_generate_single_slide_ai - skipped, requires AI service)

Breakdown by crate

| Crate | Tests | Passed | Ignored |
|---|---|---|---|
| hero_slides (bin) | 0 | 0 | 0 |
| hero_slides_lib (unit) | 47 | 47 | 0 |
| hero_slides_lib (integration) | 13 | 12 | 1 |
| hero_slides_rhai | 0 | 0 | 0 |
| hero_slides_sdk | 0 | 0 | 0 |
| hero_slides_server | 0 | 0 | 0 |
| hero_slides_ui | 0 | 0 | 0 |
| doc-tests hero_slides_lib | 2 | 2 | 0 |
| doc-tests hero_slides_rhai | 1 | 1 | 0 |
| doc-tests hero_slides_sdk | 0 | 0 | 0 |

All tests passed. Build completed in 19.01s. Two warnings (unused imports/variables) noted but no errors.

Author
Member

Implementation Summary

All changes have been implemented across 5 files.

Changes Made

crates/hero_slides_lib/src/voice.rs

  • Added TranscriptionBackend enum with three variants: Groq, OpenRouter { api_key, model }, and LocalModel { base_url, model_name }
  • Updated voice_transcribe to accept backend: Option<TranscriptionBackend> as a fourth parameter
  • None / Groq: unchanged behaviour (uses AiClient::from_env() + WhisperLargeV3Turbo)
  • OpenRouter and LocalModel: build multipart POST directly to /audio/transcriptions endpoint, bypassing AiClient::transcribe_bytes to allow arbitrary model names (e.g. voxtral-1)
  • Error messages include the backend label for actionable diagnostics

crates/hero_slides_lib/src/lib.rs

  • Re-exported TranscriptionBackend

crates/hero_slides_server/src/rpc.rs

  • Added UserSettings and TranscriptionSettings structs with serde derive and defaults
  • Added settings_path(), load_user_settings(), save_user_settings() helpers (backed by ~/.config/hero_slides/settings.json)
  • Added handle_user_settings_get() and handle_user_settings_save() async handlers
  • Wired both handlers into the handle_request dispatch
  • Updated handle_voice_transcribe to load settings, resolve the correct backend, and pass it to voice_transcribe

crates/hero_slides_server/openrpc.json

  • Added user.settings.get and user.settings.save method entries with full schemas including local_model_url and local_model_name fields

crates/hero_slides_ui/templates/index.html

  • Added "Transcription Settings" admin section in the Admin tab
  • Backend selector: Groq (default), OpenRouter, Local Model (OpenAI-compatible)
  • Conditional fields: OpenRouter shows API key + model inputs; Local Model shows endpoint URL + model name inputs with descriptive placeholders

crates/hero_slides_ui/static/js/dashboard.js

  • Added loadSettings(): populates settings form from user.settings.get on page load
  • Added onTranscriptionBackendChange(): shows/hides conditional fields based on selected backend
  • Added saveTranscriptionSettings(): calls user.settings.save and shows success/error toast

Test Results

  • Total: 75
  • Passed: 74
  • Failed: 0
  • Ignored: 1 (requires live AI service)
Author
Member

Pull request opened: #18 (https://forge.ourworld.tf/lhumina_code/hero_slides/pulls/18)