AI Assistant: Progressive SSE Streaming (word-by-word response rendering) #32

Open
opened 2026-03-18 01:50:28 +00:00 by mik-tf · 0 comments
Owner

Current State

The AI assistant (Shrimp) returns responses via Server-Sent Events (SSE). The current implementation reads the stream incrementally and waits for the `event: done` event before displaying the response.

What works now:

  • Stream is read chunk-by-chunk (not blocked on full body)
  • Response appears as soon as the LLM finishes (when the `done` event arrives)
  • Abort/cancel works via AbortSignal
  • No more infinite spin on slow responses

What's missing:

  • Responses appear all at once after the LLM finishes thinking
  • No visual feedback during generation (just a spinner)
  • Multi-step agent tasks show nothing until all steps complete

The Enhancement

Show the AI response progressively as it's generated, word by word — like ChatGPT, Claude web, etc.

Shrimp already sends intermediate SSE events during generation:

  • `event: token` — partial content as the LLM generates tokens
  • `event: tool_call` — when the agent uses a tool
  • `event: tool_result` — tool execution result
  • `event: done` — final complete response

We currently ignore `token` events and only process `done`. Progressive streaming would render `token` events in real time.
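The event set above could be modeled as a small typed enum on the client side. This is a sketch: the payload shapes (plain strings, a `name` field for tool calls) are assumptions for illustration, not Shrimp's actual wire format.

```rust
/// Sketch of the SSE events Shrimp emits during generation; the
/// payload shapes (plain strings) are assumptions for illustration.
#[derive(Debug, PartialEq)]
pub enum ShrimpEvent {
    /// Partial content as the LLM generates tokens.
    Token(String),
    /// The agent invoked a tool (hypothetical `name` field).
    ToolCall { name: String },
    /// Result of a tool execution.
    ToolResult(String),
    /// Final complete response.
    Done(String),
}

/// Map an SSE event name plus its data payload to a typed event.
pub fn classify(event: &str, data: &str) -> Option<ShrimpEvent> {
    match event {
        "token" => Some(ShrimpEvent::Token(data.to_string())),
        "tool_call" => Some(ShrimpEvent::ToolCall { name: data.to_string() }),
        "tool_result" => Some(ShrimpEvent::ToolResult(data.to_string())),
        "done" => Some(ShrimpEvent::Done(data.to_string())),
        _ => None, // unknown events (e.g. heartbeats) are ignored
    }
}
```

Returning `None` for unknown event names keeps the reader loop forward-compatible if Shrimp adds event types later.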

Implementation

1. Service layer (ai_service.rs)

Change `send_message` to accept a callback for streaming updates:

```rust
pub async fn send_message_streaming(
    shrimp_url: &str,
    user_message: &str,
    conversation_id: Option<String>,
    on_token: impl Fn(&str),  // Called for each token chunk
    abort_signal: Option<web_sys::AbortSignal>,
) -> Result<String, String>
```

Or return a `Stream` that yields partial updates.
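To make the callback contract concrete, here is a simplified, synchronous stand-in for the proposed function (the real one is async and feeds tokens from the SSE response body; `send_message_streaming_stub` and its token-slice input are purely illustrative):

```rust
/// Simplified, synchronous stand-in for `send_message_streaming`,
/// shown only to illustrate the callback contract; the real function
/// is async and reads token chunks from the SSE response body.
fn send_message_streaming_stub(
    tokens: &[&str],
    on_token: impl Fn(&str),
) -> Result<String, String> {
    let mut full = String::new();
    for chunk in tokens {
        on_token(chunk); // the UI re-renders the streaming bubble here
        full.push_str(chunk);
    }
    Ok(full) // plays the role of the `done` event's final content
}
```

The caller passes a closure that appends each chunk to the in-progress message, and still receives the complete text at the end, so the existing `done`-based code path keeps working.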

2. UI component (island.rs / chat view)

Update the message state model:

  • Current: `Pending` → `Complete`
  • New: `Pending` → `Streaming(partial_content)` → `Complete(full_content)`

The chat bubble renders the `Streaming` state with a blinking cursor and growing text.
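One way to express this state model (a sketch; the variant names follow the bullets above, the two methods are hypothetical helpers):

```rust
/// Message lifecycle for the chat view; `Streaming` carries the
/// partial content accumulated from `token` events so far.
#[derive(Debug, PartialEq)]
pub enum MessageState {
    Pending,
    Streaming(String),
    Complete(String),
}

impl MessageState {
    /// Append a token chunk, promoting `Pending` to `Streaming`.
    pub fn push_token(&mut self, chunk: &str) {
        match self {
            MessageState::Pending => {
                *self = MessageState::Streaming(chunk.to_string());
            }
            MessageState::Streaming(partial) => partial.push_str(chunk),
            MessageState::Complete(_) => {} // ignore tokens after `done`
        }
    }

    /// Text the chat bubble should render right now.
    pub fn display_text(&self) -> &str {
        match self {
            MessageState::Pending => "",
            MessageState::Streaming(s) | MessageState::Complete(s) => s,
        }
    }
}
```

The view layer can then match on the state to decide whether to show the spinner (`Pending`), the blinking cursor (`Streaming`), or the finished bubble (`Complete`).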

3. Token event parsing

Process `event: token` in the SSE reader loop:

```rust
if collected_since_last.contains("event: token\n") {
    // Extract data, call on_token callback
    // UI updates the streaming message bubble
}
```
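Substring matching as sketched above is fragile (a chunk boundary can split `event: token` in half). A more robust approach is to buffer until a blank-line frame separator and parse each complete frame's `event:` and `data:` fields; a std-only sketch (`parse_sse_frame` is a hypothetical helper, not existing code):

```rust
/// Parse one SSE frame (the text between blank-line separators) into
/// its event name and data payload. Multi-line `data:` fields are
/// joined with newlines, matching the SSE wire format.
pub fn parse_sse_frame(frame: &str) -> (Option<String>, String) {
    let mut event = None;
    let mut data_lines = Vec::new();
    for line in frame.lines() {
        if let Some(rest) = line.strip_prefix("event:") {
            // The SSE format allows one optional space after the colon.
            event = Some(rest.strip_prefix(' ').unwrap_or(rest).to_string());
        } else if let Some(rest) = line.strip_prefix("data:") {
            data_lines.push(rest.strip_prefix(' ').unwrap_or(rest));
        }
    }
    (event, data_lines.join("\n"))
}
```

The reader loop would split the accumulated buffer on `\n\n`, feed each complete frame through this parser, and dispatch on the event name (`token` → `on_token` callback, `done` → finalize).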

4. Agent tool use visualization (stretch goal)

When Shrimp uses tools (web search, file operations, etc.), show status:

  • "Searching the web..."
  • "Reading documentation..."
  • "Executing code..."

This requires parsing `event: tool_call` and `event: tool_result` events.
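The status lines above suggest a simple mapping from tool name to label. The tool names matched here are hypothetical guesses; the real identifiers depend on Shrimp's tool registry:

```rust
/// Map a tool name from an `event: tool_call` payload to a
/// user-facing status line. The tool names matched here are
/// illustrative; real names depend on Shrimp's tool registry.
pub fn tool_status_label(tool: &str) -> String {
    match tool {
        "web_search" => "Searching the web...".to_string(),
        "read_docs" => "Reading documentation...".to_string(),
        "execute_code" => "Executing code...".to_string(),
        // Generic fallback keeps unknown tools visible to the user.
        other => format!("Using {}...", other),
    }
}
```

A generic fallback means new tools still produce some feedback without a client update.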

Files

| File | Change |
|------|--------|
| `hero_archipelagos/archipelagos/intelligence/ai/src/services/ai_service.rs` | Streaming API + token event parsing |
| `hero_archipelagos/archipelagos/intelligence/ai/src/island.rs` | Streaming message state + UI rendering |
| `hero_archipelagos/archipelagos/intelligence/ai/src/views/` | Chat bubble streaming animation |

Priority

Medium — the current fix prevents infinite spin. This enhancement improves UX but is not blocking.
