mik-tf ee3431be84 hero_proc#102 D-10 sweep: hero_researcher canonical service.toml + service_base!() + herolib_ai v0.6.0 migration + web.sock→admin.sock rename
D-10 closure on hero_researcher, paired with the first AI-rename ripple
absorb on a non-dead-code consumer (s117 hero_webbuilder was the first
ripple, but on a JobManager with 0 callers). Reference playbook:
memory/investigation_herolib_ai_migration.md.

Wholesale shape (s109/s111/s115 template, 0 service.toml on disk pre-sweep):

* Wrote 2 service.toml (crates/hero_researcher{,_server}/service.toml) both
  shipping [[env]] PATH_ROOT default="~/hero" per s107 lesson #17. Canonical
  2-binary inventory: cli + server (single-binary daemon binding two sockets
  rpc.sock + admin.sock — no separate _admin crate).

* Wired service_base!() + validate_service_toml + handle_info_flag triad on
  both main.rs files; server also calls print_startup_banner +
  prepare_sockets. Deleted ~250 LOC of hand-rolled print_startup_info /
  print_info_json / print_help blocks.

* Applied lesson #19 (s109+s110+s111+s112+s115): CLI build_service_definition
  threads PATH_ROOT/PATH_VAR/PATH_BUILD/PATH_CODE/HERO_SOCKET_DIR via
  forward_env_if_set into the spawned hero_proc_sdk ActionSpec. Replaced
  hand-rolled HERO_SOCKET_DIR/HOME fallback with herolib_core::base::
  resolve_socket_path() in both CLI and server.

* Paired web.sock → admin.sock rename (s112 hero_wallet + s115 hero_matrixchat
  precedent) in the same squash: 9 sites across CLI (3) + server (6) flipped.
  Server still binds two sockets in one process; the rename is socket-name-
  only, no crate split. Zero external consumers of the old web.sock name.

herolib_ai v0.6.0 migration (Lesson #21 4th shape — full API rewrite, per
playbook landed s117):

* Cargo.toml workspace deps: herolib_ai 0.5.0→0.6.0 + explicit
  features=["hero-proc"] (default-features=false would disable the
  HeroProcResolver); herolib_core 0.5.0→0.6.0; hero_proc_sdk 0.5.0→0.6.0.
  Cascade absorbed (cargo update).

* hero_researcher_lib: migrated 3 files (error.rs, job.rs, researcher.rs)
  from old broker-routed AiClient to new direct-HTTPS Arc<Provider>:
  - AiError → CompletionError
  - AiClient field → Arc<Provider> field
  - AiClient::default_socket().await → Arc::new(builtin::openrouter_provider())
  - ChatRequest/ChatCompletionResponse/Message types swapped to new
    completions::request/response/Message
  - chat_with(req).await → completions().model(...).message(...).send()
  - response.content().unwrap_or("") → response.text (5 sites)
  - response.usage.prompt_tokens/completion_tokens → input_tokens/output_tokens
    with Option<u64> unwrap (TokenUsage field rename + signedness change)
  - Async tokio::runtime::Handle::try_current() block_on dance removed
    entirely (new API is sync)
  - ResearchJob.ai_client(client) builder renamed to .provider(provider)
    (only doc-string consumer in workspace; no caller breakage)

* 10 examples that still import AiClient gated behind feature
  legacy_ai_examples (never enabled by default). Per-example [[example]]
  blocks with required-features="legacy_ai_examples" in
  hero_researcher_lib/Cargo.toml. Cleanest path while keeping
  cargo test --workspace green; per-example port is a future cleanup.

Dep audit:

* hero_researcher_lib: -anyhow -uuid -tokio (all zero-match after migration —
  old chat_sync used tokio::runtime, AiClient implied uuid + anyhow).
* hero_researcher_server: -tracing-subscriber (lib's init_logging owns the
  subscriber wiring; server doesn't import directly). +herolib_core (for
  service_base!() + resolve_socket_path).
* hero_researcher: +herolib_core (for service_base!() + forward_env_if_set
  + resolve_socket_path).

D-10 5/5 met:

1. service.toml present                       ✓ (2 files, both with
                                                  [[env]] PATH_ROOT)
2. service_base!() + canonical helpers        ✓ (3 main.rs)
3. lab infocheck clean                        ✓ 2 crates clean, 0 findings
4. lab service smoke 6/6                      ✓ rpc.sock 4/4 (health,
                                                  openrpc.json, well-known,
                                                  system.ping) + admin.sock
                                                  2/2 (health, well-known)
5. cargo test --workspace --release           ✓ 90/0/0 across 5 test
                                                  binaries + 2 doctests

PATH_ROOT verified in /proc/$pid/environ for the lab-spawned daemon (server
binds rpc.sock + admin.sock from one process).
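
A check like the one above can be sketched with the standard library alone. This is a minimal illustration, not the tooling used for the sweep: find_env_var is a hypothetical helper, and it relies only on the Linux convention that /proc/<pid>/environ holds NUL-separated KEY=VALUE entries.

```rust
use std::fs;

/// Look up `name` in the raw contents of a /proc/<pid>/environ file.
/// Entries are NUL-separated `KEY=VALUE` pairs; non-UTF-8 entries are skipped.
/// (Hypothetical helper for illustration only.)
fn find_env_var(environ: &[u8], name: &str) -> Option<String> {
    environ
        .split(|&byte| byte == 0)
        .filter_map(|entry| std::str::from_utf8(entry).ok())
        .find_map(|entry| {
            let (key, value) = entry.split_once('=')?;
            (key == name).then(|| value.to_string())
        })
}

fn main() {
    // Pass the daemon's pid as the first argument; "self" inspects this process.
    let pid = std::env::args().nth(1).unwrap_or_else(|| "self".to_string());
    match fs::read(format!("/proc/{pid}/environ")) {
        Ok(bytes) => match find_env_var(&bytes, "PATH_ROOT") {
            Some(value) => println!("PATH_ROOT={value}"),
            None => println!("PATH_ROOT is not set for pid {pid}"),
        },
        Err(err) => eprintln!("cannot read environ for pid {pid}: {err}"),
    }
}
```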

Latent (out of D-10 scope, filed for follow-up):

* 10 examples broken behind legacy_ai_examples gate; per-example port to
  new Arc<Provider> API needed (lift gate, fix imports, re-enable).
* 1 integration test (tests/integration.rs) — left untouched, may need
  similar gating; check on next hero_researcher session.
* hero_researcher_server stop returns "stop timeout (30s)" but daemon DOES
  exit cleanly + sockets ARE removed (same OServer pattern as s110-s115).
* `has_ai()` placeholder pattern from playbook deferred — full hero_proc
  secret lookup at construction time is more honest than `true`.
* HeroContext fields context_id/claims/forwarded_prefix unused
  (pre-existing dead-code warning, not D-10 scope).

Tracker: hero_proc#102#33220 PATCHed; #105 hero_researcher ticked.
Reference playbook: memory/investigation_herolib_ai_migration.md (s117).
2026-05-18 19:19:49 -04:00
.hero chore: add hero_builder artifacts and fix dependency version specifiers 2026-05-10 14:23:44 +02:00
crates hero_proc#102 D-10 sweep: hero_researcher canonical service.toml + service_base!() + herolib_ai v0.6.0 migration + web.sock→admin.sock rename 2026-05-18 19:19:49 -04:00
.env.example Initial version 2026-02-16 12:44:13 +02:00
.gitignore feat: add 13 runnable examples, fix AiClient init, improve error messages 2026-03-20 20:50:30 +01:00
Cargo.lock hero_proc#102 D-10 sweep: hero_researcher canonical service.toml + service_base!() + herolib_ai v0.6.0 migration + web.sock→admin.sock rename 2026-05-18 19:19:49 -04:00
Cargo.toml hero_proc#102 D-10 sweep: hero_researcher canonical service.toml + service_base!() + herolib_ai v0.6.0 migration + web.sock→admin.sock rename 2026-05-18 19:19:49 -04:00
Cargo.toml.hero_builder_backup chore: add hero_builder artifacts and fix dependency version specifiers 2026-05-10 14:23:44 +02:00
docker-compose.yml rebranding to heroresearcher 2026-02-16 14:18:58 +02:00
PURPOSE.md fix: logging compliance, socket naming, add PURPOSE.md 2026-05-07 12:39:28 +02:00
README.md fix: remove Makefile and shell scripts, clean up README 2026-05-07 22:54:05 +02:00

Hero Researcher

AI-powered research assistant using multiple LLM models for comprehensive background research on people. Built in Rust with the herolib_ai client for multi-provider support.

Features

  • Multi-model queries - Query multiple AI models simultaneously for comprehensive results
  • Grounded research - Web search + scraping + LLM analysis with evidence-based reporting
  • 6 search providers - Brave, DuckDuckGo, SearXNG, Exa, Serper, SerpAPI
  • 8 platform scrapers - GitHub, LinkedIn, Twitter, Reddit, StackOverflow, Crunchbase, Medium, Facebook
  • 3 research tiers - Quick (4 models), Standard (8 models), Deep (18 models x 4 strategies)
  • Online Mode - Real-time web search for up-to-date information
  • Multiple output formats - Markdown, JSON, HTML reports
  • Web UI - Browser-based interface with job persistence
  • Audit logging - Full operation tracking in JSONL format

Installation

Use the Nushell service script to build and install:

service researcher start --update --reset

Quick Start

# Set your API key (at minimum, one provider)
export OPENROUTER_API_KEY=your_key_here

# Basic search
hero_researcher "John Doe" --location "San Francisco"

# Deep investigation (18 models x 4 strategies)
hero_researcher "John Doe" --beast

# Save to file
hero_researcher "John Doe" --output report.md

# With aliases and context
hero_researcher "John Doe" --aliases "J. Doe, Johnny" --context "software engineer at Acme Corp"

CLI Options

| Option | Description |
| --- | --- |
| NAME | Name of the person to research (required) |
| --location | Known location to narrow search |
| --aliases | Comma-separated list of known aliases |
| --context | Additional identifying context |
| --models | Comma-separated list of custom model IDs |
| --tier | Research tier: quick, standard, deep (default: standard) |
| --beast | Alias for --tier deep |
| --format | Output format: markdown, json, html (default: markdown) |
| --concurrency | Parallel tasks 1-20 (default: 4) |
| --online | Enable real-time web search (:online suffix) |
| --verbose | Show detailed progress |
| --output | Save report to file |
| --urls | Seed URLs to scrape first (comma-separated) |
| --audit | Write audit log to JSONL file |
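
The comma-separated options (--aliases, --models, --urls) and the 1-20 range for --concurrency could be handled roughly as follows. This is a sketch under assumptions: split_list and check_concurrency are hypothetical helpers, and whether the real CLI trims whitespace or rejects out-of-range values (rather than clamping them) is not stated here.

```rust
/// Split a comma-separated option value into trimmed, non-empty items.
/// (Hypothetical helper; the CLI's actual parsing may differ.)
fn split_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(String::from)
        .collect()
}

/// Accept a concurrency value only if it lies in the documented 1-20 range.
/// (Assumption: out-of-range values are rejected instead of clamped.)
fn check_concurrency(value: u32) -> Result<u32, String> {
    if (1..=20).contains(&value) {
        Ok(value)
    } else {
        Err(format!("--concurrency must be between 1 and 20, got {value}"))
    }
}

fn main() {
    println!("{:?}", split_list("J. Doe, Johnny"));
    println!("{:?}", check_concurrency(4));
    println!("{:?}", check_concurrency(25));
}
```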

Hero OS Service (Nushell)

All lifecycle management is done via the Nushell service script:

# Install, register, and start via hero_proc
service researcher start --update --reset

# Stop gracefully
service researcher stop

# Check status
service researcher status

Sockets (under $HERO_SOCKET_DIR/hero_researcher/):

  • rpc.sock — JSON-RPC 2.0 API
  • admin.sock — Browser admin dashboard (named web.sock before the web.sock → admin.sock rename)
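
The socket layout can be illustrated with a small standard-library sketch. It is not the project's resolver: the commit message above says the binaries use herolib_core::base::resolve_socket_path(), whose signature is not shown here, so socket_path below is a hypothetical stand-in that only joins the documented path components. The admin socket name follows the web.sock → admin.sock rename described in that commit message.

```rust
use std::path::{Path, PathBuf};

/// Service directory name under $HERO_SOCKET_DIR, as documented above.
const SERVICE_NAME: &str = "hero_researcher";

/// Build `<socket_dir>/hero_researcher/<socket_name>`.
/// (Hypothetical helper; not the herolib_core implementation.)
fn socket_path(socket_dir: &Path, socket_name: &str) -> PathBuf {
    socket_dir.join(SERVICE_NAME).join(socket_name)
}

fn main() {
    // No fallback is guessed here: if the variable is unset we only report it.
    match std::env::var_os("HERO_SOCKET_DIR") {
        Some(dir) => {
            let dir = PathBuf::from(dir);
            for name in ["rpc.sock", "admin.sock"] {
                println!("{}", socket_path(&dir, name).display());
            }
        }
        None => eprintln!("HERO_SOCKET_DIR is not set"),
    }
}
```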

Web Server

# Start the server directly (binds rpc.sock and admin.sock)
hero_researcher_server

API Endpoints:

  • GET / - Web UI
  • GET /api/tiers - Available research tiers
  • GET /api/jobs - List recent jobs
  • GET /api/jobs/:id - Get job details
  • POST /api/research - Start a research job
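
Because the server binds Unix sockets rather than a TCP port, a plain HTTP request has to be sent over the socket. The sketch below shows one way to do that with the standard library only. It assumes the endpoints speak ordinary HTTP/1.1 on one of the server's sockets; which socket serves /api/* is not stated here, so the socket path is taken from the command line. build_get_request and http_get are hypothetical helpers.

```rust
use std::io::{Read, Write};
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Format a minimal HTTP/1.1 GET request for `path`.
/// `Connection: close` lets us read until the server closes the stream.
fn build_get_request(path: &str) -> String {
    format!("GET {path} HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
}

/// Send a GET request over a Unix socket and return the raw response
/// (status line, headers, and body) as text.
fn http_get(socket: &Path, path: &str) -> std::io::Result<String> {
    let mut stream = UnixStream::connect(socket)?;
    stream.write_all(build_get_request(path).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}

fn main() {
    // Usage: <program> <path-to-socket>
    let Some(socket) = std::env::args().nth(1) else {
        eprintln!("usage: <program> <path-to-socket>");
        return;
    };
    match http_get(Path::new(&socket), "/api/tiers") {
        Ok(response) => println!("{response}"),
        Err(err) => eprintln!("request failed: {err}"),
    }
}
```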

Research Tiers

| Tier | Models | Strategies | Search Queries | Scrape Pages | Synthesis |
| --- | --- | --- | --- | --- | --- |
| Quick | 4 | basic | ~2 | 5 | No |
| Standard | 8 | basic | ~8 | 15 | Yes |
| Deep | 18 | basic, deep, social, professional | ~20 | 30 | Yes |
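
The tier table can be read as a small lookup structure. The sketch below encodes the values from the table; Tier and TierPlan are hypothetical names for illustration (not the library's ResearchTier type), and the "model calls" figure assumes every model runs once per strategy, which is how "18 models x 4 strategies" reads.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Quick,
    Standard,
    Deep,
}

/// Per-tier parameters, copied from the table above.
#[derive(Debug)]
struct TierPlan {
    models: usize,
    strategies: &'static [&'static str],
    search_queries: usize, // approximate ("~") in the table
    scrape_pages: usize,
    synthesis: bool,
}

impl Tier {
    /// Parse the value accepted by `--tier`.
    fn parse(name: &str) -> Option<Tier> {
        match name {
            "quick" => Some(Tier::Quick),
            "standard" => Some(Tier::Standard),
            "deep" => Some(Tier::Deep),
            _ => None,
        }
    }

    fn plan(self) -> TierPlan {
        match self {
            Tier::Quick => TierPlan {
                models: 4,
                strategies: &["basic"],
                search_queries: 2,
                scrape_pages: 5,
                synthesis: false,
            },
            Tier::Standard => TierPlan {
                models: 8,
                strategies: &["basic"],
                search_queries: 8,
                scrape_pages: 15,
                synthesis: true,
            },
            Tier::Deep => TierPlan {
                models: 18,
                strategies: &["basic", "deep", "social", "professional"],
                search_queries: 20,
                scrape_pages: 30,
                synthesis: true,
            },
        }
    }
}

impl TierPlan {
    /// Assumption: each model is queried once per strategy.
    fn model_calls(&self) -> usize {
        self.models * self.strategies.len()
    }
}

fn main() {
    for name in ["quick", "standard", "deep"] {
        let tier = Tier::parse(name).expect("known tier name");
        let plan = tier.plan();
        println!(
            "{tier:?}: {} model calls, ~{} queries, {} pages, synthesis={}",
            plan.model_calls(),
            plan.search_queries,
            plan.scrape_pages,
            plan.synthesis
        );
    }
}
```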

Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| OPENROUTER_API_KEY | OpenRouter API key | Yes (or other provider) |
| GROQ_API_KEY | Groq API key | Optional |
| SAMBANOVA_API_KEY | SambaNova API key | Optional |
| DEEPINFRA_API_KEY | DeepInfra API key | Optional |
| BRAVE_API_KEYS | Brave Search keys (comma-separated) | Optional |
| SEARXNG_URL | SearXNG instance URL | Optional |
| EXA_API_KEYS | Exa Search keys | Optional |
| SERPER_API_KEYS | Serper keys | Optional |
| SERPAPI_API_KEYS | SerpAPI keys | Optional |
| MODELS | Override default model list | Optional |
| DB_PATH | SQLite database path (default: hero.db) | Optional |
| LOG_LEVEL | Log level (default: info) | Optional |
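
The defaults and the "at least one LLM provider key" requirement can be sketched as below. This is an illustration, not the library's Config::from_env: EnvSettings and load_settings are hypothetical, and treating any of the four LLM key variables as satisfying "Yes (or other provider)" is an interpretation of the table. The lookup is passed in as a closure so the logic can be exercised without touching the process environment.

```rust
use std::env;

/// LLM provider key variables listed in the table above.
const LLM_KEY_VARS: [&str; 4] = [
    "OPENROUTER_API_KEY",
    "GROQ_API_KEY",
    "SAMBANOVA_API_KEY",
    "DEEPINFRA_API_KEY",
];

#[derive(Debug, PartialEq)]
struct EnvSettings {
    /// Names of the LLM key variables that are set and non-empty.
    llm_key_vars: Vec<&'static str>,
    db_path: String,
    log_level: String,
}

/// Resolve settings through `lookup`, applying the documented defaults.
/// (Hypothetical helper; the real configuration code may differ.)
fn load_settings(lookup: impl Fn(&str) -> Option<String>) -> Result<EnvSettings, String> {
    let llm_key_vars: Vec<&'static str> = LLM_KEY_VARS
        .iter()
        .copied()
        .filter(|&name| lookup(name).is_some_and(|value| !value.trim().is_empty()))
        .collect();
    if llm_key_vars.is_empty() {
        return Err(format!("set at least one of: {}", LLM_KEY_VARS.join(", ")));
    }
    Ok(EnvSettings {
        llm_key_vars,
        db_path: lookup("DB_PATH").unwrap_or_else(|| "hero.db".to_string()),
        log_level: lookup("LOG_LEVEL").unwrap_or_else(|| "info".to_string()),
    })
}

fn main() {
    match load_settings(|name| env::var(name).ok()) {
        Ok(settings) => println!("{settings:?}"),
        Err(err) => eprintln!("configuration error: {err}"),
    }
}
```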

Project Structure

hero_researcher/
  Cargo.toml              # Workspace root
  crates/
    hero_researcher_lib/  # Core library
      src/
        lib.rs            # Re-exports
        types.rs          # Person, ResearchResult, etc.
        error.rs          # Error types
        config.rs         # Configuration
        researcher.rs     # Main research pipeline
        prompts.rs        # LLM prompt templates
        formatter.rs      # Report formatters
        audit.rs          # Audit logging
        logger.rs         # Tracing setup
        search/           # 6 search providers + aggregator
        scraper/          # Web scraper + 8 platform scrapers
      examples/           # Runnable examples (search, scrape, LLM, pipeline)
    hero_researcher/      # CLI binary
    hero_researcher_server/ # Web server + SQLite

Examples

13 runnable examples in crates/hero_researcher_lib/examples/. Run via cargo:

cargo run -p hero_researcher_lib --example search
cargo run -p hero_researcher_lib --example scrape
cargo run -p hero_researcher_lib --example formatter

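Requires LLM API key. Since the herolib_ai v0.6.0 migration, these 10 examples still import AiClient and are gated behind the legacy_ai_examples feature (off by default); they do not build until ported to the new Arc<Provider> API.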

cargo run -p hero_researcher_lib --example ai_client
cargo run -p hero_researcher_lib --example llm_call
cargo run -p hero_researcher_lib --example search_and_summarize
cargo run -p hero_researcher_lib --example research_quick
cargo run -p hero_researcher_lib --example researcher_stream
cargo run -p hero_researcher_lib --example online_search
cargo run -p hero_researcher_lib --example custom_models
cargo run -p hero_researcher_lib --example beast_mode
cargo run -p hero_researcher_lib --example audit_log
cargo run -p hero_researcher_lib --example seed_urls

Development

cargo check --workspace    # Type check
cargo test --workspace     # Run tests
cargo clippy --workspace   # Clippy lints
cargo fmt --all            # Format code
cargo build --workspace --release  # Build release

API Reference

AiClient (pre-v0.6.0 herolib_ai API; v0.6.0 replaces AiClient with Arc<Provider>)

use herolib_ai::{AiClient, Model, PromptBuilder, Provider};

let client = AiClient::from_env();

// Using a built-in model
let response = PromptBuilder::new(&client)
    .model(Model::Llama3_1_8B)
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;
println!("{}", response.content().unwrap_or("(empty)"));

// Using any model by raw ID (e.g. any OpenRouter model)
let response = PromptBuilder::new(&client)
    .raw_model(Provider::OpenRouter, "anthropic/claude-sonnet-4.5")
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;

Researcher

use hero_researcher_lib::config::Config;
use hero_researcher_lib::researcher::Researcher;
use hero_researcher_lib::scraper::Scraper;
use hero_researcher_lib::search::factory::build_search_provider;
use hero_researcher_lib::types::*;
use herolib_ai::AiClient;

let config = Config::from_env();
let ai_client = AiClient::from_env();

let researcher = Researcher::new(ai_client, &config)
    .with_tier(ResearchTier::Quick)
    .with_concurrency(2)
    .with_search_provider(build_search_provider(&config))
    .with_scraper(Scraper::new())
    .with_on_result(Box::new(|result| {
        println!("{}: {}", result.model, result.success);
    }));

let person = Person {
    name: "John Doe".into(),
    location: Some("New York".into()),
    aliases: vec!["J. Doe".into()],
    context: None,
};

let results = researcher.research(&person)?;

format_report

use hero_researcher_lib::formatter::format_report;
use hero_researcher_lib::types::*;

let report = format_report("John Doe", &results, OutputFormat::Markdown, &metadata);
println!("{}", report);

Disclaimer

This tool uses AI models to search for publicly available information.

  • Results may be inaccurate, outdated, or hallucinated
  • This is NOT an official background check
  • Always verify information from official sources
  • Respect privacy and applicable laws

License

Apache-2.0