D-10 closure on hero_researcher, paired with the first AI-rename ripple
absorb on a non-dead-code consumer (s117 hero_webbuilder was the first
ripple, but on a JobManager with 0 callers). Reference playbook:
memory/investigation_herolib_ai_migration.md.
Wholesale shape (s109/s111/s115 template, 0 service.toml on disk pre-sweep):
* Wrote 2 service.toml (crates/hero_researcher{,_server}/service.toml) both
shipping [[env]] PATH_ROOT default="~/hero" per s107 lesson #17. Canonical
2-binary inventory: cli + server (single-binary daemon binding two sockets
rpc.sock + admin.sock — no separate _admin crate).
* Wired service_base!() + validate_service_toml + handle_info_flag triad on
both main.rs files; server also calls print_startup_banner +
prepare_sockets. Deleted ~250 LOC of hand-rolled print_startup_info /
print_info_json / print_help blocks.
* Applied lesson #19 (s109+s110+s111+s112+s115): CLI build_service_definition
threads PATH_ROOT/PATH_VAR/PATH_BUILD/PATH_CODE/HERO_SOCKET_DIR via
forward_env_if_set into the spawned hero_proc_sdk ActionSpec. Replaced
hand-rolled HERO_SOCKET_DIR/HOME fallback with herolib_core::base::
resolve_socket_path() in both CLI and server.
* Paired web.sock → admin.sock rename (s112 hero_wallet + s115 hero_matrixchat
precedent) in the same squash: 9 sites across CLI (3) + server (6) flipped.
Server still binds two sockets in one process; the rename is socket-name-
only, no crate split. Zero external consumers of the old web.sock name.
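The env-forwarding step in the lesson #19 bullet can be pictured with a small sketch. It is not the real herolib_core helper: the function below only borrows the name forward_env_if_set, its signature is an assumption, and it writes into a plain map rather than a hero_proc_sdk ActionSpec. The variable list is the one given above.

```rust
use std::collections::BTreeMap;

/// Variables the commit message says are threaded to the spawned service.
const FORWARDED: [&str; 5] = [
    "PATH_ROOT",
    "PATH_VAR",
    "PATH_BUILD",
    "PATH_CODE",
    "HERO_SOCKET_DIR",
];

/// Hypothetical stand-in for `forward_env_if_set` (signature assumed):
/// copy each named variable into `target` only when `lookup` reports it set.
fn forward_env_if_set<F>(names: &[&str], lookup: F, target: &mut BTreeMap<String, String>)
where
    F: Fn(&str) -> Option<String>,
{
    for &name in names {
        if let Some(value) = lookup(name) {
            target.insert(name.to_string(), value);
        }
    }
}

fn main() {
    // Read from the real process environment and show what would be forwarded.
    let mut child_env = BTreeMap::new();
    forward_env_if_set(&FORWARDED, |name| std::env::var(name).ok(), &mut child_env);
    for (key, value) in &child_env {
        println!("{key}={value}");
    }
}
```

Passing the lookup as a closure keeps the sketch testable without mutating the process environment; unset variables are simply skipped, so the child falls back to its own defaults such as the service.toml PATH_ROOT default.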
herolib_ai v0.6.0 migration (Lesson #21 4th shape — full API rewrite, per
playbook landed s117):
* Cargo.toml workspace deps: herolib_ai 0.5.0→0.6.0 + explicit
features=["hero-proc"] (default-features=false would disable the
HeroProcResolver); herolib_core 0.5.0→0.6.0; hero_proc_sdk 0.5.0→0.6.0.
Cascade absorbed (cargo update).
* hero_researcher_lib: migrated 3 files (error.rs, job.rs, researcher.rs)
from old broker-routed AiClient to new direct-HTTPS Arc<Provider>:
- AiError → CompletionError
- AiClient field → Arc<Provider> field
- AiClient::default_socket().await → Arc::new(builtin::openrouter_provider())
- ChatRequest/ChatCompletionResponse/Message types swapped to new
completions::request/response/Message
- chat_with(req).await → completions().model(...).message(...).send()
- response.content().unwrap_or("") → response.text (5 sites)
- response.usage.prompt_tokens/completion_tokens → input_tokens/output_tokens
with Option<u64> unwrap (TokenUsage field rename + signedness change)
- Async tokio::runtime::Handle::try_current() block_on dance removed
entirely (new API is sync)
- ResearchJob.ai_client(client) builder renamed to .provider(provider)
(only doc-string consumer in workspace; no caller breakage)
* 10 examples that still import AiClient gated behind feature
legacy_ai_examples (never enabled by default). Per-example [[example]]
blocks with required-features="legacy_ai_examples" in
hero_researcher_lib/Cargo.toml. Cleanest path while keeping
cargo test --workspace green; per-example port is a future cleanup.
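The token-usage bullet above (prompt_tokens/completion_tokens becoming input_tokens/output_tokens with an Option<u64> unwrap) might look roughly like the following. The TokenUsage struct here is a local, hypothetical mirror built only from the field names in the commit message; the real herolib_ai type may differ in shape, and defaulting a missing count to 0 is an assumption.

```rust
/// Hypothetical mirror of the new usage type (field names from the commit
/// message; everything else is assumed for illustration).
#[derive(Debug, Default, Clone, Copy)]
struct TokenUsage {
    input_tokens: Option<u64>,
    output_tokens: Option<u64>,
}

/// Sum both counters, treating a missing value as zero.
fn total_tokens(usage: &TokenUsage) -> u64 {
    usage
        .input_tokens
        .unwrap_or(0)
        .saturating_add(usage.output_tokens.unwrap_or(0))
}

fn main() {
    let usage = TokenUsage {
        input_tokens: Some(120),
        output_tokens: None, // provider did not report output tokens
    };
    println!("total tokens: {}", total_tokens(&usage));
}
```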
Dep audit:
* hero_researcher_lib: -anyhow -uuid -tokio (all zero-match after migration —
old chat_sync used tokio::runtime, AiClient implied uuid + anyhow).
* hero_researcher_server: -tracing-subscriber (lib's init_logging owns the
subscriber wiring; server doesn't import directly). +herolib_core (for
service_base!() + resolve_socket_path).
* hero_researcher: +herolib_core (for service_base!() + forward_env_if_set
+ resolve_socket_path).
D-10 5/5 met:
1. service.toml present ✓ (2 files, both with
[[env]] PATH_ROOT)
2. service_base!() + canonical helpers ✓ (3 main.rs)
3. lab infocheck clean ✓ 2 crates clean, 0 findings
4. lab service smoke 6/6 ✓ rpc.sock 4/4 (health,
openrpc.json, well-known,
system.ping) + admin.sock
2/2 (health, well-known)
5. cargo test --workspace --release ✓ 90/0/0 across 5 test
binaries + 2 doctests
PATH_ROOT verified in /proc/$pid/environ for the lab-spawned daemon (server
binds rpc.sock + admin.sock from one process).
Latent (out of D-10 scope, filed for follow-up):
* 10 examples broken behind legacy_ai_examples gate; per-example port to
new Arc<Provider> API needed (lift gate, fix imports, re-enable).
* 1 integration test (tests/integration.rs) — left untouched, may need
similar gating; check on next hero_researcher session.
* hero_researcher_server stop returns "stop timeout (30s)" but daemon DOES
exit cleanly + sockets ARE removed (same OServer pattern as s110-s115).
* `has_ai()` placeholder pattern from playbook deferred — full hero_proc
secret lookup at construction time is more honest than `true`.
* HeroContext fields context_id/claims/forwarded_prefix unused
(pre-existing dead-code warning, not D-10 scope).
Tracker: hero_proc#102#33220 PATCHed; #105 hero_researcher ticked.
Reference playbook: memory/investigation_herolib_ai_migration.md (s117).
Hero Researcher
AI-powered research assistant using multiple LLM models for comprehensive background research on people. Built in Rust with the herolib_ai client for multi-provider support.
Features
- Multi-model queries - Query multiple AI models simultaneously for comprehensive results
- Grounded research - Web search + scraping + LLM analysis with evidence-based reporting
- 6 search providers - Brave, DuckDuckGo, SearXNG, Exa, Serper, SerpAPI
- 8 platform scrapers - GitHub, LinkedIn, Twitter, Reddit, StackOverflow, Crunchbase, Medium, Facebook
- 3 research tiers - Quick (4 models), Standard (8 models), Deep (18 models x 4 strategies)
- Online Mode - Real-time web search for up-to-date information
- Multiple output formats - Markdown, JSON, HTML reports
- Web UI - Browser-based interface with job persistence
- Audit logging - Full operation tracking in JSONL format
Installation
Use the Nushell service script to build and install:
service researcher start --update --reset
Quick Start
# Set your API key (at minimum, one provider)
export OPENROUTER_API_KEY=your_key_here
# Basic search
hero_researcher "John Doe" --location "San Francisco"
# Deep investigation (18 models x 4 strategies)
hero_researcher "John Doe" --beast
# Save to file
hero_researcher "John Doe" --output report.md
# With aliases and context
hero_researcher "John Doe" --aliases "J. Doe, Johnny" --context "software engineer at Acme Corp"
CLI Options
| Option | Description |
|---|---|
| `NAME` | Name of the person to research (required) |
| `--location` | Known location to narrow search |
| `--aliases` | Comma-separated list of known aliases |
| `--context` | Additional identifying context |
| `--models` | Comma-separated list of custom model IDs |
| `--tier` | Research tier: quick, standard, deep (default: standard) |
| `--beast` | Alias for `--tier deep` |
| `--format` | Output format: markdown, json, html (default: markdown) |
| `--concurrency` | Parallel tasks 1-20 (default: 4) |
| `--online` | Enable real-time web search (`:online` suffix) |
| `--verbose` | Show detailed progress |
| `--output` | Save report to file |
| `--urls` | Seed URLs to scrape first (comma-separated) |
| `--audit` | Write audit log to JSONL file |
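Two of the options above take structured values: the comma-separated lists (--aliases, --models, --urls) and the :online suffix added by --online. The sketch below shows one plausible way to handle both. The helper names split_list and with_online are hypothetical and are not taken from the hero_researcher source; trimming whitespace and dropping empty items are assumptions.

```rust
/// Split a comma-separated flag value into trimmed, non-empty items.
fn split_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(str::to_string)
        .collect()
}

/// Append the `:online` suffix when online mode is requested,
/// leaving IDs that already carry it unchanged.
fn with_online(model_id: &str, online: bool) -> String {
    if online && !model_id.ends_with(":online") {
        format!("{model_id}:online")
    } else {
        model_id.to_string()
    }
}

fn main() {
    let aliases = split_list("J. Doe, Johnny");
    println!("{aliases:?}");
    println!("{}", with_online("anthropic/claude-sonnet-4.5", true));
}
```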
Hero OS Service (Nushell)
All lifecycle management is done via the Nushell service script:
# Install, register, and start via hero_proc
service researcher start --update --reset
# Stop gracefully
service researcher stop
# Check status
service researcher status
Sockets (under $HERO_SOCKET_DIR/hero_researcher/):
- rpc.sock - JSON-RPC 2.0 API
- admin.sock - Browser admin dashboard (formerly web.sock)
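As a rough illustration of the layout above, a socket path is the socket directory joined with the service name and the socket file name. This is only a sketch: the socket_path helper is hypothetical, and the real resolution (including what happens when HERO_SOCKET_DIR is unset) lives in herolib_core's resolve_socket_path(), whose behavior is not reproduced here.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: <socket_dir>/<service>/<socket>.
fn socket_path(socket_dir: &Path, service: &str, socket: &str) -> PathBuf {
    socket_dir.join(service).join(socket)
}

fn main() {
    // Only handles the case where HERO_SOCKET_DIR is set; no fallback is guessed.
    match std::env::var_os("HERO_SOCKET_DIR") {
        Some(dir) => {
            let path = socket_path(Path::new(&dir), "hero_researcher", "rpc.sock");
            println!("{}", path.display());
        }
        None => println!("HERO_SOCKET_DIR is not set"),
    }
}
```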
Web Server
# Start the server directly (binds rpc.sock and admin.sock)
hero_researcher_server
API Endpoints:
- GET / - Web UI
- GET /api/tiers - Available research tiers
- GET /api/jobs - List recent jobs
- GET /api/jobs/:id - Get job details
- POST /api/research - Start a research job
Research Tiers
| Tier | Models | Strategies | Search Queries | Scrape Pages | Synthesis |
|---|---|---|---|---|---|
| Quick | 4 | basic | ~2 | 5 | No |
| Standard | 8 | basic | ~8 | 15 | Yes |
| Deep | 18 | basic, deep, social, professional | ~20 | 30 | Yes |
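The tier table can be read as a lookup from tier to a fixed plan. The sketch below encodes the numbers from the table and the --beast alias from the CLI options. The Tier and TierPlan types are local illustrations, not the crate's ResearchTier definition, and counting model calls as models times strategies is an inference from the "18 models x 4 strategies" wording.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Quick,
    Standard,
    Deep,
}

/// Per-tier limits, copied from the table (search queries are approximate).
#[derive(Debug)]
struct TierPlan {
    models: usize,
    strategies: &'static [&'static str],
    search_queries: usize,
    scrape_pages: usize,
    synthesis: bool,
}

fn plan(tier: Tier) -> TierPlan {
    match tier {
        Tier::Quick => TierPlan { models: 4, strategies: &["basic"], search_queries: 2, scrape_pages: 5, synthesis: false },
        Tier::Standard => TierPlan { models: 8, strategies: &["basic"], search_queries: 8, scrape_pages: 15, synthesis: true },
        Tier::Deep => TierPlan {
            models: 18,
            strategies: &["basic", "deep", "social", "professional"],
            search_queries: 20,
            scrape_pages: 30,
            synthesis: true,
        },
    }
}

/// Resolve the tier from `--tier` and `--beast`; `--beast` wins as the deep alias.
fn select_tier(tier_flag: &str, beast: bool) -> Option<Tier> {
    if beast {
        return Some(Tier::Deep);
    }
    match tier_flag {
        "quick" => Some(Tier::Quick),
        "standard" => Some(Tier::Standard),
        "deep" => Some(Tier::Deep),
        _ => None,
    }
}

/// Assumed upper bound on LLM calls: every model runs every strategy.
fn model_calls(p: &TierPlan) -> usize {
    p.models * p.strategies.len()
}

fn main() {
    let tier = select_tier("standard", true).expect("known tier");
    let p = plan(tier);
    println!(
        "{tier:?}: {} models x {:?} = {} calls, ~{} queries, {} pages, synthesis={}",
        p.models, p.strategies, model_calls(&p), p.search_queries, p.scrape_pages, p.synthesis
    );
}
```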
Environment Variables
| Variable | Description | Required |
|---|---|---|
| `OPENROUTER_API_KEY` | OpenRouter API key | Yes (or other provider) |
| `GROQ_API_KEY` | Groq API key | Optional |
| `SAMBANOVA_API_KEY` | SambaNova API key | Optional |
| `DEEPINFRA_API_KEY` | DeepInfra API key | Optional |
| `BRAVE_API_KEYS` | Brave Search keys (comma-separated) | Optional |
| `SEARXNG_URL` | SearXNG instance URL | Optional |
| `EXA_API_KEYS` | Exa Search keys | Optional |
| `SERPER_API_KEYS` | Serper keys | Optional |
| `SERPAPI_API_KEYS` | SerpAPI keys | Optional |
| `MODELS` | Override default model list | Optional |
| `DB_PATH` | SQLite database path (default: hero.db) | Optional |
| `LOG_LEVEL` | Log level (default: info) | Optional |
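To make the defaults in the table concrete, here is a minimal sketch of reading three of these variables with their documented fallbacks. It is not the crate's Config::from_env(); the EnvSettings struct and load_settings function are hypothetical, and trimming whitespace around comma-separated keys is an assumption.

```rust
/// Hypothetical subset of the configuration, for illustration only.
#[derive(Debug, PartialEq)]
struct EnvSettings {
    brave_keys: Vec<String>,
    db_path: String,
    log_level: String,
}

/// Build settings from any variable lookup, applying the defaults from the table.
fn load_settings<F>(lookup: F) -> EnvSettings
where
    F: Fn(&str) -> Option<String>,
{
    // BRAVE_API_KEYS is comma-separated; an unset variable yields no keys.
    let brave_keys: Vec<String> = lookup("BRAVE_API_KEYS")
        .map(|raw| {
            raw.split(',')
                .map(str::trim)
                .filter(|key| !key.is_empty())
                .map(str::to_string)
                .collect()
        })
        .unwrap_or_default();

    EnvSettings {
        brave_keys,
        db_path: lookup("DB_PATH").unwrap_or_else(|| "hero.db".to_string()),
        log_level: lookup("LOG_LEVEL").unwrap_or_else(|| "info".to_string()),
    }
}

fn main() {
    // Read from the real process environment.
    let settings = load_settings(|name| std::env::var(name).ok());
    println!("{settings:?}");
}
```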
Project Structure
hero_researcher/
Cargo.toml # Workspace root
crates/
hero_researcher_lib/ # Core library
src/
lib.rs # Re-exports
types.rs # Person, ResearchResult, etc.
error.rs # Error types
config.rs # Configuration
researcher.rs # Main research pipeline
prompts.rs # LLM prompt templates
formatter.rs # Report formatters
audit.rs # Audit logging
logger.rs # Tracing setup
search/ # 6 search providers + aggregator
scraper/ # Web scraper + 8 platform scrapers
examples/ # Runnable examples (search, scrape, LLM, pipeline)
hero_researcher/ # CLI binary
hero_researcher_server/ # Web server + SQLite
Examples
13 runnable examples in crates/hero_researcher_lib/examples/. Run via cargo:
cargo run -p hero_researcher_lib --example search
cargo run -p hero_researcher_lib --example scrape
cargo run -p hero_researcher_lib --example formatter
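Requires LLM API key. According to the latest commit, the 10 examples that still import AiClient (apparently the ten listed here) are gated behind the legacy_ai_examples feature and are broken until they are ported to the new provider API, so these commands also need --features legacy_ai_examples.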
cargo run -p hero_researcher_lib --example ai_client
cargo run -p hero_researcher_lib --example llm_call
cargo run -p hero_researcher_lib --example search_and_summarize
cargo run -p hero_researcher_lib --example research_quick
cargo run -p hero_researcher_lib --example researcher_stream
cargo run -p hero_researcher_lib --example online_search
cargo run -p hero_researcher_lib --example custom_models
cargo run -p hero_researcher_lib --example beast_mode
cargo run -p hero_researcher_lib --example audit_log
cargo run -p hero_researcher_lib --example seed_urls
Development
cargo check --workspace # Type check
cargo test --workspace # Run tests
cargo clippy --workspace # Clippy lints
cargo fmt --all # Format code
cargo build --workspace --release # Build release
API Reference
AiClient
use herolib_ai::{AiClient, Model, PromptBuilder, Provider};
let client = AiClient::from_env();
// Using a built-in model
let response = PromptBuilder::new(&client)
.model(Model::Llama3_1_8B)
.system("You are a researcher.")
.user("Search for...")
.execute()?;
println!("{}", response.content().unwrap_or("(empty)"));
// Using any model by raw ID (e.g. any OpenRouter model)
let response = PromptBuilder::new(&client)
.raw_model(Provider::OpenRouter, "anthropic/claude-sonnet-4.5")
.system("You are a researcher.")
.user("Search for...")
.execute()?;
Researcher
use hero_researcher_lib::config::Config;
use hero_researcher_lib::researcher::Researcher;
use hero_researcher_lib::scraper::Scraper;
use hero_researcher_lib::search::factory::build_search_provider;
use hero_researcher_lib::types::*;
use herolib_ai::AiClient;
let config = Config::from_env();
let ai_client = AiClient::from_env();
let researcher = Researcher::new(ai_client, &config)
.with_tier(ResearchTier::Quick)
.with_concurrency(2)
.with_search_provider(build_search_provider(&config))
.with_scraper(Scraper::new())
.with_on_result(Box::new(|result| {
println!("{}: {}", result.model, result.success);
}));
let person = Person {
name: "John Doe".into(),
location: Some("New York".into()),
aliases: vec!["J. Doe".into()],
context: None,
};
let results = researcher.research(&person)?;
format_report
use hero_researcher_lib::formatter::format_report;
use hero_researcher_lib::types::*;
let report = format_report("John Doe", &results, OutputFormat::Markdown, &metadata);
println!("{}", report);
Disclaimer
This tool uses AI models to search for publicly available information.
- Results may be inaccurate, outdated, or hallucinated
- This is NOT an official background check
- Always verify information from official sources
- Respect privacy and applicable laws
License
Apache-2.0