Hero Researcher

AI-powered research assistant that queries multiple LLMs for comprehensive background research on people. Built in Rust on the herolib_ai client for multi-provider support.

Features

  • Multi-model queries - Query multiple AI models simultaneously for comprehensive results
  • Grounded research - Web search + scraping + LLM analysis with evidence-based reporting
  • 6 search providers - Brave, DuckDuckGo, SearXNG, Exa, Serper, SerpAPI
  • 8 platform scrapers - GitHub, LinkedIn, Twitter, Reddit, StackOverflow, Crunchbase, Medium, Facebook
  • 3 research tiers - Quick (4 models), Standard (8 models), Deep (18 models x 4 strategies)
  • Online Mode - Real-time web search for up-to-date information
  • Multiple output formats - Markdown, JSON, HTML reports
  • Web UI - Browser-based interface with job persistence
  • Audit logging - Full operation tracking in JSONL format

Installation

# Build
make build

# Install to ~/.cargo/bin
make install

Quick Start

# Set your API key (at minimum, one provider)
export OPENROUTER_API_KEY=your_key_here

# Basic search
hero_researcher "John Doe" --location "San Francisco"

# Deep investigation (18 models x 4 strategies)
hero_researcher "John Doe" --beast

# Save to file
hero_researcher "John Doe" --output report.md

# With aliases and context
hero_researcher "John Doe" --aliases "J. Doe, Johnny" --context "software engineer at Acme Corp"

CLI Options

Option         Description
NAME           Name of the person to research (required)
--location     Known location to narrow search
--aliases      Comma-separated list of known aliases
--context      Additional identifying context
--models       Comma-separated list of custom model IDs
--tier         Research tier: quick, standard, deep (default: standard)
--beast        Alias for --tier deep
--format       Output format: markdown, json, html (default: markdown)
--concurrency  Parallel tasks 1-20 (default: 4)
--online       Enable real-time web search (:online suffix)
--verbose      Show detailed progress
--output       Save report to file
--urls         Seed URLs to scrape first (comma-separated)
--audit        Write audit log to JSONL file
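The flags compose into a single invocation. The sketch below is illustrative: the invocation itself is commented out (it needs the built binary and an API key), and the audit field names in the sample are assumptions for demonstration, not the tool's actual audit schema.

```shell
# A deep-tier run saving an HTML report plus an audit trail
# (commented out: requires the hero_researcher binary and a provider key):
# hero_researcher "John Doe" --tier deep --format html \
#     --concurrency 8 --output report.html --audit audit.jsonl

# The --audit output is JSONL: one JSON object per line, so ordinary
# line-oriented tools apply. The field names below are illustrative only.
printf '%s\n' '{"event":"start"}' '{"event":"done"}' > audit.jsonl
wc -l < audit.jsonl   # one line per audit event
```

Because JSONL is line-oriented, `tail -f audit.jsonl` can follow a running job's events as they are appended.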

Web Server

# Start the web server (default port 3000)
hero_researcher_server

# Custom port
PORT=8080 hero_researcher_server

API Endpoints:

  • GET / - Web UI
  • GET /api/tiers - Available research tiers
  • GET /api/jobs - List recent jobs
  • GET /api/jobs/:id - Get job details
  • POST /api/research - Start a research job
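A research job is started by POSTing JSON to /api/research. The request fields below merely mirror the CLI flags and are an assumption, not a documented schema; inspect the Web UI's requests for the real shape.

```shell
# Build a request payload for POST /api/research.
# NOTE: these field names mirror the CLI flags and are an assumption,
# not a documented schema.
cat > payload.json <<'EOF'
{"name": "John Doe", "location": "San Francisco", "tier": "quick"}
EOF

# With the server running (default port 3000), the job could be started with:
# curl -s -X POST http://localhost:3000/api/research \
#      -H 'Content-Type: application/json' -d @payload.json
grep -c '"name"' payload.json   # sanity-check the payload
```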

Research Tiers

Tier      Models  Strategies                         Search Queries  Scrape Pages  Synthesis
Quick     4       basic                              ~2              5             No
Standard  8       basic                              ~8              15            Yes
Deep      18      basic, deep, social, professional  ~20             30            Yes

Environment Variables

Variable            Description                              Required
OPENROUTER_API_KEY  OpenRouter API key                       Yes (or other provider)
GROQ_API_KEY        Groq API key                             Optional
SAMBANOVA_API_KEY   SambaNova API key                        Optional
DEEPINFRA_API_KEY   DeepInfra API key                        Optional
BRAVE_API_KEYS      Brave Search keys (comma-separated)      Optional
SEARXNG_URL         SearXNG instance URL                     Optional
EXA_API_KEYS        Exa Search keys                          Optional
SERPER_API_KEYS     Serper keys                              Optional
SERPAPI_API_KEYS    SerpAPI keys                             Optional
MODELS              Override default model list              Optional
DB_PATH             SQLite database path (default: hero.db)  Optional
PORT                Web server port (default: 3000)          Optional
LOG_LEVEL           Log level (default: info)                Optional
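A minimal setup needs only one provider key; everything else falls back to the defaults in the table above. A sketch (the key value is obviously a placeholder):

```shell
# Minimal configuration: one LLM provider key plus optional overrides.
export OPENROUTER_API_KEY=your_key_here   # placeholder: use a real key
export PORT=8080                          # default: 3000
export LOG_LEVEL=debug                    # default: info
export DB_PATH=./hero.db                  # default: hero.db

[ -n "$OPENROUTER_API_KEY" ] && echo "provider key set"   # prints "provider key set"
```

These can also live in a .env file copied from .env.example.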

Project Structure

hero_researcher/
  Cargo.toml              # Workspace root
  Makefile                # Build targets
  buildenv.sh             # Project identity
  .env.example            # Environment variable template
  crates/
    hero_researcher_lib/  # Core library
      src/
        lib.rs            # Re-exports
        types.rs          # Person, ResearchResult, etc.
        error.rs          # Error types
        config.rs         # Configuration
        researcher.rs     # Main research pipeline
        prompts.rs        # LLM prompt templates
        formatter.rs      # Report formatters
        audit.rs          # Audit logging
        logger.rs         # Tracing setup
        search/           # 6 search providers + aggregator
        scraper/          # Web scraper + 8 platform scrapers
      examples/           # Runnable examples (search, scrape, LLM, pipeline)
    hero_researcher/      # CLI binary
    hero_researcher_server/ # Web server + SQLite

Examples

13 runnable examples in crates/hero_researcher_lib/examples/. Run via make or cargo run -p hero_researcher_lib --example <name>.

No API key needed

make example-search      # DuckDuckGo web search
make example-scrape      # Scrape a web page and extract text
make example-formatter   # Format mock results to markdown/json/html

Requires LLM API key

make example-ai-client      # Direct AI client chat
make example-llm            # LLM call with token usage metadata
make example-summarize      # Search + scrape + LLM summary
make example-research       # Quick-tier research pipeline
make example-stream         # Research with streaming result callback
make example-online         # Online mode (grounded search + scrape + LLM)
make example-custom-models  # Research with custom model list
make example-beast          # Beast mode (deep tier, 18 models x 4 strategies)
make example-audit          # Research with audit log output
make example-seed-urls      # Research with pre-seeded URLs
make examples  # Run all 13 examples sequentially

Save all output to a directory

make examples-save                    # saves to results/
make examples-save OUT=my_output      # custom output directory
make examples-save ARGS="Elon Musk"   # custom person for research examples

Each example writes to its own .txt file (e.g. results/search.txt, results/beast_mode.txt).

Pass custom input

make example-scrape ARGS="https://example.com"
make example-summarize ARGS="climate change solutions"
make example-research ARGS="Linus Torvalds"
make example-ai-client ARGS="google/gemini-2.5-flash"
make example-formatter ARGS="json"

API Reference

AiClient

use herolib_ai::{AiClient, Model, PromptBuilder, Provider};

let client = AiClient::from_env();

// Using a built-in model
let response = PromptBuilder::new(&client)
    .model(Model::Llama3_1_8B)
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;
println!("{}", response.content().unwrap_or("(empty)"));

// Using any model by raw ID (e.g. any OpenRouter model)
let response = PromptBuilder::new(&client)
    .raw_model(Provider::OpenRouter, "anthropic/claude-sonnet-4.5")
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;

Researcher

use hero_researcher_lib::config::Config;
use hero_researcher_lib::researcher::Researcher;
use hero_researcher_lib::scraper::Scraper;
use hero_researcher_lib::search::factory::build_search_provider;
use hero_researcher_lib::types::*;
use herolib_ai::AiClient;

let config = Config::from_env();
let ai_client = AiClient::from_env();

let researcher = Researcher::new(ai_client, &config)
    .with_tier(ResearchTier::Quick)
    .with_concurrency(2)
    .with_search_provider(build_search_provider(&config))
    .with_scraper(Scraper::new())
    .with_on_result(Box::new(|result| {
        println!("{}: {}", result.model, result.success);
    }));

let person = Person {
    name: "John Doe".into(),
    location: Some("New York".into()),
    aliases: vec!["J. Doe".into()],
    context: None,
};

let results = researcher.research(&person)?;

format_report

use hero_researcher_lib::formatter::format_report;
use hero_researcher_lib::types::*;

// `results` and `metadata` come from a completed research run
// (see the Researcher example above).
let report = format_report("John Doe", &results, OutputFormat::Markdown, &metadata);
println!("{}", report);

Development

make check    # Type check
make test     # Run tests
make lint     # Clippy lints
make fmt      # Format code
make build    # Build release

Disclaimer

This tool uses AI models to search for publicly available information.

  • Results may be inaccurate, outdated, or hallucinated
  • This is NOT an official background check
  • Always verify information from official sources
  • Respect privacy and applicable laws

License

Apache-2.0