AI-powered research assistant using multiple LLMs, web search, and scraping.

Hero Researcher

AI-powered research assistant that queries multiple LLMs to run comprehensive background research on people. Built in Rust on the herolib_ai client for multi-provider support.

Features

Multi-model queries - Query multiple AI models simultaneously for comprehensive results
Grounded research - Web search + scraping + LLM analysis with evidence-based reporting
6 search providers - Brave, DuckDuckGo, SearXNG, Exa, Serper, SerpAPI
8 platform scrapers - GitHub, LinkedIn, Twitter, Reddit, StackOverflow, Crunchbase, Medium, Facebook
3 research tiers - Quick (4 models), Standard (8 models), Deep (18 models x 4 strategies)
Online mode - Real-time web search for up-to-date information
Multiple output formats - Markdown, JSON, HTML reports
Web UI - Browser-based interface with job persistence
Audit logging - Full operation tracking in JSONL format
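
To illustrate the multi-model feature, here is a minimal sketch of fanning one query out to several models with a bounded number of calls in flight. It is not the library's implementation: `query_models` and the `query` callback are hypothetical names, and the sketch uses only standard-library threads. The 1-20 clamp mirrors the documented `--concurrency` range.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Mutex;
use std::thread;

/// Runs `query` once per model with at most `concurrency` calls in flight.
/// Returns (model, answer) pairs in the same order as `models`.
/// Hypothetical helper for illustration; not part of hero_researcher_lib.
fn query_models<F>(models: &[&str], concurrency: usize, query: F) -> Vec<(String, String)>
where
    F: Fn(&str) -> String + Sync,
{
    // Shared cursor: each worker claims the next unprocessed model index.
    let next = AtomicUsize::new(0);
    let results: Mutex<Vec<Option<String>>> = Mutex::new(vec![None; models.len()]);
    // Clamp to the documented 1-20 range and never spawn more workers than models.
    let workers = concurrency.clamp(1, 20).min(models.len().max(1));

    thread::scope(|scope| {
        for _ in 0..workers {
            scope.spawn(|| loop {
                let i = next.fetch_add(1, Ordering::Relaxed);
                if i >= models.len() {
                    break;
                }
                let answer = query(models[i]);
                results.lock().unwrap()[i] = Some(answer);
            });
        }
    });

    results
        .into_inner()
        .unwrap()
        .into_iter()
        .zip(models)
        .map(|(answer, model)| (model.to_string(), answer.unwrap_or_default()))
        .collect()
}
```

A call such as `query_models(&["model-a", "model-b"], 2, |m| format!("answer from {m}"))` returns one pair per model. A real `query` callback would wrap an LLM request instead of formatting a string.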

Installation

Use the Nushell service script to build and install:

service researcher start --update --reset

Quick Start

# Set an API key for at least one provider
export OPENROUTER_API_KEY=your_key_here

# Basic search
hero_researcher "John Doe" --location "San Francisco"

# Deep investigation (18 models x 4 strategies)
hero_researcher "John Doe" --beast

# Save to file
hero_researcher "John Doe" --output report.md

# With aliases and context
hero_researcher "John Doe" --aliases "J. Doe, Johnny" --context "software engineer at Acme Corp"

CLI Options

Option	Description
`NAME`	Name of the person to research (required)
`--location`	Known location to narrow search
`--aliases`	Comma-separated list of known aliases
`--context`	Additional identifying context
`--models`	Comma-separated list of custom model IDs
`--tier`	Research tier: quick, standard, deep (default: standard)
`--beast`	Alias for `--tier deep`
`--format`	Output format: markdown, json, html (default: markdown)
`--concurrency`	Number of parallel tasks, 1-20 (default: 4)
`--online`	Enable real-time web search (`:online` suffix)
`--verbose`	Show detailed progress
`--output`	Save report to file
`--urls`	Seed URLs to scrape first (comma-separated)
`--audit`	Write audit log to JSONL file
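
The option values in the table could be validated along the following lines. This is a sketch under stated assumptions, not the CLI's actual parsing code: the function names are hypothetical, and letting `--beast` take precedence over an explicit `--tier` is an assumption, since the table only says it is an alias for `--tier deep`.

```rust
/// Splits a comma-separated option value such as `--aliases "J. Doe, Johnny"`.
fn split_list(raw: &str) -> Vec<String> {
    raw.split(',')
        .map(str::trim)
        .filter(|item| !item.is_empty())
        .map(String::from)
        .collect()
}

/// Validates `--concurrency` against the documented 1-20 range.
fn parse_concurrency(raw: Option<&str>) -> Result<usize, String> {
    let Some(raw) = raw else {
        return Ok(4); // documented default
    };
    match raw.trim().parse::<usize>() {
        Ok(n) if (1..=20).contains(&n) => Ok(n),
        Ok(n) => Err(format!("concurrency {n} is outside 1-20")),
        Err(e) => Err(format!("invalid concurrency {raw:?}: {e}")),
    }
}

/// Resolves the effective tier name. `--beast` is an alias for `--tier deep`.
/// Assumption: when both are given, `--beast` wins.
fn effective_tier(tier: Option<&str>, beast: bool) -> Result<&'static str, String> {
    if beast {
        return Ok("deep");
    }
    match tier.unwrap_or("standard") {
        "quick" => Ok("quick"),
        "standard" => Ok("standard"),
        "deep" => Ok("deep"),
        other => Err(format!("unknown tier {other:?}")),
    }
}
```

The same `split_list` helper would also cover `--models` and `--urls`, which the table describes as comma-separated.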

Hero OS Service (Nushell)

All lifecycle management is done via the Nushell service script:

# Install, register, and start via hero_proc
service researcher start --update --reset

# Stop gracefully
service researcher stop

# Check status
service researcher status

Sockets (under $PATH_SOCKET/hero_researcher/):

rpc.sock — JSON-RPC 2.0 API
web.sock — Browser admin dashboard
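
As a rough sketch of how a client might talk to rpc.sock, the code below builds the socket path and a JSON-RPC 2.0 request envelope, then sends it over a Unix socket. Several things here are assumptions rather than documented behavior: newline-delimited framing, the helper names, and any method name you pass in (the README does not list the available RPC methods).

```rust
use std::io::{self, BufRead, BufReader, Write};
use std::os::unix::net::UnixStream;
use std::path::{Path, PathBuf};

/// Builds `$PATH_SOCKET/hero_researcher/rpc.sock` from the PATH_SOCKET value.
fn rpc_socket_path(path_socket: &str) -> PathBuf {
    PathBuf::from(path_socket)
        .join("hero_researcher")
        .join("rpc.sock")
}

/// Builds a JSON-RPC 2.0 request. `params_json` must already be valid JSON,
/// and `method` must not contain characters that need JSON escaping.
fn jsonrpc_request(id: u64, method: &str, params_json: &str) -> String {
    format!(r#"{{"jsonrpc":"2.0","id":{id},"method":"{method}","params":{params_json}}}"#)
}

/// Sends one request and reads one reply line.
/// Assumption: the server uses newline-delimited JSON framing.
fn call(socket: &Path, request: &str) -> io::Result<String> {
    let mut stream = UnixStream::connect(socket)?;
    stream.write_all(request.as_bytes())?;
    stream.write_all(b"\n")?;
    let mut reply = String::new();
    BufReader::new(stream).read_line(&mut reply)?;
    Ok(reply)
}
```

In practice the directory would come from `std::env::var("PATH_SOCKET")`, and the method name would be one the server actually exposes.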

Web Server

# Start the server directly (binds rpc.sock and web.sock)
hero_researcher_server

API Endpoints:

GET / - Web UI
GET /api/tiers - Available research tiers
GET /api/jobs - List recent jobs
GET /api/jobs/:id - Get job details
POST /api/research - Start a research job
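
The read-only endpoints above could be exercised with a small client like the sketch below. It assumes that web.sock accepts plain HTTP/1.1 requests, which the README implies but does not state. The helper names are hypothetical, and the job ID is treated as an opaque string because its format is not documented.

```rust
use std::io::{self, Read, Write};
use std::os::unix::net::UnixStream;
use std::path::Path;

/// Builds the path for `GET /api/jobs/:id`.
fn job_path(id: &str) -> String {
    format!("/api/jobs/{id}")
}

/// Builds a minimal HTTP/1.1 GET request. `Connection: close` lets the
/// client read until end of stream instead of parsing Content-Length.
fn http_get(path: &str) -> String {
    format!("GET {path} HTTP/1.1\r\nHost: localhost\r\nConnection: close\r\n\r\n")
}

/// Sends the request over the Unix socket and returns the raw response
/// (status line, headers, and body).
/// Assumption: web.sock serves HTTP/1.1.
fn fetch(socket: &Path, path: &str) -> io::Result<String> {
    let mut stream = UnixStream::connect(socket)?;
    stream.write_all(http_get(path).as_bytes())?;
    let mut response = String::new();
    stream.read_to_string(&mut response)?;
    Ok(response)
}
```

For example, `fetch(socket, "/api/tiers")` would return the raw response listing the research tiers. `POST /api/research` is left out because its request body is not described here.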

Research Tiers

Tier	Models	Strategies	Search Queries	Scrape Pages	Synthesis
Quick	4	basic	~2	5	No
Standard	8	basic	~8	15	Yes
Deep	18	basic, deep, social, professional	~20	30	Yes
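
The tier table can also be read as data. The sketch below encodes each row and derives the number of model runs, taking "18 models x 4 strategies" to mean every model runs every strategy once. `Tier` and `TierSpec` are illustrative names that mirror the table; they are not the library's `ResearchTier` type.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Tier {
    Quick,
    Standard,
    Deep,
}

/// One row of the tier table. Illustrative only.
struct TierSpec {
    models: usize,
    strategies: &'static [&'static str],
    search_queries: usize, // approximate, per the "~" in the table
    scrape_pages: usize,
    synthesis: bool,
}

impl Tier {
    fn spec(self) -> TierSpec {
        match self {
            Tier::Quick => TierSpec {
                models: 4,
                strategies: &["basic"],
                search_queries: 2,
                scrape_pages: 5,
                synthesis: false,
            },
            Tier::Standard => TierSpec {
                models: 8,
                strategies: &["basic"],
                search_queries: 8,
                scrape_pages: 15,
                synthesis: true,
            },
            Tier::Deep => TierSpec {
                models: 18,
                strategies: &["basic", "deep", "social", "professional"],
                search_queries: 20,
                scrape_pages: 30,
                synthesis: true,
            },
        }
    }
}

impl TierSpec {
    /// Model runs if every model is paired with every strategy once.
    fn model_runs(&self) -> usize {
        self.models * self.strategies.len()
    }
}
```

Under that reading, the Deep tier performs 18 × 4 = 72 model runs, against 4 for Quick and 8 for Standard.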

Environment Variables

Variable	Description	Required
`OPENROUTER_API_KEY`	OpenRouter API key	Yes, unless another LLM provider key is set
`GROQ_API_KEY`	Groq API key	Optional
`SAMBANOVA_API_KEY`	SambaNova API key	Optional
`DEEPINFRA_API_KEY`	DeepInfra API key	Optional
`BRAVE_API_KEYS`	Brave Search keys (comma-separated)	Optional
`SEARXNG_URL`	SearXNG instance URL	Optional
`EXA_API_KEYS`	Exa Search keys	Optional
`SERPER_API_KEYS`	Serper keys	Optional
`SERPAPI_API_KEYS`	SerpAPI keys	Optional
`MODELS`	Override default model list	Optional
`DB_PATH`	SQLite database path (default: hero.db)	Optional
`LOG_LEVEL`	Log level (default: info)	Optional
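
The defaults in the table could be applied roughly as follows. This is a sketch, not the contents of config.rs: `EnvSettings` and `load_settings` are hypothetical names, and treating an empty value as unset is an assumption. The lookup is passed in as a closure so that the logic can be tested without touching the process environment.

```rust
/// A small subset of the settings from the table. Illustrative only.
struct EnvSettings {
    db_path: String,
    log_level: String,
    brave_keys: Vec<String>,
    has_llm_key: bool,
}

/// Resolves settings through `lookup`, applying the documented defaults.
/// Assumption: empty or whitespace-only values count as unset.
fn load_settings(lookup: impl Fn(&str) -> Option<String>) -> EnvSettings {
    let non_empty = |name: &str| lookup(name).filter(|v| !v.trim().is_empty());

    // BRAVE_API_KEYS is documented as comma-separated.
    let brave_keys = non_empty("BRAVE_API_KEYS")
        .map(|raw| {
            raw.split(',')
                .map(str::trim)
                .filter(|k| !k.is_empty())
                .map(String::from)
                .collect()
        })
        .unwrap_or_default();

    // At least one LLM provider key is required.
    let has_llm_key = [
        "OPENROUTER_API_KEY",
        "GROQ_API_KEY",
        "SAMBANOVA_API_KEY",
        "DEEPINFRA_API_KEY",
    ]
    .iter()
    .any(|name| non_empty(name).is_some());

    EnvSettings {
        db_path: non_empty("DB_PATH").unwrap_or_else(|| "hero.db".to_string()),
        log_level: non_empty("LOG_LEVEL").unwrap_or_else(|| "info".to_string()),
        brave_keys,
        has_llm_key,
    }
}
```

Reading from the real environment is then `load_settings(|name| std::env::var(name).ok())`.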

Project Structure

hero_researcher/
    Cargo.toml                      # Workspace root
    crates/
        hero_researcher_lib/        # Core library
            src/
                lib.rs              # Re-exports
                types.rs            # Person, ResearchResult, etc.
                error.rs            # Error types
                config.rs           # Configuration
                researcher.rs       # Main research pipeline
                prompts.rs          # LLM prompt templates
                formatter.rs        # Report formatters
                audit.rs            # Audit logging
                logger.rs           # Tracing setup
                search/             # 6 search providers + aggregator
                scraper/            # Web scraper + 8 platform scrapers
            examples/               # Runnable examples (search, scrape, LLM, pipeline)
        hero_researcher/            # CLI binary
        hero_researcher_server/     # Web server + SQLite

Examples

13 runnable examples in crates/hero_researcher_lib/examples/. Run via cargo:

cargo run -p hero_researcher_lib --example search
cargo run -p hero_researcher_lib --example scrape
cargo run -p hero_researcher_lib --example formatter

The following examples require an LLM API key:

cargo run -p hero_researcher_lib --example ai_client
cargo run -p hero_researcher_lib --example llm_call
cargo run -p hero_researcher_lib --example search_and_summarize
cargo run -p hero_researcher_lib --example research_quick
cargo run -p hero_researcher_lib --example researcher_stream
cargo run -p hero_researcher_lib --example online_search
cargo run -p hero_researcher_lib --example custom_models
cargo run -p hero_researcher_lib --example beast_mode
cargo run -p hero_researcher_lib --example audit_log
cargo run -p hero_researcher_lib --example seed_urls

Development

cargo check --workspace # Type check
cargo test --workspace # Run tests
cargo clippy --workspace # Clippy lints
cargo fmt --all # Format code
cargo build --workspace --release # Build release

API Reference

`AiClient`

use herolib_ai::{AiClient, Model, PromptBuilder, Provider};

let client = AiClient::from_env();

// Using a built-in model
let response = PromptBuilder::new(&client)
    .model(Model::Llama3_1_8B)
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;
println!("{}", response.content().unwrap_or("(empty)"));

// Using any model by raw ID (e.g. any OpenRouter model)
let response = PromptBuilder::new(&client)
    .raw_model(Provider::OpenRouter, "anthropic/claude-sonnet-4.5")
    .system("You are a researcher.")
    .user("Search for...")
    .execute()?;

`Researcher`

use hero_researcher_lib::config::Config;
use hero_researcher_lib::researcher::Researcher;
use hero_researcher_lib::scraper::Scraper;
use hero_researcher_lib::search::factory::build_search_provider;
use hero_researcher_lib::types::*;
use herolib_ai::AiClient;

let config = Config::from_env();
let ai_client = AiClient::from_env();

let researcher = Researcher::new(ai_client, &config)
    .with_tier(ResearchTier::Quick)
    .with_concurrency(2)
    .with_search_provider(build_search_provider(&config))
    .with_scraper(Scraper::new())
    .with_on_result(Box::new(|result| {
        println!("{}: {}", result.model, result.success);
    }));

let person = Person {
    name: "John Doe".into(),
    location: Some("New York".into()),
    aliases: vec!["J. Doe".into()],
    context: None,
};

let results = researcher.research(&person)?;

`format_report`

use hero_researcher_lib::formatter::format_report;
use hero_researcher_lib::types::*;

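// `results` is the output of `Researcher::research`; `metadata` is the report
// metadata value that `format_report` expects (its construction is not shown here).
let report = format_report("John Doe", &results, OutputFormat::Markdown, &metadata);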
println!("{}", report);

Disclaimer

This tool uses AI models to search for publicly available information.

Results may be inaccurate, outdated, or hallucinated
This is NOT an official background check
Always verify information from official sources
Respect privacy and applicable laws

License

Apache-2.0