HeroDB Examples
This directory contains examples demonstrating HeroDB's capabilities, including full-text search powered by Tantivy and vector database operations using Lance.
Available Examples
- Tantivy Search Demo - Full-text search capabilities
- Lance Vector Database Demo - Vector database and AI operations
- AGE Encryption Demo - Cryptographic operations
- Simple Demo - Basic Redis operations
Lance Vector Database Demo (Bash Script)
Overview
The lance_vector_demo.sh script provides a comprehensive demonstration of HeroDB's vector database capabilities using Lance. It showcases vector storage, similarity search, multimodal data handling, and AI-powered operations with external embedding services.
Prerequisites
- HeroDB Server: The server must be running (default port 6379)
- Redis CLI: The redis-cli tool must be installed and available in your PATH
- Embedding Service (optional): For full functionality, set up an external embedding service
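The prerequisites above can be checked from a script before the demo is launched. The following is a minimal sketch, not part of HeroDB: check_prereqs is a hypothetical helper, and it assumes the server answers the standard Redis PING command with PONG.

```shell
#!/usr/bin/env bash
# Preflight sketch: verify the demo's prerequisites.
# check_prereqs is a hypothetical helper, not shipped with HeroDB.

check_prereqs() {
  local cli="${1:-redis-cli}" port="${2:-6379}"

  # The client binary must be available in PATH.
  if ! command -v "$cli" >/dev/null 2>&1; then
    echo "missing: $cli not found in PATH" >&2
    return 1
  fi

  # Assumption: HeroDB replies PONG to the standard PING command.
  if [ "$("$cli" -p "$port" PING 2>/dev/null)" != "PONG" ]; then
    echo "unreachable: no PONG from port $port" >&2
    return 2
  fi

  echo "ok"
}

# Usage (needs a running server):
#   check_prereqs redis-cli 6379 && ./examples/lance_vector_demo.sh
```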
Running the Demo
Step 1: Start HeroDB Server
# From the project root directory
cargo run -- --dir ./test_data --port 6379
Step 2: Run the Demo (in a new terminal)
# From the project root directory
./examples/lance_vector_demo.sh
What the Demo Covers
The script demonstrates comprehensive vector database operations:
- Dataset Management
  - Creating vector datasets with custom dimensions
  - Defining schemas with metadata fields
  - Listing and inspecting datasets
  - Dataset information and statistics
- Embedding Operations
  - Text embedding generation via external services
  - Multimodal embedding support (text + images)
  - Batch embedding operations
- Data Storage
  - Storing text documents with automatic embedding
  - Storing images with metadata
  - Multimodal content storage
  - Rich metadata support
- Vector Search
  - Similarity search with raw vectors
  - Text-based semantic search
  - Configurable search parameters (K, NPROBES, REFINE)
  - Cross-modal search capabilities
- Index Management
  - Creating IVF_PQ indexes for performance
  - Custom index parameters
  - Performance optimization
- Advanced Features
  - Error handling and recovery
  - Performance testing concepts
  - Monitoring and maintenance
  - Cleanup operations
Key Lance Commands Demonstrated
Dataset Management
# Create vector dataset
LANCE CREATE documents DIM 384
# Create dataset with schema
LANCE CREATE products DIM 768 SCHEMA category:string price:float available:bool
# List datasets
LANCE LIST
# Get dataset information
LANCE INFO documents
Data Operations
# Store text with metadata
LANCE STORE documents TEXT "Machine learning tutorial" category "education" author "John Doe"
# Store image with metadata
LANCE STORE images IMAGE "base64_encoded_image..." filename "photo.jpg" tags "nature,landscape"
# Store multimodal content
LANCE STORE content TEXT "Product description" IMAGE "base64_image..." type "product"
Search Operations
# Search with raw vector
LANCE SEARCH documents VECTOR "0.1,0.2,0.3,0.4" K 5
# Semantic text search
LANCE SEARCH.TEXT documents "artificial intelligence" K 10 NPROBES 20
# Generate embeddings
LANCE EMBED.TEXT "Hello world" "Machine learning"
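The raw-vector search above passes the query vector as a single comma-separated string. Here is a minimal sketch of building that argument in a shell script. vector_arg and has_dim are hypothetical helpers, and the idea that the component count must equal the dataset's DIM is an assumption this README does not confirm.

```shell
# vector_arg: join numeric components with commas, matching the
# format of the VECTOR argument in the example above.
vector_arg() {
  local IFS=,
  printf '%s' "$*"
}

# has_dim DIM COMPONENT...: succeed only when the number of components
# equals DIM (assumed requirement, not confirmed by the README).
has_dim() {
  local dim="$1"
  shift
  [ "$#" -eq "$dim" ]
}

# Usage (needs a running server):
#   redis-cli -p 6379 LANCE SEARCH documents VECTOR "$(vector_arg 0.1 0.2 0.3 0.4)" K 5
```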
Index Management
# Create performance index
LANCE CREATE.INDEX documents IVF_PQ PARTITIONS 256 SUBVECTORS 16
# Drop dataset
LANCE DROP old_dataset
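For the CREATE.INDEX example above, product quantization generally splits each vector into equally sized sub-vectors, so the dimension should be divisible by SUBVECTORS (384 / 16 = 24 in the example). The sketch below checks that before an index is created. pq_compatible is a hypothetical helper, and whether HeroDB enforces this constraint is an assumption.

```shell
# pq_compatible DIM SUBVECTORS: succeed when DIM splits evenly into
# SUBVECTORS chunks. Hypothetical helper; HeroDB may or may not
# enforce this itself.
pq_compatible() {
  [ "$2" -gt 0 ] && [ $(( $1 % $2 )) -eq 0 ]
}

# Usage (needs a running server):
#   pq_compatible 384 16 && \
#     redis-cli -p 6379 LANCE CREATE.INDEX documents IVF_PQ PARTITIONS 256 SUBVECTORS 16
```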
Configuration
Setting Up Embedding Service
# Configure embedding service URL
redis-cli HSET config:core:aiembed url "http://your-embedding-service:8080/embed"
# Optional: Set authentication token
redis-cli HSET config:core:aiembed token "your-api-token"
Embedding Service API
Your embedding service should accept POST requests:
{
  "texts": ["text1", "text2"],
  "images": ["base64_image1", "base64_image2"],
  "model": "your-model-name"
}
And return responses:
{
  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
  "model": "model-name",
  "usage": {"tokens": 100, "requests": 2}
}
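To try an embedding service by hand, a text-only request body in the shape above can be assembled in the shell. This is a sketch under stated assumptions: embed_request_body and json_escape are hypothetical helpers, the escaping covers only backslashes and double quotes, and it is assumed (not confirmed here) that the images field may be omitted and that the service expects a JSON Content-Type header.

```shell
# json_escape: escape backslashes and double quotes for a JSON string.
# Control characters are not handled; this is a sketch, not a full encoder.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# embed_request_body MODEL TEXT...: print a text-only request body.
embed_request_body() {
  local model="$1"
  shift
  local items="" text
  for text in "$@"; do
    # Append each text as a quoted JSON string, comma-separated.
    items="${items:+$items, }\"$(json_escape "$text")\""
  done
  printf '{"texts": [%s], "model": "%s"}' "$items" "$(json_escape "$model")"
}

# Usage (needs a reachable service; the URL is the placeholder from above):
#   curl -s -X POST -H 'Content-Type: application/json' \
#     -d "$(embed_request_body your-model-name "text1" "text2")" \
#     http://your-embedding-service:8080/embed
```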
Interactive Features
The demo script includes:
- Colored output for better readability
- Step-by-step execution with explanations
- Error handling demonstrations
- Automatic cleanup options
- Performance testing concepts
- Real-world usage examples
Use Cases Demonstrated
- Document Search System
  - Semantic document retrieval
  - Metadata filtering
  - Relevance ranking
- Image Similarity Search
  - Visual content matching
  - Tag-based filtering
  - Multimodal queries
- Product Recommendations
  - Feature-based similarity
  - Category filtering
  - Price range queries
- Content Management
  - Mixed media storage
  - Cross-modal search
  - Rich metadata support
Tantivy Search Demo (Bash Script)
Overview
The tantivy_search_demo.sh script provides a comprehensive demonstration of HeroDB's search functionality using Redis commands. It showcases various search scenarios including basic text search, filtering, sorting, geographic queries, and more.
Prerequisites
- HeroDB Server: The server must be running on port 6381
- Redis CLI: The redis-cli tool must be installed and available in your PATH
Running the Demo
Step 1: Start HeroDB Server
# From the project root directory
cargo run -- --port 6381
Step 2: Run the Demo (in a new terminal)
# From the project root directory
./examples/tantivy_search_demo.sh
What the Demo Covers
The script demonstrates 15 different search scenarios:
- Index Creation - Creating a search index with various field types
- Data Insertion - Adding sample products to the index
- Basic Text Search - Simple keyword searches
- Filtered Search - Combining text search with category filters
- Numeric Range Search - Finding products within price ranges
- Sorting Results - Ordering results by different fields
- Limited Results - Pagination and result limiting
- Complex Queries - Multi-field searches with sorting
- Geographic Search - Location-based queries
- Index Information - Getting statistics about the search index
- Search Comparison - Tantivy vs simple pattern matching
- Fuzzy Search - Typo tolerance and approximate matching
- Phrase Search - Exact phrase matching
- Boolean Queries - AND, OR, NOT operators
- Cleanup - Removing test data
Sample Data
The demo uses a product catalog with the following fields:
- title (TEXT) - Product name with higher search weight
- description (TEXT) - Detailed product description
- category (TAG) - Comma-separated categories
- price (NUMERIC) - Product price for range queries
- rating (NUMERIC) - Customer rating for sorting
- location (GEO) - Geographic coordinates for location searches
Key Redis Commands Demonstrated
Index Management
# Create search index
FT.CREATE product_catalog ON HASH PREFIX 1 product: SCHEMA title TEXT WEIGHT 2.0 SORTABLE description TEXT category TAG SEPARATOR , price NUMERIC SORTABLE rating NUMERIC SORTABLE location GEO
# Get index information
FT.INFO product_catalog
# Drop index
FT.DROPINDEX product_catalog
Search Queries
# Basic text search
FT.SEARCH product_catalog wireless
# Filtered search
FT.SEARCH product_catalog 'organic @category:{food}'
# Numeric range
FT.SEARCH product_catalog '@price:[50 150]'
# Sorted results
FT.SEARCH product_catalog '@category:{electronics}' SORTBY price ASC
# Geographic search
FT.SEARCH product_catalog '@location:[37.7749 -122.4194 50 km]'
# Boolean queries
FT.SEARCH product_catalog 'wireless AND audio'
FT.SEARCH product_catalog 'coffee OR tea'
# Phrase search
FT.SEARCH product_catalog '"noise canceling"'
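When queries like the ones above are generated from a script, the filter clauses can be built with small helpers instead of hand-quoted strings. The sketch below uses two hypothetical helpers, range_query and tag_query. Joining two filter clauses with a space follows the 'organic @category:{food}' pattern above, but is otherwise an assumption about how HeroDB combines clauses.

```shell
# range_query FIELD MIN MAX  ->  @FIELD:[MIN MAX]
range_query() {
  printf '@%s:[%s %s]' "$1" "$2" "$3"
}

# tag_query FIELD VALUE  ->  @FIELD:{VALUE}
tag_query() {
  printf '@%s:{%s}' "$1" "$2"
}

# Usage (needs a running server):
#   redis-cli -p 6381 FT.SEARCH product_catalog "$(range_query price 50 150)"
#   redis-cli -p 6381 FT.SEARCH product_catalog \
#     "$(tag_query category electronics) $(range_query price 50 150)" SORTBY price ASC
```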
Interactive Features
The demo script includes:
- Colored output for better readability
- Pause between steps to review results
- Error handling with clear error messages
- Automatic cleanup of test data
- Progress indicators showing what each step demonstrates
Troubleshooting
HeroDB Not Running
✗ HeroDB is not running on port 6381
ℹ Please start HeroDB with: cargo run -- --port 6381
Solution: Start the HeroDB server in a separate terminal.
Redis CLI Not Found
redis-cli: command not found
Solution: Install Redis tools or use an alternative Redis client.
Connection Refused
Could not connect to Redis at localhost:6381: Connection refused
Solution: Ensure HeroDB is running and listening on the correct port.
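When the server is started from a script rather than a separate terminal, "connection refused" often just means it has not finished starting. The following sketch polls until the server responds. wait_for_server is a hypothetical helper, and it assumes HeroDB answers the standard PING command with PONG.

```shell
# wait_for_server [PORT] [TRIES]: poll once per second until the
# server answers PING, or give up after TRIES attempts.
wait_for_server() {
  local port="${1:-6381}" tries="${2:-10}" i=0
  while [ "$i" -lt "$tries" ]; do
    # Assumption: HeroDB replies PONG to PING like a Redis server.
    if [ "$(redis-cli -p "$port" PING 2>/dev/null)" = "PONG" ]; then
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  return 1
}

# Usage:
#   cargo run -- --port 6381 &
#   wait_for_server 6381 30 && ./examples/tantivy_search_demo.sh
```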
Manual Testing
You can also run individual commands manually:
# Connect to HeroDB
redis-cli -h localhost -p 6381
# Create a simple index
FT.CREATE myindex ON HASH SCHEMA title TEXT description TEXT
# Add a document
HSET doc:1 title "Hello World" description "This is a test document"
# Search
FT.SEARCH myindex hello
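The same session can be run non-interactively, since redis-cli reads commands from standard input when input is piped to it. A minimal sketch follows, where manual_test_commands is a hypothetical helper that prints the three commands above:

```shell
# manual_test_commands: print the manual test commands, one per line,
# in a form that can be piped into redis-cli.
manual_test_commands() {
  cat <<'EOF'
FT.CREATE myindex ON HASH SCHEMA title TEXT description TEXT
HSET doc:1 title "Hello World" description "This is a test document"
FT.SEARCH myindex hello
EOF
}

# Usage (needs a running server):
#   manual_test_commands | redis-cli -h localhost -p 6381
```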
Performance Notes
- Indexing: Documents are indexed in real-time as they're added
- Search Speed: Full-text search looks terms up in an inverted index instead of scanning every key, so it is much faster than pattern matching on large datasets
- Memory Usage: Tantivy indexes are memory-efficient and disk-backed
- Scalability: Supports millions of documents with sub-second search times
Advanced Features
The demo showcases advanced Tantivy features:
- Relevance Scoring - Results ranked by relevance
- Fuzzy Matching - Handles typos and approximate matches
- Field Weighting - Title field has higher search weight
- Multi-field Search - Search across multiple fields simultaneously
- Geographic Queries - Distance-based location searches
- Numeric Ranges - Efficient range queries on numeric fields
- Tag Filtering - Fast categorical filtering
Next Steps
After running the demo, explore:
- Custom Schemas - Define your own field types and configurations
- Large Datasets - Test with thousands or millions of documents
- Real Applications - Integrate search into your applications
- Performance Tuning - Optimize for your specific use case
For more information, see the search documentation.