Compare commits

9 commits: vector...management

bdf363016a
8798bc202e
9fa9832605
4bb24b38dd
f3da14b957
5ea34b4445
d9a3b711d1
d931770e90
a87ec4dbb5
Cargo.lock (generated): 5301 lines changed. Diff suppressed because it is too large.
Cargo.toml: 17 lines changed
@@ -1,8 +1,8 @@
 [package]
 name = "herodb"
 version = "0.0.1"
-authors = ["Pin Fang <fpfangpin@hotmail.com>"]
+authors = ["ThreeFold Tech NV"]
-edition = "2021"
+edition = "2024"
 
 [dependencies]
 anyhow = "1.0.59"
@@ -24,18 +24,7 @@ age = "0.10"
 secrecy = "0.8"
 ed25519-dalek = "2"
 base64 = "0.22"
-# Lance vector database dependencies
-lance = "0.33"
-lance-index = "0.33"
-lance-linalg = "0.33"
-# Use Arrow version compatible with Lance 0.33
-arrow = "55.2"
-arrow-array = "55.2"
-arrow-schema = "55.2"
-parquet = "55.2"
-uuid = { version = "1.10", features = ["v4"] }
-reqwest = { version = "0.11", features = ["json"] }
-image = "0.25"
+jsonrpsee = { version = "0.26.0", features = ["http-client", "ws-client", "server", "macros"] }
 
 [dev-dependencies]
 redis = { version = "0.24", features = ["aio", "tokio-comp"] }
@@ -47,13 +47,13 @@ You can start HeroDB with different backends and encryption options:
 #### `redb` with Encryption
 
 ```bash
-./target/release/herodb --dir /tmp/herodb_encrypted --port 6379 --encrypt --key mysecretkey
+./target/release/herodb --dir /tmp/herodb_encrypted --port 6379 --encrypt --encryption_key mysecretkey
 ```
 
 #### `sled` with Encryption
 
 ```bash
-./target/release/herodb --dir /tmp/herodb_sled_encrypted --port 6379 --sled --encrypt --key mysecretkey
+./target/release/herodb --dir /tmp/herodb_sled_encrypted --port 6379 --sled --encrypt --encryption_key mysecretkey
 ```
 
 ## Usage with Redis Clients
@@ -1,454 +0,0 @@
# Lance Vector Database Operations

HeroDB includes a powerful vector database integration using Lance, enabling high-performance vector storage, search, and multimodal data management. By default, it uses Ollama for local text embeddings, with support for custom external embedding services.

## Overview

The Lance vector database integration provides:

- **High-performance vector storage** using Lance's columnar format
- **Local Ollama integration** for text embeddings (default, no external dependencies)
- **Custom embedding service support** for advanced use cases
- **Text embedding support** (images via custom services)
- **Vector similarity search** with configurable parameters
- **Scalable indexing** with IVF_PQ (Inverted File with Product Quantization)
- **Redis-compatible command interface**

## Architecture

```
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│     HeroDB      │    │     External     │    │      Lance      │
│  Redis Server   │◄──►│    Embedding     │    │   Vector Store  │
│                 │    │     Service      │    │                 │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                      │                       │
         │                      │                       │
   Redis Protocol           HTTP API             Arrow/Parquet
   Commands                 JSON Requests        Columnar Storage
```

### Key Components

1. **Lance Store**: High-performance columnar vector storage
2. **Ollama Integration**: Local embedding service (default)
3. **Custom Embedding Service**: Optional HTTP API for advanced use cases
4. **Redis Command Interface**: Familiar Redis-style commands
5. **Arrow Schema**: Flexible schema definition for metadata

## Configuration

### Default Setup (Ollama)

HeroDB uses Ollama by default for text embeddings. No configuration is required if Ollama is running locally:

```bash
# Install Ollama (if not already installed)
# Visit: https://ollama.ai

# Pull the embedding model
ollama pull nomic-embed-text

# Ollama automatically runs on localhost:11434
# HeroDB will use this by default
```

**Default Configuration:**

- **URL**: `http://localhost:11434`
- **Model**: `nomic-embed-text`
- **Dimensions**: 768 (for nomic-embed-text)
### Custom Embedding Service (Optional)

To use a custom embedding service instead of Ollama:

```bash
# Set custom embedding service URL
redis-cli HSET config:core:aiembed url "http://your-embedding-service:8080/embed"

# Optional: Set authentication if required
redis-cli HSET config:core:aiembed token "your-api-token"
```

### Embedding Service API Contracts

#### Ollama API (Default)

HeroDB calls Ollama using this format:

```bash
POST http://localhost:11434/api/embeddings
Content-Type: application/json

{
  "model": "nomic-embed-text",
  "prompt": "Your text to embed"
}
```

Response:

```json
{
  "embedding": [0.1, 0.2, 0.3, ...]
}
```
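Before relying on the default integration, it is worth confirming the endpoint responds. A quick check from the shell (a minimal sketch, assuming Ollama is running locally and `nomic-embed-text` has been pulled):

```bash
# Should return a JSON object with a 768-element "embedding" array
curl -s http://localhost:11434/api/embeddings \
  -H 'Content-Type: application/json' \
  -d '{"model": "nomic-embed-text", "prompt": "Your text to embed"}'
```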
#### Custom Service API

Your custom embedding service should accept POST requests with this JSON format:

```json
{
  "texts": ["text1", "text2"],                  // Optional: array of texts
  "images": ["base64_image1", "base64_image2"], // Optional: base64 encoded images
  "model": "your-model-name"                    // Optional: model specification
}
```

And return responses in this format:

```json
{
  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]], // Array of embedding vectors
  "model": "model-name",                            // Model used
  "usage": {                                        // Optional usage stats
    "tokens": 100,
    "requests": 2
  }
}
```
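To sanity-check that a custom service honors this contract before pointing HeroDB at it, a request like the following should return the `embeddings` array (a sketch; the host, port, and token are placeholders for your own deployment):

```bash
# The Authorization header is only needed if your service requires a token;
# the exact auth scheme depends on your service and is assumed here
curl -s http://your-embedding-service:8080/embed \
  -H 'Content-Type: application/json' \
  -H 'Authorization: Bearer your-api-token' \
  -d '{"texts": ["text1", "text2"], "model": "your-model-name"}'
```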
## Commands Reference

### Dataset Management

#### LANCE CREATE

Create a new vector dataset with specified dimensions and optional schema.

```bash
LANCE CREATE <dataset> DIM <dimension> [SCHEMA field:type ...]
```

**Parameters:**

- `dataset`: Name of the dataset
- `dimension`: Vector dimension (e.g., 384, 768, 1536)
- `field:type`: Optional metadata fields (string, int, float, bool)

**Examples:**

```bash
# Create a simple dataset for 384-dimensional vectors
LANCE CREATE documents DIM 384

# Create a dataset with a metadata schema
LANCE CREATE products DIM 768 SCHEMA category:string price:float available:bool
```

#### LANCE LIST

List all available datasets.

```bash
LANCE LIST
```

**Returns:** Array of dataset names

#### LANCE INFO

Get information about a specific dataset.

```bash
LANCE INFO <dataset>
```

**Returns:** Dataset metadata including name, version, row count, and schema

#### LANCE DROP

Delete a dataset and all its data.

```bash
LANCE DROP <dataset>
```

### Data Operations

#### LANCE STORE

Store multimodal data (text/images) with automatic embedding generation.

```bash
LANCE STORE <dataset> [TEXT <text>] [IMAGE <base64>] [key value ...]
```

**Parameters:**

- `dataset`: Target dataset name
- `TEXT`: Text content to embed
- `IMAGE`: Base64-encoded image to embed
- `key value`: Metadata key-value pairs

**Examples:**

```bash
# Store text with metadata
LANCE STORE documents TEXT "Machine learning is transforming industries" category "AI" author "John Doe"

# Store image with metadata
LANCE STORE images IMAGE "iVBORw0KGgoAAAANSUhEUgAA..." category "nature" tags "landscape,mountains"

# Store both text and image
LANCE STORE multimodal TEXT "Beautiful sunset" IMAGE "base64data..." location "California"
```

**Returns:** Unique ID of the stored item
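Because the returned ID is the only handle on the stored row, scripts typically capture it immediately (a minimal sketch using `redis-cli`, assuming a running HeroDB on the default port and that the ID comes back as a plain bulk string):

```bash
# Capture the unique ID returned by LANCE STORE for later reference
id=$(redis-cli LANCE STORE documents TEXT "Vector search primer" category "AI")
echo "stored item: $id"
```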
### Search Operations

#### LANCE SEARCH

Search using a raw vector.

```bash
LANCE SEARCH <dataset> VECTOR <vector> K <k> [NPROBES <n>] [REFINE <r>]
```

**Parameters:**

- `dataset`: Dataset to search
- `vector`: Comma-separated vector values (e.g., "0.1,0.2,0.3")
- `k`: Number of results to return
- `NPROBES`: Number of partitions to search (optional)
- `REFINE`: Refine factor for better accuracy (optional)

**Example:**

```bash
LANCE SEARCH documents VECTOR "0.1,0.2,0.3,0.4" K 5 NPROBES 10
```

#### LANCE SEARCH.TEXT

Search using a text query (automatically embedded).

```bash
LANCE SEARCH.TEXT <dataset> <query_text> K <k> [NPROBES <n>] [REFINE <r>]
```

**Parameters:**

- `dataset`: Dataset to search
- `query_text`: Text query to search for
- `k`: Number of results to return
- `NPROBES`: Number of partitions to search (optional)
- `REFINE`: Refine factor for better accuracy (optional)

**Example:**

```bash
LANCE SEARCH.TEXT documents "artificial intelligence applications" K 10 NPROBES 20
```

**Returns:** Array of results with distance scores and metadata
### Embedding Operations

#### LANCE EMBED.TEXT

Generate embeddings for text without storing.

```bash
LANCE EMBED.TEXT <text1> [text2] [text3] ...
```

**Example:**

```bash
LANCE EMBED.TEXT "Hello world" "Machine learning" "Vector database"
```

**Returns:** Array of embedding vectors

### Index Management

#### LANCE CREATE.INDEX

Create a vector index for faster search performance.

```bash
LANCE CREATE.INDEX <dataset> <index_type> [PARTITIONS <n>] [SUBVECTORS <n>]
```

**Parameters:**

- `dataset`: Dataset to index
- `index_type`: Index type (currently supports "IVF_PQ")
- `PARTITIONS`: Number of partitions (default: 256)
- `SUBVECTORS`: Number of sub-vectors for PQ (default: 16)

**Example:**

```bash
LANCE CREATE.INDEX documents IVF_PQ PARTITIONS 512 SUBVECTORS 32
```
## Usage Patterns

### 1. Document Search System

```bash
# Setup
LANCE CREATE documents DIM 384 SCHEMA title:string content:string category:string

# Store documents
LANCE STORE documents TEXT "Introduction to machine learning algorithms" title "ML Basics" category "education"
LANCE STORE documents TEXT "Deep learning neural networks explained" title "Deep Learning" category "education"
LANCE STORE documents TEXT "Building scalable web applications" title "Web Dev" category "programming"

# Create index for better performance
LANCE CREATE.INDEX documents IVF_PQ PARTITIONS 256

# Search
LANCE SEARCH.TEXT documents "neural networks" K 5
```

### 2. Image Similarity Search

```bash
# Setup
LANCE CREATE images DIM 512 SCHEMA filename:string tags:string

# Store images (base64 encoded)
LANCE STORE images IMAGE "iVBORw0KGgoAAAANSUhEUgAA..." filename "sunset.jpg" tags "nature,landscape"
LANCE STORE images IMAGE "iVBORw0KGgoAAAANSUhEUgBB..." filename "city.jpg" tags "urban,architecture"

# Search by image
LANCE STORE temp_search IMAGE "query_image_base64..."
# Then use the returned ID to get embedding and search
```

### 3. Multimodal Content Management

```bash
# Setup
LANCE CREATE content DIM 768 SCHEMA type:string source:string

# Store mixed content
LANCE STORE content TEXT "Product description for smartphone" type "product" source "catalog"
LANCE STORE content IMAGE "product_image_base64..." type "product_image" source "catalog"

# Search across all content types
LANCE SEARCH.TEXT content "smartphone features" K 10
```
## Performance Considerations

### Vector Dimensions

- **384**: Good for general text (e.g., sentence-transformers)
- **768**: Standard for BERT-like models
- **1536**: OpenAI text-embedding-ada-002
- **Higher dimensions**: Better accuracy but slower search
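In practice, the `DIM` argument of `LANCE CREATE` must match whichever model produces the vectors. For example, matching `DIM` to the model sizes listed above (dataset names here are illustrative):

```bash
LANCE CREATE st_docs DIM 384      # sentence-transformers MiniLM-style models
LANCE CREATE ollama_docs DIM 768  # nomic-embed-text (the Ollama default above)
LANCE CREATE openai_docs DIM 1536 # OpenAI text-embedding-ada-002
```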
### Index Configuration

- **More partitions**: Better for larger datasets (>100K vectors)
- **More sub-vectors**: Better compression but slower search
- **NPROBES**: Higher values mean better accuracy but slower search (see the timing sketch below)
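A simple way to observe the NPROBES trade-off is to time the same query at different settings (assuming a populated, indexed dataset):

```bash
# Latency should grow with NPROBES while recall improves
time redis-cli LANCE SEARCH.TEXT documents "neural networks" K 10
time redis-cli LANCE SEARCH.TEXT documents "neural networks" K 10 NPROBES 20
time redis-cli LANCE SEARCH.TEXT documents "neural networks" K 10 NPROBES 50
```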
### Best Practices

1. **Create indexes** for datasets with >1000 vectors
2. **Use appropriate dimensions** based on your embedding model
3. **Configure NPROBES** based on accuracy vs. speed requirements
4. **Batch operations** when possible for better performance (see the sketch below)
5. **Monitor embedding service** response times and rate limits
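Since the command interface is plain RESP, batching can be as simple as looping stored text through `redis-cli` (a minimal sketch, assuming one document per line in a local `docs.txt`; the file name and metadata are hypothetical):

```bash
# Hypothetical bulk import: one LANCE STORE call per line of docs.txt
while IFS= read -r line; do
  redis-cli LANCE STORE documents TEXT "$line" source "bulk_import"
done < docs.txt
```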
## Error Handling

Common error scenarios and solutions:

### Embedding Service Errors

```bash
# Error: Embedding service not configured
ERR Embedding service URL not configured. Set it with: HSET config:core:aiembed url <YOUR_EMBEDDING_SERVICE_URL>

# Error: Service unavailable
ERR Embedding service returned error 404 Not Found
```

**Solution:** Ensure the embedding service is running and the URL is correct.

### Dataset Errors

```bash
# Error: Dataset doesn't exist
ERR Dataset 'mydata' does not exist

# Error: Dimension mismatch
ERR Vector dimension mismatch: expected 384, got 768
```

**Solution:** Create the dataset first, or check the vector dimensions.

### Search Errors

```bash
# Error: Invalid vector format
ERR Invalid vector format

# Error: No index available
ERR No index available for fast search
```

**Solution:** Check the vector format or create an index.
## Integration Examples

### With Python

```python
import redis

r = redis.Redis(host='localhost', port=6379)

# Create dataset
r.execute_command('LANCE', 'CREATE', 'docs', 'DIM', '384')

# Store document
result = r.execute_command('LANCE', 'STORE', 'docs',
                           'TEXT', 'Machine learning tutorial',
                           'category', 'education')
print(f"Stored with ID: {result}")

# Search
results = r.execute_command('LANCE', 'SEARCH.TEXT', 'docs',
                            'machine learning', 'K', '5')
print(f"Search results: {results}")
```

### With Node.js

```javascript
const redis = require('redis');

async function main() {
  const client = redis.createClient();
  await client.connect(); // node-redis v4+: connect explicitly before sending commands

  // Create dataset
  await client.sendCommand(['LANCE', 'CREATE', 'docs', 'DIM', '384']);

  // Store document
  const id = await client.sendCommand(['LANCE', 'STORE', 'docs',
    'TEXT', 'Deep learning guide',
    'category', 'AI']);

  // Search
  const results = await client.sendCommand(['LANCE', 'SEARCH.TEXT', 'docs',
    'deep learning', 'K', '10']);
}

main();
```
## Monitoring and Maintenance

### Health Checks

```bash
# Check if the Lance store is available
LANCE LIST

# Check dataset health
LANCE INFO mydataset

# Test the embedding service
LANCE EMBED.TEXT "test"
```

### Maintenance Operations

```bash
# Backup: use standard Redis backup procedures;
# the Lance data is stored separately in the data directory

# Cleanup: remove unused datasets
LANCE DROP old_dataset

# Reindex: drop and recreate the dataset if needed
LANCE DROP dataset_name
LANCE CREATE dataset_name DIM 384
# Re-import data, then rebuild the index
LANCE CREATE.INDEX dataset_name IVF_PQ
```

This integration provides a powerful foundation for building AI-powered applications with vector search capabilities while maintaining the familiar Redis interface.
@@ -1,191 +1,6 @@
-# HeroDB Examples
+# HeroDB Tantivy Search Examples
 
-This directory contains examples demonstrating HeroDB's capabilities including full-text search powered by Tantivy and vector database operations using Lance.
+This directory contains examples demonstrating HeroDB's full-text search capabilities powered by Tantivy.
 
-## Available Examples
-
-1. **[Tantivy Search Demo](#tantivy-search-demo-bash-script)** - Full-text search capabilities
-2. **[Lance Vector Database Demo](#lance-vector-database-demo-bash-script)** - Vector database and AI operations
-3. **[AGE Encryption Demo](age_bash_demo.sh)** - Cryptographic operations
-4. **[Simple Demo](simple_demo.sh)** - Basic Redis operations
-
----
-
-## Lance Vector Database Demo (Bash Script)
-
-### Overview
-The `lance_vector_demo.sh` script provides a comprehensive demonstration of HeroDB's vector database capabilities using Lance. It showcases vector storage, similarity search, multimodal data handling, and AI-powered operations with external embedding services.
-
-### Prerequisites
-1. **HeroDB Server**: The server must be running (default port 6379)
-2. **Redis CLI**: The `redis-cli` tool must be installed and available in your PATH
-3. **Embedding Service** (optional): For full functionality, set up an external embedding service
-
-### Running the Demo
-
-#### Step 1: Start HeroDB Server
-```bash
-# From the project root directory
-cargo run -- --dir ./test_data --port 6379
-```
-
-#### Step 2: Run the Demo (in a new terminal)
-```bash
-# From the project root directory
-./examples/lance_vector_demo.sh
-```
-
-### What the Demo Covers
-
-The script demonstrates comprehensive vector database operations:
-
-1. **Dataset Management**
-   - Creating vector datasets with custom dimensions
-   - Defining schemas with metadata fields
-   - Listing and inspecting datasets
-   - Dataset information and statistics
-
-2. **Embedding Operations**
-   - Text embedding generation via external services
-   - Multimodal embedding support (text + images)
-   - Batch embedding operations
-
-3. **Data Storage**
-   - Storing text documents with automatic embedding
-   - Storing images with metadata
-   - Multimodal content storage
-   - Rich metadata support
-
-4. **Vector Search**
-   - Similarity search with raw vectors
-   - Text-based semantic search
-   - Configurable search parameters (K, NPROBES, REFINE)
-   - Cross-modal search capabilities
-
-5. **Index Management**
-   - Creating IVF_PQ indexes for performance
-   - Custom index parameters
-   - Performance optimization
-
-6. **Advanced Features**
-   - Error handling and recovery
-   - Performance testing concepts
-   - Monitoring and maintenance
-   - Cleanup operations
-
-### Key Lance Commands Demonstrated
-
-#### Dataset Management
-```bash
-# Create vector dataset
-LANCE CREATE documents DIM 384
-
-# Create dataset with schema
-LANCE CREATE products DIM 768 SCHEMA category:string price:float available:bool
-
-# List datasets
-LANCE LIST
-
-# Get dataset information
-LANCE INFO documents
-```
-
-#### Data Operations
-```bash
-# Store text with metadata
-LANCE STORE documents TEXT "Machine learning tutorial" category "education" author "John Doe"
-
-# Store image with metadata
-LANCE STORE images IMAGE "base64_encoded_image..." filename "photo.jpg" tags "nature,landscape"
-
-# Store multimodal content
-LANCE STORE content TEXT "Product description" IMAGE "base64_image..." type "product"
-```
-
-#### Search Operations
-```bash
-# Search with raw vector
-LANCE SEARCH documents VECTOR "0.1,0.2,0.3,0.4" K 5
-
-# Semantic text search
-LANCE SEARCH.TEXT documents "artificial intelligence" K 10 NPROBES 20
-
-# Generate embeddings
-LANCE EMBED.TEXT "Hello world" "Machine learning"
-```
-
-#### Index Management
-```bash
-# Create performance index
-LANCE CREATE.INDEX documents IVF_PQ PARTITIONS 256 SUBVECTORS 16
-
-# Drop dataset
-LANCE DROP old_dataset
-```
-
-### Configuration
-
-#### Setting Up Embedding Service
-```bash
-# Configure embedding service URL
-redis-cli HSET config:core:aiembed url "http://your-embedding-service:8080/embed"
-
-# Optional: Set authentication token
-redis-cli HSET config:core:aiembed token "your-api-token"
-```
-
-#### Embedding Service API
-Your embedding service should accept POST requests:
-```json
-{
-  "texts": ["text1", "text2"],
-  "images": ["base64_image1", "base64_image2"],
-  "model": "your-model-name"
-}
-```
-
-And return responses:
-```json
-{
-  "embeddings": [[0.1, 0.2, ...], [0.3, 0.4, ...]],
-  "model": "model-name",
-  "usage": {"tokens": 100, "requests": 2}
-}
-```
-
-### Interactive Features
-
-The demo script includes:
-- **Colored output** for better readability
-- **Step-by-step execution** with explanations
-- **Error handling** demonstrations
-- **Automatic cleanup** options
-- **Performance testing** concepts
-- **Real-world usage** examples
-
-### Use Cases Demonstrated
-
-1. **Document Search System**
-   - Semantic document retrieval
-   - Metadata filtering
-   - Relevance ranking
-
-2. **Image Similarity Search**
-   - Visual content matching
-   - Tag-based filtering
-   - Multimodal queries
-
-3. **Product Recommendations**
-   - Feature-based similarity
-   - Category filtering
-   - Price range queries
-
-4. **Content Management**
-   - Mixed media storage
-   - Cross-modal search
-   - Rich metadata support
-
----
 
 ## Tantivy Search Demo (Bash Script)
@@ -1,426 +0,0 @@
#!/bin/bash

# Lance Vector Database Demo Script
# This script demonstrates all Lance vector database operations in HeroDB

set -e  # Exit on any error

# Configuration
REDIS_HOST="localhost"
REDIS_PORT="6379"
REDIS_CLI="redis-cli -h $REDIS_HOST -p $REDIS_PORT"

# Colors for output
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color

# Helper functions
log_info() {
    echo -e "${BLUE}[INFO]${NC} $1"
}

log_success() {
    echo -e "${GREEN}[SUCCESS]${NC} $1"
}

log_warning() {
    echo -e "${YELLOW}[WARNING]${NC} $1"
}

log_error() {
    echo -e "${RED}[ERROR]${NC} $1"
}

execute_command() {
    local cmd="$1"
    local description="$2"

    echo
    log_info "Executing: $description"
    echo "Command: $cmd"

    # eval (rather than bare word splitting) so that quoted arguments
    # inside $cmd survive intact
    if result=$(eval "$cmd" 2>&1); then
        log_success "Result: $result"
    else
        log_error "Failed: $result"
        return 1
    fi
}

# Check if HeroDB is running
check_herodb() {
    log_info "Checking if HeroDB is running..."
    if ! $REDIS_CLI ping > /dev/null 2>&1; then
        log_error "HeroDB is not running. Please start it first:"
        echo "  cargo run -- --dir ./test_data --port $REDIS_PORT"
        exit 1
    fi
    log_success "HeroDB is running"
}
# Setup embedding service configuration
setup_embedding_service() {
    log_info "Setting up embedding service configuration..."

    # Note: This is a mock URL for demonstration.
    # In production, replace it with your actual embedding service.
    execute_command \
        "$REDIS_CLI HSET config:core:aiembed url 'http://localhost:8080/embed'" \
        "Configure embedding service URL"

    # Optional: Set authentication token
    # execute_command \
    #     "$REDIS_CLI HSET config:core:aiembed token 'your-api-token'" \
    #     "Configure embedding service token"

    log_warning "Note: Embedding service at http://localhost:8080/embed is not running."
    log_warning "Some operations will fail, but this demonstrates the command structure."
}

# Dataset Management Operations
demo_dataset_management() {
    echo
    echo "=========================================="
    echo "        DATASET MANAGEMENT DEMO"
    echo "=========================================="

    # List datasets (should be empty initially)
    execute_command \
        "$REDIS_CLI LANCE LIST" \
        "List all datasets (initially empty)"

    # Create a simple dataset
    execute_command \
        "$REDIS_CLI LANCE CREATE documents DIM 384" \
        "Create a simple document dataset with 384 dimensions"

    # Create a dataset with schema
    execute_command \
        "$REDIS_CLI LANCE CREATE products DIM 768 SCHEMA category:string price:float available:bool description:string" \
        "Create products dataset with custom schema"

    # Create an image dataset
    execute_command \
        "$REDIS_CLI LANCE CREATE images DIM 512 SCHEMA filename:string tags:string width:int height:int" \
        "Create images dataset for multimodal content"

    # List datasets again
    execute_command \
        "$REDIS_CLI LANCE LIST" \
        "List all datasets (should show 3 datasets)"

    # Get info about datasets
    execute_command \
        "$REDIS_CLI LANCE INFO documents" \
        "Get information about documents dataset"

    execute_command \
        "$REDIS_CLI LANCE INFO products" \
        "Get information about products dataset"
}

# Embedding Operations
demo_embedding_operations() {
    echo
    echo "=========================================="
    echo "        EMBEDDING OPERATIONS DEMO"
    echo "=========================================="

    log_warning "The following operations will fail because no embedding service is running."
    log_warning "This demonstrates the command structure and error handling."

    # Try to embed text (will fail without embedding service)
    execute_command \
        "$REDIS_CLI LANCE EMBED.TEXT 'Hello world'" \
        "Generate embedding for single text" || true

    # Try to embed multiple texts
    execute_command \
        "$REDIS_CLI LANCE EMBED.TEXT 'Machine learning' 'Artificial intelligence' 'Deep learning'" \
        "Generate embeddings for multiple texts" || true
}

# Data Storage Operations
demo_data_storage() {
    echo
    echo "=========================================="
    echo "           DATA STORAGE DEMO"
    echo "=========================================="

    log_warning "Storage operations will fail without embedding service, but show command structure."

    # Store text documents
    execute_command \
        "$REDIS_CLI LANCE STORE documents TEXT 'Introduction to machine learning algorithms and their applications in modern AI systems' category 'education' author 'John Doe' difficulty 'beginner'" \
        "Store a document with text and metadata" || true

    execute_command \
        "$REDIS_CLI LANCE STORE documents TEXT 'Deep learning neural networks for computer vision tasks' category 'research' author 'Jane Smith' difficulty 'advanced'" \
        "Store another document" || true

    # Store product information
    execute_command \
        "$REDIS_CLI LANCE STORE products TEXT 'High-performance laptop with 16GB RAM and SSD storage' category 'electronics' price '1299.99' available 'true'" \
        "Store product with text description" || true

    # Store image with metadata (using placeholder base64)
    execute_command \
        "$REDIS_CLI LANCE STORE images IMAGE 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8/5+hHgAHggJ/PchI7wAAAABJRU5ErkJggg==' filename 'sample.png' tags 'test,demo' width '1' height '1'" \
        "Store image with metadata (1x1 pixel PNG)" || true

    # Store multimodal content
    execute_command \
        "$REDIS_CLI LANCE STORE images TEXT 'Beautiful sunset over mountains' IMAGE 'iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8/5+hHgAHggJ/PchI7wAAAABJRU5ErkJggg==' filename 'sunset.png' tags 'nature,landscape' location 'California'" \
        "Store multimodal content (text + image)" || true
}

# Search Operations
demo_search_operations() {
    echo
    echo "=========================================="
    echo "         SEARCH OPERATIONS DEMO"
    echo "=========================================="

    log_warning "Search operations will fail without data, but show command structure."

    # Search with raw vector
    execute_command \
        "$REDIS_CLI LANCE SEARCH documents VECTOR '0.1,0.2,0.3,0.4,0.5' K 5" \
        "Search with raw vector (5 results)" || true

    # Search with vector and parameters
    execute_command \
        "$REDIS_CLI LANCE SEARCH documents VECTOR '0.1,0.2,0.3,0.4,0.5' K 10 NPROBES 20 REFINE 2" \
        "Search with vector and advanced parameters" || true

    # Text-based search
    execute_command \
        "$REDIS_CLI LANCE SEARCH.TEXT documents 'machine learning algorithms' K 5" \
        "Search using text query" || true

    # Text search with parameters
    execute_command \
        "$REDIS_CLI LANCE SEARCH.TEXT products 'laptop computer' K 3 NPROBES 10" \
        "Search products using text with parameters" || true

    # Search in image dataset
    execute_command \
        "$REDIS_CLI LANCE SEARCH.TEXT images 'sunset landscape' K 5" \
        "Search images using text description" || true
}
# Index Management Operations
demo_index_management() {
    echo
    echo "=========================================="
    echo "        INDEX MANAGEMENT DEMO"
    echo "=========================================="

    # Create indexes for better search performance
    execute_command \
        "$REDIS_CLI LANCE CREATE.INDEX documents IVF_PQ" \
        "Create default IVF_PQ index for documents"

    execute_command \
        "$REDIS_CLI LANCE CREATE.INDEX products IVF_PQ PARTITIONS 512 SUBVECTORS 32" \
        "Create IVF_PQ index with custom parameters for products"

    execute_command \
        "$REDIS_CLI LANCE CREATE.INDEX images IVF_PQ PARTITIONS 256 SUBVECTORS 16" \
        "Create IVF_PQ index for images dataset"

    log_success "Indexes created successfully"
}

# Advanced Usage Examples
demo_advanced_usage() {
    echo
    echo "=========================================="
    echo "        ADVANCED USAGE EXAMPLES"
    echo "=========================================="

    # Create a specialized dataset for semantic search
    execute_command \
        "$REDIS_CLI LANCE CREATE semantic_search DIM 1536 SCHEMA title:string content:string url:string timestamp:string source:string" \
        "Create dataset for semantic search with rich metadata"

    # Demonstrate batch operations concept
    log_info "Batch operations example (would store multiple items):"
    echo "  for doc in documents:"
    echo "    LANCE STORE semantic_search TEXT \"\$doc_content\" title \"\$title\" url \"\$url\""

    # Show monitoring commands
    log_info "Monitoring and maintenance commands:"
    execute_command \
        "$REDIS_CLI LANCE LIST" \
        "List all datasets for monitoring"

    # Show dataset statistics
    for dataset in documents products images semantic_search; do
        execute_command \
            "$REDIS_CLI LANCE INFO $dataset" \
            "Get statistics for $dataset" || true
    done
}

# Cleanup Operations
demo_cleanup() {
    echo
    echo "=========================================="
    echo "        CLEANUP OPERATIONS DEMO"
    echo "=========================================="

    log_info "Demonstrating cleanup operations..."

    # Drop individual datasets
    execute_command \
        "$REDIS_CLI LANCE DROP semantic_search" \
        "Drop semantic_search dataset"

    # List remaining datasets
    execute_command \
        "$REDIS_CLI LANCE LIST" \
        "List remaining datasets"

    # Ask user if they want to clean up all test data
    echo
    read -p "Do you want to clean up all test datasets? (y/N): " -n 1 -r
    echo
    if [[ $REPLY =~ ^[Yy]$ ]]; then
        execute_command \
            "$REDIS_CLI LANCE DROP documents" \
            "Drop documents dataset"

        execute_command \
            "$REDIS_CLI LANCE DROP products" \
            "Drop products dataset"

        execute_command \
            "$REDIS_CLI LANCE DROP images" \
            "Drop images dataset"

        execute_command \
            "$REDIS_CLI LANCE LIST" \
            "Verify all datasets are cleaned up"

        log_success "All test datasets cleaned up"
    else
        log_info "Keeping test datasets for further experimentation"
    fi
}

# Error Handling Demo
demo_error_handling() {
    echo
    echo "=========================================="
    echo "         ERROR HANDLING DEMO"
    echo "=========================================="

    log_info "Demonstrating various error conditions..."

    # Try to access non-existent dataset
    execute_command \
        "$REDIS_CLI LANCE INFO nonexistent_dataset" \
        "Try to get info for non-existent dataset" || true

    # Try to search non-existent dataset
    execute_command \
        "$REDIS_CLI LANCE SEARCH nonexistent_dataset VECTOR '0.1,0.2' K 5" \
        "Try to search non-existent dataset" || true

    # Try to drop non-existent dataset
    execute_command \
        "$REDIS_CLI LANCE DROP nonexistent_dataset" \
        "Try to drop non-existent dataset" || true

    # Try invalid vector format
    execute_command \
        "$REDIS_CLI LANCE SEARCH documents VECTOR 'invalid,vector,format' K 5" \
        "Try search with invalid vector format" || true

    log_info "Error handling demonstration complete"
}
# Performance Testing Demo
demo_performance_testing() {
    echo
    echo "=========================================="
    echo "       PERFORMANCE TESTING DEMO"
    echo "=========================================="

    log_info "Creating performance test dataset..."
    execute_command \
        "$REDIS_CLI LANCE CREATE perf_test DIM 128 SCHEMA batch_id:string item_id:string" \
        "Create performance test dataset"

    log_info "Performance testing would involve:"
    echo "  1. Bulk loading thousands of vectors"
    echo "  2. Creating indexes with different parameters"
    echo "  3. Measuring search latency with various K values"
    echo "  4. Testing different NPROBES settings"
    echo "  5. Monitoring memory usage"

    log_info "Example performance test commands:"
    echo "  # Test search speed with different parameters"
    echo "  time redis-cli LANCE SEARCH.TEXT perf_test 'query' K 10"
    echo "  time redis-cli LANCE SEARCH.TEXT perf_test 'query' K 10 NPROBES 50"
    echo "  time redis-cli LANCE SEARCH.TEXT perf_test 'query' K 100 NPROBES 100"

    # Clean up performance test dataset
    execute_command \
        "$REDIS_CLI LANCE DROP perf_test" \
        "Clean up performance test dataset"
}

# Main execution
main() {
    echo "=========================================="
    echo "    LANCE VECTOR DATABASE DEMO SCRIPT"
    echo "=========================================="
    echo
    echo "This script demonstrates all Lance vector database operations."
    echo "Note: Some operations will fail without a running embedding service."
    echo "This is expected and demonstrates error handling."
    echo

    # Check prerequisites
    check_herodb

    # Setup
    setup_embedding_service

    # Run demos
    demo_dataset_management
    demo_embedding_operations
    demo_data_storage
    demo_search_operations
    demo_index_management
    demo_advanced_usage
    demo_error_handling
    demo_performance_testing

    # Cleanup
    demo_cleanup

    echo
    echo "=========================================="
    echo "             DEMO COMPLETE"
    echo "=========================================="
    echo
    log_success "Lance vector database demo completed successfully!"
    echo
    echo "Next steps:"
    echo "1. Set up a real embedding service (OpenAI, Hugging Face, etc.)"
    echo "2. Update the embedding service URL configuration"
    echo "3. Try storing and searching real data"
    echo "4. Experiment with different vector dimensions and index parameters"
    echo "5. Build your AI-powered application!"
    echo
    echo "For more information, see docs/lance_vector_db.md"
}

# Run the demo
main "$@"
src/cmd.rs: 603 lines changed
@@ -1,14 +1,12 @@
 use crate::{error::DBError, protocol::Protocol, server::Server};
 use tokio::time::{timeout, Duration};
 use futures::future::select_all;
-use std::sync::Arc;
-use base64::Engine;
 
 #[derive(Debug, Clone)]
 pub enum Cmd {
     Ping,
     Echo(String),
-    Select(u64), // Changed from u16 to u64
+    Select(u64, Option<String>), // db_index, optional_key
     Get(String),
     Set(String, String),
     SetPx(String, String, u128),
@@ -86,49 +84,6 @@ pub enum Cmd {
     AgeSignName(String, String), // name, message
     AgeVerifyName(String, String, String), // name, message, signature_b64
     AgeList,
-
-    // Lance vector database commands
-    LanceCreate {
-        dataset: String,
-        dim: usize,
-        schema: Vec<(String, String)>, // field_name, field_type pairs
-    },
-    LanceStore {
-        dataset: String,
-        text: Option<String>,
-        image_base64: Option<String>,
-        metadata: std::collections::HashMap<String, String>,
-    },
-    LanceSearch {
-        dataset: String,
-        vector: Vec<f32>,
-        k: usize,
-        nprobes: Option<usize>,
-        refine_factor: Option<usize>,
-    },
-    LanceSearchText {
-        dataset: String,
-        query_text: String,
-        k: usize,
-        nprobes: Option<usize>,
-        refine_factor: Option<usize>,
-    },
-    LanceEmbedText {
-        texts: Vec<String>,
-    },
-    LanceCreateIndex {
-        dataset: String,
-        index_type: String,
-        num_partitions: Option<usize>,
-        num_sub_vectors: Option<usize>,
-    },
-    LanceList,
-    LanceDrop {
-        dataset: String,
-    },
-    LanceInfo {
-        dataset: String,
-    },
 }
 
 impl Cmd {
@@ -143,11 +98,18 @@
         Ok((
             match cmd[0].to_lowercase().as_str() {
                 "select" => {
-                    if cmd.len() != 2 {
+                    if cmd.len() < 2 || cmd.len() > 4 {
                         return Err(DBError("wrong number of arguments for SELECT".to_string()));
                     }
                     let idx = cmd[1].parse::<u64>().map_err(|_| DBError("ERR DB index is not an integer".to_string()))?;
-                    Cmd::Select(idx)
+                    let key = if cmd.len() == 4 && cmd[2].to_lowercase() == "key" {
+                        Some(cmd[3].clone())
+                    } else if cmd.len() == 2 {
+                        None
+                    } else {
+                        return Err(DBError("ERR syntax error".to_string()));
+                    };
+                    Cmd::Select(idx, key)
                 }
                 "echo" => Cmd::Echo(cmd[1].clone()),
                 "ping" => Cmd::Ping,
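The extended parser accepts an optional `KEY <key>` pair after the database index, so a client can select a database and pass a per-database key in one command. A hypothetical session against the rewritten parser (what the server does with the key is outside this hunk):

```bash
redis-cli -p 6379 SELECT 2               # two arguments: key stays None
redis-cli -p 6379 SELECT 2 KEY mysecret  # four arguments: key is captured
redis-cli -p 6379 SELECT 2 KEY           # three arguments: ERR syntax error
```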
@@ -661,237 +623,6 @@
                         _ => return Err(DBError(format!("unsupported AGE subcommand {:?}", cmd))),
                     }
                 }
-                "lance" => {
-                    if cmd.len() < 2 {
-                        return Err(DBError("wrong number of arguments for LANCE".to_string()));
-                    }
-                    match cmd[1].to_lowercase().as_str() {
-                        "create" => {
-                            if cmd.len() < 4 {
-                                return Err(DBError("LANCE CREATE <dataset> DIM <dimension> [SCHEMA field:type ...]".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-
-                            // Parse DIM parameter
-                            if cmd[3].to_lowercase() != "dim" {
-                                return Err(DBError("Expected DIM after dataset name".to_string()));
-                            }
-                            if cmd.len() < 5 {
-                                return Err(DBError("Missing dimension value".to_string()));
-                            }
-                            let dim = cmd[4].parse::<usize>().map_err(|_| DBError("Invalid dimension value".to_string()))?;
-
-                            // Parse optional SCHEMA
-                            let mut schema = Vec::new();
-                            let mut i = 5;
-                            if i < cmd.len() && cmd[i].to_lowercase() == "schema" {
-                                i += 1;
-                                while i < cmd.len() {
-                                    let field_spec = &cmd[i];
-                                    let parts: Vec<&str> = field_spec.split(':').collect();
-                                    if parts.len() != 2 {
-                                        return Err(DBError("Schema fields must be in format field:type".to_string()));
-                                    }
-                                    schema.push((parts[0].to_string(), parts[1].to_string()));
-                                    i += 1;
-                                }
-                            }
-
-                            Cmd::LanceCreate { dataset, dim, schema }
-                        }
-                        "store" => {
-                            if cmd.len() < 3 {
-                                return Err(DBError("LANCE STORE <dataset> [TEXT <text>] [IMAGE <base64>] [metadata...]".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-                            let mut text = None;
-                            let mut image_base64 = None;
-                            let mut metadata = std::collections::HashMap::new();
-
-                            let mut i = 3;
-                            while i < cmd.len() {
-                                match cmd[i].to_lowercase().as_str() {
-                                    "text" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("TEXT requires a value".to_string()));
-                                        }
-                                        text = Some(cmd[i + 1].clone());
-                                        i += 2;
-                                    }
-                                    "image" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("IMAGE requires a base64 value".to_string()));
-                                        }
-                                        image_base64 = Some(cmd[i + 1].clone());
-                                        i += 2;
-                                    }
-                                    _ => {
-                                        // Parse as metadata key:value
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("Metadata requires key value pairs".to_string()));
-                                        }
-                                        metadata.insert(cmd[i].clone(), cmd[i + 1].clone());
-                                        i += 2;
-                                    }
-                                }
-                            }
-
-                            Cmd::LanceStore { dataset, text, image_base64, metadata }
-                        }
-                        "search" => {
-                            if cmd.len() < 5 {
-                                return Err(DBError("LANCE SEARCH <dataset> VECTOR <vector> K <k> [NPROBES <n>] [REFINE <r>]".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-
-                            if cmd[3].to_lowercase() != "vector" {
-                                return Err(DBError("Expected VECTOR after dataset name".to_string()));
-                            }
-
-                            // Parse vector - expect comma-separated floats in brackets or just comma-separated
-                            let vector_str = &cmd[4];
-                            let vector_str = vector_str.trim_start_matches('[').trim_end_matches(']');
-                            let vector: Result<Vec<f32>, _> = vector_str
-                                .split(',')
-                                .map(|s| s.trim().parse::<f32>())
-                                .collect();
-                            let vector = vector.map_err(|_| DBError("Invalid vector format".to_string()))?;
-
-                            if cmd.len() < 7 || cmd[5].to_lowercase() != "k" {
-                                return Err(DBError("Expected K after vector".to_string()));
-                            }
-                            let k = cmd[6].parse::<usize>().map_err(|_| DBError("Invalid K value".to_string()))?;
-
-                            let mut nprobes = None;
-                            let mut refine_factor = None;
-                            let mut i = 7;
-                            while i < cmd.len() {
-                                match cmd[i].to_lowercase().as_str() {
-                                    "nprobes" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("NPROBES requires a value".to_string()));
-                                        }
-                                        nprobes = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid NPROBES value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    "refine" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("REFINE requires a value".to_string()));
-                                        }
-                                        refine_factor = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid REFINE value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    _ => {
-                                        return Err(DBError(format!("Unknown parameter: {}", cmd[i])));
-                                    }
-                                }
-                            }
-
-                            Cmd::LanceSearch { dataset, vector, k, nprobes, refine_factor }
-                        }
-                        "search.text" => {
-                            if cmd.len() < 6 {
-                                return Err(DBError("LANCE SEARCH.TEXT <dataset> <query_text> K <k> [NPROBES <n>] [REFINE <r>]".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-                            let query_text = cmd[3].clone();
-
-                            if cmd[4].to_lowercase() != "k" {
-                                return Err(DBError("Expected K after query text".to_string()));
-                            }
-                            let k = cmd[5].parse::<usize>().map_err(|_| DBError("Invalid K value".to_string()))?;
-
-                            let mut nprobes = None;
-                            let mut refine_factor = None;
-                            let mut i = 6;
-                            while i < cmd.len() {
-                                match cmd[i].to_lowercase().as_str() {
-                                    "nprobes" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("NPROBES requires a value".to_string()));
-                                        }
-                                        nprobes = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid NPROBES value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    "refine" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("REFINE requires a value".to_string()));
-                                        }
-                                        refine_factor = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid REFINE value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    _ => {
-                                        return Err(DBError(format!("Unknown parameter: {}", cmd[i])));
-                                    }
-                                }
-                            }
-
-                            Cmd::LanceSearchText { dataset, query_text, k, nprobes, refine_factor }
-                        }
-                        "embed.text" => {
-                            if cmd.len() < 3 {
-                                return Err(DBError("LANCE EMBED.TEXT <text1> [text2] ...".to_string()));
-                            }
-                            let texts = cmd[2..].to_vec();
-                            Cmd::LanceEmbedText { texts }
-                        }
-                        "create.index" => {
-                            if cmd.len() < 5 {
-                                return Err(DBError("LANCE CREATE.INDEX <dataset> <index_type> [PARTITIONS <n>] [SUBVECTORS <n>]".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-                            let index_type = cmd[3].clone();
-
-                            let mut num_partitions = None;
-                            let mut num_sub_vectors = None;
-                            let mut i = 4;
-                            while i < cmd.len() {
-                                match cmd[i].to_lowercase().as_str() {
-                                    "partitions" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("PARTITIONS requires a value".to_string()));
-                                        }
-                                        num_partitions = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid PARTITIONS value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    "subvectors" => {
-                                        if i + 1 >= cmd.len() {
-                                            return Err(DBError("SUBVECTORS requires a value".to_string()));
-                                        }
-                                        num_sub_vectors = Some(cmd[i + 1].parse::<usize>().map_err(|_| DBError("Invalid SUBVECTORS value".to_string()))?);
-                                        i += 2;
-                                    }
-                                    _ => {
-                                        return Err(DBError(format!("Unknown parameter: {}", cmd[i])));
-                                    }
-                                }
-                            }
-
-                            Cmd::LanceCreateIndex { dataset, index_type, num_partitions, num_sub_vectors }
-                        }
-                        "list" => {
-                            if cmd.len() != 2 {
-                                return Err(DBError("LANCE LIST takes no arguments".to_string()));
-                            }
-                            Cmd::LanceList
-                        }
-                        "drop" => {
-                            if cmd.len() != 3 {
-                                return Err(DBError("LANCE DROP <dataset>".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-                            Cmd::LanceDrop { dataset }
-                        }
-                        "info" => {
-                            if cmd.len() != 3 {
-                                return Err(DBError("LANCE INFO <dataset>".to_string()));
-                            }
-                            let dataset = cmd[2].clone();
-                            Cmd::LanceInfo { dataset }
-                        }
-                        _ => return Err(DBError(format!("unsupported LANCE subcommand {:?}", cmd))),
-                    }
-                }
                 _ => Cmd::Unknow(cmd[0].clone()),
             },
             protocol,
@@ -918,7 +649,7 @@
         }
 
         match self {
-            Cmd::Select(db) => select_cmd(server, db).await,
+            Cmd::Select(db, key) => select_cmd(server, db, key).await,
             Cmd::Ping => Ok(Protocol::SimpleString("PONG".to_string())),
            Cmd::Echo(s) => Ok(Protocol::BulkString(s)),
             Cmd::Get(k) => get_cmd(server, &k).await,
@@ -1006,25 +737,20 @@ impl Cmd {
|
|||||||
Cmd::AgeSignName(name, message) => Ok(crate::age::cmd_age_sign_name(server, &name, &message).await),
|
Cmd::AgeSignName(name, message) => Ok(crate::age::cmd_age_sign_name(server, &name, &message).await),
|
||||||
Cmd::AgeVerifyName(name, message, sig_b64) => Ok(crate::age::cmd_age_verify_name(server, &name, &message, &sig_b64).await),
|
Cmd::AgeVerifyName(name, message, sig_b64) => Ok(crate::age::cmd_age_verify_name(server, &name, &message, &sig_b64).await),
|
||||||
Cmd::AgeList => Ok(crate::age::cmd_age_list(server).await),
|
Cmd::AgeList => Ok(crate::age::cmd_age_list(server).await),
|
||||||
|
|
||||||
// Lance vector database commands
|
|
||||||
Cmd::LanceCreate { dataset, dim, schema } => lance_create_cmd(server, &dataset, dim, &schema).await,
|
|
||||||
Cmd::LanceStore { dataset, text, image_base64, metadata } => lance_store_cmd(server, &dataset, text.as_deref(), image_base64.as_deref(), &metadata).await,
|
|
||||||
Cmd::LanceSearch { dataset, vector, k, nprobes, refine_factor } => lance_search_cmd(server, &dataset, &vector, k, nprobes, refine_factor).await,
|
|
||||||
Cmd::LanceSearchText { dataset, query_text, k, nprobes, refine_factor } => lance_search_text_cmd(server, &dataset, &query_text, k, nprobes, refine_factor).await,
|
|
||||||
Cmd::LanceEmbedText { texts } => lance_embed_text_cmd(server, &texts).await,
|
|
||||||
Cmd::LanceCreateIndex { dataset, index_type, num_partitions, num_sub_vectors } => lance_create_index_cmd(server, &dataset, &index_type, num_partitions, num_sub_vectors).await,
|
|
||||||
Cmd::LanceList => lance_list_cmd(server).await,
|
|
||||||
Cmd::LanceDrop { dataset } => lance_drop_cmd(server, &dataset).await,
|
|
||||||
Cmd::LanceInfo { dataset } => lance_info_cmd(server, &dataset).await,
|
|
||||||
|
|
||||||
Cmd::Unknow(s) => Ok(Protocol::err(&format!("ERR unknown command `{}`", s))),
|
Cmd::Unknow(s) => Ok(Protocol::err(&format!("ERR unknown command `{}`", s))),
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
pub fn to_protocol(self) -> Protocol {
|
pub fn to_protocol(self) -> Protocol {
|
||||||
match self {
|
match self {
|
||||||
Cmd::Select(db) => Protocol::Array(vec![Protocol::BulkString("select".to_string()), Protocol::BulkString(db.to_string())]),
|
Cmd::Select(db, key) => {
|
||||||
|
let mut arr = vec![Protocol::BulkString("select".to_string()), Protocol::BulkString(db.to_string())];
|
||||||
|
if let Some(k) = key {
|
||||||
|
arr.push(Protocol::BulkString("key".to_string()));
|
||||||
|
arr.push(Protocol::BulkString(k));
|
||||||
|
}
|
||||||
|
Protocol::Array(arr)
|
||||||
|
}
|
||||||
Cmd::Ping => Protocol::Array(vec![Protocol::BulkString("ping".to_string())]),
|
Cmd::Ping => Protocol::Array(vec![Protocol::BulkString("ping".to_string())]),
|
||||||
Cmd::Echo(s) => Protocol::Array(vec![Protocol::BulkString("echo".to_string()), Protocol::BulkString(s)]),
|
Cmd::Echo(s) => Protocol::Array(vec![Protocol::BulkString("echo".to_string()), Protocol::BulkString(s)]),
|
||||||
Cmd::Get(k) => Protocol::Array(vec![Protocol::BulkString("get".to_string()), Protocol::BulkString(k)]),
|
Cmd::Get(k) => Protocol::Array(vec![Protocol::BulkString("get".to_string()), Protocol::BulkString(k)]),
|
||||||
@@ -1041,9 +767,44 @@ async fn flushdb_cmd(server: &mut Server) -> Result<Protocol, DBError> {
     }
 }

-async fn select_cmd(server: &mut Server, db: u64) -> Result<Protocol, DBError> {
-    // Test if we can access the database (this will create it if needed)
+async fn select_cmd(server: &mut Server, db: u64, key: Option<String>) -> Result<Protocol, DBError> {
+    // Load database metadata
+    let meta = match crate::rpc::RpcServerImpl::load_meta_static(&server.option.dir, db).await {
+        Ok(m) => m,
+        Err(_) => {
+            // If meta doesn't exist, create default
+            let default_meta = crate::rpc::DatabaseMeta {
+                public: true,
+                keys: std::collections::HashMap::new(),
+            };
+            if let Err(_) = crate::rpc::RpcServerImpl::save_meta_static(&server.option.dir, db, &default_meta).await {
+                return Ok(Protocol::err("ERR failed to initialize database metadata"));
+            }
+            default_meta
+        }
+    };
+
+    // Check access permissions
+    let permissions = if meta.public {
+        // Public database - full access
+        Some(crate::rpc::Permissions::ReadWrite)
+    } else if let Some(key_str) = key {
+        // Private database - check key
+        let hash = crate::rpc::hash_key(&key_str);
+        if let Some(access_key) = meta.keys.get(&hash) {
+            Some(access_key.permissions.clone())
+        } else {
+            return Ok(Protocol::err("ERR invalid access key"));
+        }
+    } else {
+        return Ok(Protocol::err("ERR access key required for private database"));
+    };
+
+    // Set selected database and permissions
     server.selected_db = db;
+    server.current_permissions = permissions;
+
+    // Test if we can access the database (this will create it if needed)
     match server.current_storage() {
         Ok(_) => Ok(Protocol::SimpleString("OK".to_string())),
         Err(e) => Ok(Protocol::err(&e.0)),
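The wire format for the extended `SELECT` follows `to_protocol` above: the optional access key rides along as a `KEY <plaintext>` pair, which `select_cmd` hashes and checks against the stored key hashes. A sketch (the key value is a placeholder):

```bash
# Public database: no key required, grants ReadWrite
redis-cli -p 6379 SELECT 0
# Private database: supply the plaintext access key
redis-cli -p 6379 SELECT 11 KEY mysecretkey
```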
@@ -1291,6 +1052,9 @@ async fn brpop_cmd(server: &Server, keys: &[String], timeout_secs: f64) -> Result<Protocol, DBError> {
 }

 async fn lpush_cmd(server: &Server, key: &str, elements: &[String]) -> Result<Protocol, DBError> {
+    if !server.has_write_permission() {
+        return Ok(Protocol::err("ERR write permission denied"));
+    }
     match server.current_storage()?.lpush(key, elements.to_vec()) {
         Ok(len) => {
             // Attempt to deliver to any blocked BLPOP waiters
@@ -1422,6 +1186,9 @@ async fn type_cmd(server: &Server, k: &String) -> Result<Protocol, DBError> {
 }

 async fn del_cmd(server: &Server, k: &str) -> Result<Protocol, DBError> {
+    if !server.has_write_permission() {
+        return Ok(Protocol::err("ERR write permission denied"));
+    }
     server.current_storage()?.del(k.to_string())?;
     Ok(Protocol::SimpleString("1".to_string()))
 }
@@ -1447,6 +1214,9 @@ async fn set_px_cmd(
 }

 async fn set_cmd(server: &Server, k: &str, v: &str) -> Result<Protocol, DBError> {
+    if !server.has_write_permission() {
+        return Ok(Protocol::err("ERR write permission denied"));
+    }
     server.current_storage()?.set(k.to_string(), v.to_string())?;
     Ok(Protocol::SimpleString("OK".to_string()))
 }
@@ -1561,6 +1331,9 @@ async fn get_cmd(server: &Server, k: &str) -> Result<Protocol, DBError> {

 // Hash command implementations
 async fn hset_cmd(server: &Server, key: &str, pairs: &[(String, String)]) -> Result<Protocol, DBError> {
+    if !server.has_write_permission() {
+        return Ok(Protocol::err("ERR write permission denied"));
+    }
     let new_fields = server.current_storage()?.hset(key, pairs.to_vec())?;
     Ok(Protocol::SimpleString(new_fields.to_string()))
 }
@@ -1801,243 +1574,3 @@ fn command_cmd(args: &[String]) -> Result<Protocol, DBError> {
        _ => Ok(Protocol::Array(vec![])),
    }
}

// Helper function to create Arrow schema from field specifications
fn create_schema_from_fields(dim: usize, fields: &[(String, String)]) -> arrow::datatypes::Schema {
    let mut schema_fields = Vec::new();

    // Always add the vector field first
    let vector_field = arrow::datatypes::Field::new(
        "vector",
        arrow::datatypes::DataType::FixedSizeList(
            Arc::new(arrow::datatypes::Field::new("item", arrow::datatypes::DataType::Float32, true)),
            dim as i32
        ),
        false
    );
    schema_fields.push(vector_field);

    // Add custom fields
    for (name, field_type) in fields {
        let data_type = match field_type.to_lowercase().as_str() {
            "string" | "text" => arrow::datatypes::DataType::Utf8,
            "int" | "integer" => arrow::datatypes::DataType::Int64,
            "float" => arrow::datatypes::DataType::Float64,
            "bool" | "boolean" => arrow::datatypes::DataType::Boolean,
            _ => arrow::datatypes::DataType::Utf8, // Default to string
        };
        schema_fields.push(arrow::datatypes::Field::new(name, data_type, true));
    }

    arrow::datatypes::Schema::new(schema_fields)
}
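A minimal sketch of what this helper produced, assuming a 3-dimensional dataset with one custom text column (values illustrative):

```rust
// "text" maps to Utf8 per the match above; unknown types default to Utf8.
let schema = create_schema_from_fields(3, &[("text".to_string(), "string".to_string())]);
// Resulting Arrow schema:
//   vector: FixedSizeList<item: Float32>[3], non-nullable
//   text:   Utf8, nullable
assert_eq!(schema.fields().len(), 2);
```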
// Lance vector database command implementations
async fn lance_create_cmd(
    server: &Server,
    dataset: &str,
    dim: usize,
    schema: &[(String, String)],
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.create_dataset(dataset, create_schema_from_fields(dim, schema)).await {
                Ok(_) => Ok(Protocol::SimpleString("OK".to_string())),
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_store_cmd(
    server: &Server,
    dataset: &str,
    text: Option<&str>,
    image_base64: Option<&str>,
    metadata: &std::collections::HashMap<String, String>,
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.store_multimodal(server, dataset, text.map(|s| s.to_string()),
                image_base64.and_then(|s| base64::engine::general_purpose::STANDARD.decode(s).ok()),
                metadata.clone()).await {
                Ok(id) => Ok(Protocol::BulkString(id)),
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_search_cmd(
    server: &Server,
    dataset: &str,
    vector: &[f32],
    k: usize,
    nprobes: Option<usize>,
    refine_factor: Option<usize>,
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.search_vectors(dataset, vector.to_vec(), k, nprobes, refine_factor).await {
                Ok(results) => {
                    let mut response = Vec::new();
                    for (distance, metadata) in results {
                        let mut item = Vec::new();
                        item.push(Protocol::BulkString("distance".to_string()));
                        item.push(Protocol::BulkString(distance.to_string()));
                        for (key, value) in metadata {
                            item.push(Protocol::BulkString(key));
                            item.push(Protocol::BulkString(value));
                        }
                        response.push(Protocol::Array(item));
                    }
                    Ok(Protocol::Array(response))
                }
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_search_text_cmd(
    server: &Server,
    dataset: &str,
    query_text: &str,
    k: usize,
    nprobes: Option<usize>,
    refine_factor: Option<usize>,
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.search_with_text(server, dataset, query_text.to_string(), k, nprobes, refine_factor).await {
                Ok(results) => {
                    let mut response = Vec::new();
                    for (distance, metadata) in results {
                        let mut item = Vec::new();
                        item.push(Protocol::BulkString("distance".to_string()));
                        item.push(Protocol::BulkString(distance.to_string()));
                        for (key, value) in metadata {
                            item.push(Protocol::BulkString(key));
                            item.push(Protocol::BulkString(value));
                        }
                        response.push(Protocol::Array(item));
                    }
                    Ok(Protocol::Array(response))
                }
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

// Helper function to sanitize error messages for Redis protocol
fn sanitize_error_message(msg: &str) -> String {
    // Remove newlines, carriage returns, and limit length
    let sanitized = msg
        .replace('\n', " ")
        .replace('\r', " ")
        .replace('\t', " ");

    // Limit to 200 characters to avoid overly long error messages
    if sanitized.len() > 200 {
        format!("{}...", &sanitized[..197])
    } else {
        sanitized
    }
}

async fn lance_embed_text_cmd(
    server: &Server,
    texts: &[String],
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.embed_text(server, texts.to_vec()).await {
                Ok(embeddings) => {
                    let mut response = Vec::new();
                    for embedding in embeddings {
                        let vector_strings: Vec<Protocol> = embedding
                            .iter()
                            .map(|f| Protocol::BulkString(f.to_string()))
                            .collect();
                        response.push(Protocol::Array(vector_strings));
                    }
                    Ok(Protocol::Array(response))
                }
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_create_index_cmd(
    server: &Server,
    dataset: &str,
    index_type: &str,
    num_partitions: Option<usize>,
    num_sub_vectors: Option<usize>,
) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.create_index(dataset, index_type, num_partitions, num_sub_vectors).await {
                Ok(_) => Ok(Protocol::SimpleString("OK".to_string())),
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_list_cmd(server: &Server) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.list_datasets().await {
                Ok(datasets) => {
                    let response: Vec<Protocol> = datasets
                        .into_iter()
                        .map(Protocol::BulkString)
                        .collect();
                    Ok(Protocol::Array(response))
                }
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_drop_cmd(server: &Server, dataset: &str) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.drop_dataset(dataset).await {
                Ok(_) => Ok(Protocol::SimpleString("OK".to_string())),
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}

async fn lance_info_cmd(server: &Server, dataset: &str) -> Result<Protocol, DBError> {
    match server.lance_store() {
        Ok(lance_store) => {
            match lance_store.get_dataset_info(dataset).await {
                Ok(info) => {
                    let mut response = Vec::new();
                    for (key, value) in info {
                        response.push(Protocol::BulkString(key));
                        response.push(Protocol::BulkString(value));
                    }
                    Ok(Protocol::Array(response))
                }
                Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR {}", e)))),
            }
        }
        Err(e) => Ok(Protocol::err(&sanitize_error_message(&format!("ERR Lance store not available: {}", e)))),
    }
}
43 src/error.rs
@@ -9,12 +9,6 @@ use bincode;
 #[derive(Debug)]
 pub struct DBError(pub String);

-impl std::fmt::Display for DBError {
-    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
-        write!(f, "{}", self.0)
-    }
-}
-
 impl From<std::io::Error> for DBError {
     fn from(item: std::io::Error) -> Self {
         DBError(item.to_string().clone())
@@ -98,40 +92,3 @@ impl From<chacha20poly1305::Error> for DBError {
         DBError(item.to_string())
     }
 }
-
-// Lance and related dependencies error handling
-impl From<lance::Error> for DBError {
-    fn from(item: lance::Error) -> Self {
-        DBError(item.to_string())
-    }
-}
-
-impl From<arrow::error::ArrowError> for DBError {
-    fn from(item: arrow::error::ArrowError) -> Self {
-        DBError(item.to_string())
-    }
-}
-
-impl From<reqwest::Error> for DBError {
-    fn from(item: reqwest::Error) -> Self {
-        DBError(item.to_string())
-    }
-}
-
-impl From<image::ImageError> for DBError {
-    fn from(item: image::ImageError) -> Self {
-        DBError(item.to_string())
-    }
-}
-
-impl From<uuid::Error> for DBError {
-    fn from(item: uuid::Error) -> Self {
-        DBError(item.to_string())
-    }
-}
-
-impl From<base64::DecodeError> for DBError {
-    fn from(item: base64::DecodeError) -> Self {
-        DBError(item.to_string())
-    }
-}
609 src/lance_store.rs
@@ -1,609 +0,0 @@
use std::collections::HashMap;
use std::path::PathBuf;
use std::sync::Arc;
use tokio::sync::RwLock;

use arrow::array::{Float32Array, StringArray, ArrayRef, FixedSizeListArray, Array};
use arrow::datatypes::{DataType, Field, Schema, SchemaRef};
use arrow::record_batch::{RecordBatch, RecordBatchReader};
use arrow::error::ArrowError;
use lance::dataset::{Dataset, WriteParams, WriteMode};
use lance::index::vector::VectorIndexParams;
use lance_index::vector::pq::PQBuildParams;
use lance_index::vector::ivf::IvfBuildParams;
use lance_index::DatasetIndexExt;
use lance_linalg::distance::MetricType;
use futures::TryStreamExt;
use base64::Engine;

use serde::{Deserialize, Serialize};
use crate::error::DBError;

// Simple RecordBatchReader implementation for Vec<RecordBatch>
struct VecRecordBatchReader {
    batches: std::vec::IntoIter<Result<RecordBatch, ArrowError>>,
}

impl VecRecordBatchReader {
    fn new(batches: Vec<RecordBatch>) -> Self {
        let result_batches = batches.into_iter().map(Ok).collect::<Vec<_>>();
        Self {
            batches: result_batches.into_iter(),
        }
    }
}

impl Iterator for VecRecordBatchReader {
    type Item = Result<RecordBatch, ArrowError>;

    fn next(&mut self) -> Option<Self::Item> {
        self.batches.next()
    }
}

impl RecordBatchReader for VecRecordBatchReader {
    fn schema(&self) -> SchemaRef {
        // This is a simplified implementation - in practice you'd want to store the schema
        Arc::new(Schema::empty())
    }
}

#[derive(Debug, Serialize, Deserialize)]
struct EmbeddingRequest {
    texts: Option<Vec<String>>,
    images: Option<Vec<String>>, // base64 encoded
    model: Option<String>,
}

#[derive(Debug, Serialize, Deserialize)]
struct EmbeddingResponse {
    embeddings: Vec<Vec<f32>>,
    model: String,
    usage: Option<HashMap<String, u32>>,
}

// Ollama-specific request/response structures
#[derive(Debug, Serialize, Deserialize)]
struct OllamaEmbeddingRequest {
    model: String,
    prompt: String,
}

#[derive(Debug, Serialize, Deserialize)]
struct OllamaEmbeddingResponse {
    embedding: Vec<f32>,
}

pub struct LanceStore {
    datasets: Arc<RwLock<HashMap<String, Arc<Dataset>>>>,
    data_dir: PathBuf,
    http_client: reqwest::Client,
}

impl LanceStore {
    pub async fn new(data_dir: PathBuf) -> Result<Self, DBError> {
        // Create data directory if it doesn't exist
        std::fs::create_dir_all(&data_dir)
            .map_err(|e| DBError(format!("Failed to create Lance data directory: {}", e)))?;

        let http_client = reqwest::Client::builder()
            .timeout(std::time::Duration::from_secs(30))
            .build()
            .map_err(|e| DBError(format!("Failed to create HTTP client: {}", e)))?;

        Ok(Self {
            datasets: Arc::new(RwLock::new(HashMap::new())),
            data_dir,
            http_client,
        })
    }

    /// Get embedding service URL from Redis config, default to local Ollama
    async fn get_embedding_url(&self, server: &crate::server::Server) -> Result<String, DBError> {
        // Get the embedding URL from Redis config directly from storage
        let storage = server.current_storage()?;
        match storage.hget("config:core:aiembed", "url")? {
            Some(url) => Ok(url),
            None => Ok("http://localhost:11434".to_string()), // Default to local Ollama
        }
    }

    /// Check if we're using Ollama (default) or custom embedding service
    async fn is_ollama_service(&self, server: &crate::server::Server) -> Result<bool, DBError> {
        let url = self.get_embedding_url(server).await?;
        Ok(url.contains("localhost:11434") || url.contains("127.0.0.1:11434"))
    }

    /// Call external embedding service (Ollama or custom)
    async fn call_embedding_service(
        &self,
        server: &crate::server::Server,
        texts: Option<Vec<String>>,
        images: Option<Vec<String>>,
    ) -> Result<Vec<Vec<f32>>, DBError> {
        let base_url = self.get_embedding_url(server).await?;
        let is_ollama = self.is_ollama_service(server).await?;

        if is_ollama {
            // Use Ollama API format
            if let Some(texts) = texts {
                let mut embeddings = Vec::new();
                for text in texts {
                    let url = format!("{}/api/embeddings", base_url);
                    let request = OllamaEmbeddingRequest {
                        model: "nomic-embed-text".to_string(),
                        prompt: text,
                    };

                    let response = self.http_client
                        .post(&url)
                        .json(&request)
                        .send()
                        .await
                        .map_err(|e| DBError(format!("Failed to call Ollama embedding service: {}", e)))?;

                    if !response.status().is_success() {
                        let status = response.status();
                        let error_text = response.text().await.unwrap_or_default();
                        return Err(DBError(format!(
                            "Ollama embedding service returned error {}: {}",
                            status, error_text
                        )));
                    }

                    let ollama_response: OllamaEmbeddingResponse = response
                        .json()
                        .await
                        .map_err(|e| DBError(format!("Failed to parse Ollama embedding response: {}", e)))?;

                    embeddings.push(ollama_response.embedding);
                }
                Ok(embeddings)
            } else if let Some(_images) = images {
                // Ollama doesn't support image embeddings with this API yet
                Err(DBError("Image embeddings not supported with Ollama. Please configure a custom embedding service.".to_string()))
            } else {
                Err(DBError("No text or images provided for embedding".to_string()))
            }
        } else {
            // Use custom embedding service API format
            let request = EmbeddingRequest {
                texts,
                images,
                model: None, // Let the service use its default
            };

            let response = self.http_client
                .post(&base_url)
                .json(&request)
                .send()
                .await
                .map_err(|e| DBError(format!("Failed to call embedding service: {}", e)))?;

            if !response.status().is_success() {
                let status = response.status();
                let error_text = response.text().await.unwrap_or_default();
                return Err(DBError(format!(
                    "Embedding service returned error {}: {}",
                    status, error_text
                )));
            }

            let embedding_response: EmbeddingResponse = response
                .json()
                .await
                .map_err(|e| DBError(format!("Failed to parse embedding response: {}", e)))?;

            Ok(embedding_response.embeddings)
        }
    }
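The Ollama branch above issues one request per text to `/api/embeddings` and reads back a single `embedding` array. The equivalent raw request, assuming a local Ollama with the `nomic-embed-text` model pulled:

```bash
curl -s http://localhost:11434/api/embeddings \
  -d '{"model": "nomic-embed-text", "prompt": "hello world"}'
# => {"embedding": [0.123, -0.456, ...]}
```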
    pub async fn embed_text(
        &self,
        server: &crate::server::Server,
        texts: Vec<String>
    ) -> Result<Vec<Vec<f32>>, DBError> {
        if texts.is_empty() {
            return Ok(Vec::new());
        }

        self.call_embedding_service(server, Some(texts), None).await
    }

    pub async fn embed_image(
        &self,
        server: &crate::server::Server,
        image_bytes: Vec<u8>
    ) -> Result<Vec<f32>, DBError> {
        // Convert image bytes to base64
        let base64_image = base64::engine::general_purpose::STANDARD.encode(&image_bytes);

        let embeddings = self.call_embedding_service(
            server,
            None,
            Some(vec![base64_image])
        ).await?;

        embeddings.into_iter()
            .next()
            .ok_or_else(|| DBError("No embedding returned for image".to_string()))
    }

    pub async fn create_dataset(
        &self,
        name: &str,
        schema: Schema,
    ) -> Result<(), DBError> {
        let dataset_path = self.data_dir.join(format!("{}.lance", name));

        // Create empty dataset with schema
        let write_params = WriteParams {
            mode: WriteMode::Create,
            ..Default::default()
        };

        // Create an empty RecordBatch with the schema
        let empty_batch = RecordBatch::new_empty(Arc::new(schema));

        // Use RecordBatchReader for Lance 0.33
        let reader = VecRecordBatchReader::new(vec![empty_batch]);
        let dataset = Dataset::write(
            reader,
            dataset_path.to_str().unwrap(),
            Some(write_params)
        ).await
        .map_err(|e| DBError(format!("Failed to create dataset: {}", e)))?;

        let mut datasets = self.datasets.write().await;
        datasets.insert(name.to_string(), Arc::new(dataset));

        Ok(())
    }

    pub async fn write_vectors(
        &self,
        dataset_name: &str,
        vectors: Vec<Vec<f32>>,
        metadata: Option<HashMap<String, Vec<String>>>,
    ) -> Result<usize, DBError> {
        let dataset_path = self.data_dir.join(format!("{}.lance", dataset_name));

        // Open or get cached dataset
        let _dataset = self.get_or_open_dataset(dataset_name).await?;

        // Build RecordBatch
        let num_vectors = vectors.len();
        if num_vectors == 0 {
            return Ok(0);
        }

        let dim = vectors.first()
            .ok_or_else(|| DBError("Empty vectors".to_string()))?
            .len();

        // Flatten vectors
        let flat_vectors: Vec<f32> = vectors.into_iter().flatten().collect();
        let values_array = Float32Array::from(flat_vectors);
        let field = Arc::new(Field::new("item", DataType::Float32, true));
        let vector_array = FixedSizeListArray::try_new(
            field,
            dim as i32,
            Arc::new(values_array),
            None
        ).map_err(|e| DBError(format!("Failed to create vector array: {}", e)))?;

        let mut arrays: Vec<ArrayRef> = vec![Arc::new(vector_array)];
        let mut fields = vec![Field::new(
            "vector",
            DataType::FixedSizeList(
                Arc::new(Field::new("item", DataType::Float32, true)),
                dim as i32
            ),
            false
        )];

        // Add metadata columns if provided
        if let Some(metadata) = metadata {
            for (key, values) in metadata {
                if values.len() != num_vectors {
                    return Err(DBError(format!(
                        "Metadata field '{}' has {} values but expected {}",
                        key, values.len(), num_vectors
                    )));
                }
                let array = StringArray::from(values);
                arrays.push(Arc::new(array));
                fields.push(Field::new(&key, DataType::Utf8, true));
            }
        }

        let schema = Arc::new(Schema::new(fields));
        let batch = RecordBatch::try_new(schema, arrays)
            .map_err(|e| DBError(format!("Failed to create RecordBatch: {}", e)))?;

        // Append to dataset
        let write_params = WriteParams {
            mode: WriteMode::Append,
            ..Default::default()
        };

        let reader = VecRecordBatchReader::new(vec![batch]);
        Dataset::write(
            reader,
            dataset_path.to_str().unwrap(),
            Some(write_params)
        ).await
        .map_err(|e| DBError(format!("Failed to write to dataset: {}", e)))?;

        // Refresh cached dataset
        let mut datasets = self.datasets.write().await;
        datasets.remove(dataset_name);

        Ok(num_vectors)
    }

    pub async fn search_vectors(
        &self,
        dataset_name: &str,
        query_vector: Vec<f32>,
        k: usize,
        nprobes: Option<usize>,
        _refine_factor: Option<usize>,
    ) -> Result<Vec<(f32, HashMap<String, String>)>, DBError> {
        let dataset = self.get_or_open_dataset(dataset_name).await?;

        // Build query
        let query_array = Float32Array::from(query_vector.clone());
        let mut query = dataset.scan();
        query.nearest(
            "vector",
            &query_array,
            k,
        ).map_err(|e| DBError(format!("Failed to build search query: {}", e)))?;

        if let Some(nprobes) = nprobes {
            query.nprobs(nprobes);
        }

        // Note: refine_factor might not be available in this Lance version
        // if let Some(refine) = refine_factor {
        //     query.refine_factor(refine);
        // }

        // Execute search
        let results = query
            .try_into_stream()
            .await
            .map_err(|e| DBError(format!("Failed to execute search: {}", e)))?
            .try_collect::<Vec<_>>()
            .await
            .map_err(|e| DBError(format!("Failed to collect results: {}", e)))?;

        // Process results
        let mut output = Vec::new();
        for batch in results {
            // Get distances
            let distances = batch
                .column_by_name("_distance")
                .ok_or_else(|| DBError("No distance column".to_string()))?
                .as_any()
                .downcast_ref::<Float32Array>()
                .ok_or_else(|| DBError("Invalid distance type".to_string()))?;

            // Get metadata
            for i in 0..batch.num_rows() {
                let distance = distances.value(i);
                let mut metadata = HashMap::new();

                for field in batch.schema().fields() {
                    if field.name() != "vector" && field.name() != "_distance" {
                        if let Some(col) = batch.column_by_name(field.name()) {
                            if let Some(str_array) = col.as_any().downcast_ref::<StringArray>() {
                                if !str_array.is_null(i) {
                                    metadata.insert(
                                        field.name().to_string(),
                                        str_array.value(i).to_string()
                                    );
                                }
                            }
                        }
                    }
                }

                output.push((distance, metadata));
            }
        }

        Ok(output)
    }
    pub async fn store_multimodal(
        &self,
        server: &crate::server::Server,
        dataset_name: &str,
        text: Option<String>,
        image_bytes: Option<Vec<u8>>,
        metadata: HashMap<String, String>,
    ) -> Result<String, DBError> {
        // Generate ID
        let id = uuid::Uuid::new_v4().to_string();

        // Generate embeddings using external service
        let embedding = if let Some(text) = text.as_ref() {
            self.embed_text(server, vec![text.clone()]).await?
                .into_iter()
                .next()
                .ok_or_else(|| DBError("No embedding returned".to_string()))?
        } else if let Some(img) = image_bytes.as_ref() {
            self.embed_image(server, img.clone()).await?
        } else {
            return Err(DBError("No text or image provided".to_string()));
        };

        // Prepare metadata
        let mut full_metadata = metadata;
        full_metadata.insert("id".to_string(), id.clone());
        if let Some(text) = text {
            full_metadata.insert("text".to_string(), text);
        }
        if let Some(img) = image_bytes {
            full_metadata.insert("image_base64".to_string(), base64::engine::general_purpose::STANDARD.encode(img));
        }

        // Convert metadata to column vectors
        let mut metadata_cols = HashMap::new();
        for (key, value) in full_metadata {
            metadata_cols.insert(key, vec![value]);
        }

        // Write to dataset
        self.write_vectors(dataset_name, vec![embedding], Some(metadata_cols)).await?;

        Ok(id)
    }

    pub async fn search_with_text(
        &self,
        server: &crate::server::Server,
        dataset_name: &str,
        query_text: String,
        k: usize,
        nprobes: Option<usize>,
        refine_factor: Option<usize>,
    ) -> Result<Vec<(f32, HashMap<String, String>)>, DBError> {
        // Embed the query text using external service
        let embeddings = self.embed_text(server, vec![query_text]).await?;
        let query_vector = embeddings.into_iter()
            .next()
            .ok_or_else(|| DBError("No embedding returned for query".to_string()))?;

        // Search with the embedding
        self.search_vectors(dataset_name, query_vector, k, nprobes, refine_factor).await
    }

    pub async fn create_index(
        &self,
        dataset_name: &str,
        index_type: &str,
        num_partitions: Option<usize>,
        num_sub_vectors: Option<usize>,
    ) -> Result<(), DBError> {
        let _dataset = self.get_or_open_dataset(dataset_name).await?;

        match index_type.to_uppercase().as_str() {
            "IVF_PQ" => {
                let ivf_params = IvfBuildParams {
                    num_partitions: num_partitions.unwrap_or(256),
                    ..Default::default()
                };
                let pq_params = PQBuildParams {
                    num_sub_vectors: num_sub_vectors.unwrap_or(16),
                    ..Default::default()
                };
                let params = VectorIndexParams::with_ivf_pq_params(
                    MetricType::L2,
                    ivf_params,
                    pq_params,
                );

                // Get a mutable reference to the dataset
                let mut dataset_mut = Dataset::open(self.data_dir.join(format!("{}.lance", dataset_name)).to_str().unwrap())
                    .await
                    .map_err(|e| DBError(format!("Failed to open dataset for indexing: {}", e)))?;

                dataset_mut.create_index(
                    &["vector"],
                    lance_index::IndexType::Vector,
                    None,
                    &params,
                    true
                ).await
                .map_err(|e| DBError(format!("Failed to create index: {}", e)))?;
            }
            _ => return Err(DBError(format!("Unsupported index type: {}", index_type))),
        }

        Ok(())
    }
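The defaults above (256 partitions, 16 sub-vectors) were hard-coded fallbacks; a common IVF rule of thumb (not from this codebase) is to size the partition count on the order of the square root of the row count. A sketch of the parameter assembly with explicit values:

```rust
// Mirrors the IVF_PQ branch above; 256 and 16 are the same fallback values.
let ivf_params = IvfBuildParams { num_partitions: 256, ..Default::default() };
let pq_params = PQBuildParams { num_sub_vectors: 16, ..Default::default() };
let params = VectorIndexParams::with_ivf_pq_params(MetricType::L2, ivf_params, pq_params);
```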
    async fn get_or_open_dataset(&self, name: &str) -> Result<Arc<Dataset>, DBError> {
        let mut datasets = self.datasets.write().await;

        if let Some(dataset) = datasets.get(name) {
            return Ok(dataset.clone());
        }

        let dataset_path = self.data_dir.join(format!("{}.lance", name));
        if !dataset_path.exists() {
            return Err(DBError(format!("Dataset '{}' does not exist", name)));
        }

        let dataset = Dataset::open(dataset_path.to_str().unwrap())
            .await
            .map_err(|e| DBError(format!("Failed to open dataset: {}", e)))?;

        let dataset = Arc::new(dataset);
        datasets.insert(name.to_string(), dataset.clone());

        Ok(dataset)
    }

    pub async fn list_datasets(&self) -> Result<Vec<String>, DBError> {
        let mut datasets = Vec::new();

        let entries = std::fs::read_dir(&self.data_dir)
            .map_err(|e| DBError(format!("Failed to read data directory: {}", e)))?;

        for entry in entries {
            let entry = entry.map_err(|e| DBError(format!("Failed to read entry: {}", e)))?;
            let path = entry.path();

            if path.is_dir() {
                if let Some(name) = path.file_name() {
                    if let Some(name_str) = name.to_str() {
                        if name_str.ends_with(".lance") {
                            let dataset_name = name_str.trim_end_matches(".lance");
                            datasets.push(dataset_name.to_string());
                        }
                    }
                }
            }
        }

        Ok(datasets)
    }

    pub async fn drop_dataset(&self, name: &str) -> Result<(), DBError> {
        // Remove from cache
        let mut datasets = self.datasets.write().await;
        datasets.remove(name);

        // Delete from disk
        let dataset_path = self.data_dir.join(format!("{}.lance", name));
        if dataset_path.exists() {
            std::fs::remove_dir_all(dataset_path)
                .map_err(|e| DBError(format!("Failed to delete dataset: {}", e)))?;
        }

        Ok(())
    }

    pub async fn get_dataset_info(&self, name: &str) -> Result<HashMap<String, String>, DBError> {
        let dataset = self.get_or_open_dataset(name).await?;

        let mut info = HashMap::new();
        info.insert("name".to_string(), name.to_string());
        info.insert("version".to_string(), dataset.version().version.to_string());
        info.insert("num_rows".to_string(), dataset.count_rows(None).await?.to_string());

        // Get schema info
        let schema = dataset.schema();
        let fields: Vec<String> = schema.fields
            .iter()
            .map(|f| format!("{}:{}", f.name, f.data_type()))
            .collect();
        info.insert("schema".to_string(), fields.join(", "));

        Ok(info)
    }
}
@@ -2,9 +2,10 @@ pub mod age; // NEW
 pub mod cmd;
 pub mod crypto;
 pub mod error;
-pub mod lance_store; // Add Lance store module
 pub mod options;
 pub mod protocol;
+pub mod rpc;
+pub mod rpc_server;
 pub mod server;
 pub mod storage;
 pub mod storage_trait; // Add this
37 src/main.rs
@@ -3,6 +3,7 @@
 use tokio::net::TcpListener;

 use herodb::server;
+use herodb::rpc_server;

 use clap::Parser;

@@ -31,6 +32,14 @@ struct Args {
     #[arg(long)]
     encrypt: bool,

+    /// Enable RPC management server
+    #[arg(long)]
+    enable_rpc: bool,
+
+    /// RPC server port (default: 8080)
+    #[arg(long, default_value = "8080")]
+    rpc_port: u16,
+
     /// Use the sled backend
     #[arg(long)]
     sled: bool,
@@ -50,7 +59,7 @@ async fn main() {

     // new DB option
     let option = herodb::options::DBOption {
-        dir: args.dir,
+        dir: args.dir.clone(),
         port,
         debug: args.debug,
         encryption_key: args.encryption_key,
@@ -62,12 +71,36 @@ async fn main() {
         },
     };

+    let backend = option.backend.clone();
+
     // new server
-    let server = server::Server::new(option).await;
+    let mut server = server::Server::new(option).await;
+
+    // Initialize the default database storage
+    let _ = server.current_storage();

     // Add a small delay to ensure the port is ready
     tokio::time::sleep(std::time::Duration::from_millis(100)).await;

+    // Start RPC server if enabled
+    let rpc_handle = if args.enable_rpc {
+        let rpc_addr = format!("127.0.0.1:{}", args.rpc_port).parse().unwrap();
+        let base_dir = args.dir.clone();
+
+        match rpc_server::start_rpc_server(rpc_addr, base_dir, backend).await {
+            Ok(handle) => {
+                println!("RPC management server started on port {}", args.rpc_port);
+                Some(handle)
+            }
+            Err(e) => {
+                eprintln!("Failed to start RPC server: {}", e);
+                None
+            }
+        }
+    } else {
+        None
+    };
+
     // accept new connections
     loop {
         let stream = listener.accept().await;
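With these flags wired up, starting HeroDB with the management RPC enabled looks roughly like this (clap derives kebab-case flag names from the snake_case fields by default, so verify the exact spelling against `--help`):

```bash
./target/release/herodb --dir /tmp/herodb --port 6379 --enable-rpc --rpc-port 8080
```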
634 src/rpc.rs Normal file
@@ -0,0 +1,634 @@
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::RwLock;
use jsonrpsee::{core::RpcResult, proc_macros::rpc};
use serde::{Deserialize, Serialize};
use sha2::{Digest, Sha256};

use crate::server::Server;
use crate::options::DBOption;

/// Database backend types
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum BackendType {
    Redb,
    Sled,
    // Future: InMemory, Custom(String)
}

/// Database configuration
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DatabaseConfig {
    pub name: Option<String>,
    pub storage_path: Option<String>,
    pub max_size: Option<u64>,
    pub redis_version: Option<String>,
}

/// Database information returned by metadata queries
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DatabaseInfo {
    pub id: u64,
    pub name: Option<String>,
    pub backend: BackendType,
    pub encrypted: bool,
    pub redis_version: Option<String>,
    pub storage_path: Option<String>,
    pub size_on_disk: Option<u64>,
    pub key_count: Option<u64>,
    pub created_at: u64,
    pub last_access: Option<u64>,
}

/// Access permissions for database keys
#[derive(Debug, Clone, Serialize, Deserialize, PartialEq)]
pub enum Permissions {
    Read,
    ReadWrite,
}

/// Access key information
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AccessKey {
    pub hash: String,
    pub permissions: Permissions,
    pub created_at: u64,
}

/// Database metadata containing access keys
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DatabaseMeta {
    pub public: bool,
    pub keys: HashMap<String, AccessKey>,
}
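`DatabaseMeta` is serialized to a per-database `<id>_meta.json` file (see `save_meta_static` below). A sketch of such a file for a private database, with placeholder hash and timestamp:

```json
{
  "public": false,
  "keys": {
    "9f86d08...": {
      "hash": "9f86d08...",
      "permissions": "ReadWrite",
      "created_at": 1700000000
    }
  }
}
```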
/// Access key information returned by RPC
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AccessKeyInfo {
    pub hash: String,
    pub permissions: Permissions,
    pub created_at: u64,
}

/// Hash a plaintext key using SHA-256
pub fn hash_key(key: &str) -> String {
    let mut hasher = Sha256::new();
    hasher.update(key.as_bytes());
    format!("{:x}", hasher.finalize())
}
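Plaintext keys never touch disk; only this digest is stored and compared. A quick sketch (key value illustrative):

```rust
// SHA-256 digest rendered as lowercase hex: 32 bytes -> 64 characters.
let hash = hash_key("mysecretkey");
assert_eq!(hash.len(), 64);
```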
/// RPC trait for HeroDB management
#[rpc(server, client, namespace = "herodb")]
pub trait Rpc {
    /// Create a new database with specified configuration
    #[method(name = "createDatabase")]
    async fn create_database(
        &self,
        backend: BackendType,
        config: DatabaseConfig,
        encryption_key: Option<String>,
    ) -> RpcResult<u64>;

    /// Set encryption for an existing database (write-only key)
    #[method(name = "setEncryption")]
    async fn set_encryption(&self, db_id: u64, encryption_key: String) -> RpcResult<bool>;

    /// List all managed databases
    #[method(name = "listDatabases")]
    async fn list_databases(&self) -> RpcResult<Vec<DatabaseInfo>>;

    /// Get detailed information about a specific database
    #[method(name = "getDatabaseInfo")]
    async fn get_database_info(&self, db_id: u64) -> RpcResult<DatabaseInfo>;

    /// Delete a database
    #[method(name = "deleteDatabase")]
    async fn delete_database(&self, db_id: u64) -> RpcResult<bool>;

    /// Get server statistics
    #[method(name = "getServerStats")]
    async fn get_server_stats(&self) -> RpcResult<HashMap<String, serde_json::Value>>;

    /// Add an access key to a database
    #[method(name = "addAccessKey")]
    async fn add_access_key(&self, db_id: u64, key: String, permissions: String) -> RpcResult<bool>;

    /// Delete an access key from a database
    #[method(name = "deleteAccessKey")]
    async fn delete_access_key(&self, db_id: u64, key_hash: String) -> RpcResult<bool>;

    /// List all access keys for a database
    #[method(name = "listAccessKeys")]
    async fn list_access_keys(&self, db_id: u64) -> RpcResult<Vec<AccessKeyInfo>>;

    /// Set database public/private status
    #[method(name = "setDatabasePublic")]
    async fn set_database_public(&self, db_id: u64, public: bool) -> RpcResult<bool>;
}
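With jsonrpsee's `namespace = "herodb"`, each method is exposed as `herodb_<name>`. A sketch of a `createDatabase` call against the HTTP endpoint (port and field values illustrative):

```bash
curl -s http://127.0.0.1:8080 -H 'Content-Type: application/json' -d '{
  "jsonrpc": "2.0", "id": 1, "method": "herodb_createDatabase",
  "params": ["Redb",
             {"name": "mydb", "storage_path": null, "max_size": null, "redis_version": null},
             null]
}'
```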
/// RPC Server implementation
pub struct RpcServerImpl {
    /// Base directory for database files
    base_dir: String,
    /// Managed database servers
    servers: Arc<RwLock<HashMap<u64, Arc<Server>>>>,
    /// Next unencrypted database ID to assign
    next_unencrypted_id: Arc<RwLock<u64>>,
    /// Next encrypted database ID to assign
    next_encrypted_id: Arc<RwLock<u64>>,
    /// Default backend type
    backend: crate::options::BackendType,
    /// Encryption keys for databases
    encryption_keys: Arc<RwLock<HashMap<u64, Option<String>>>>,
}

impl RpcServerImpl {
    /// Create a new RPC server instance
    pub fn new(base_dir: String, backend: crate::options::BackendType) -> Self {
        Self {
            base_dir,
            servers: Arc::new(RwLock::new(HashMap::new())),
            next_unencrypted_id: Arc::new(RwLock::new(0)),
            next_encrypted_id: Arc::new(RwLock::new(10)),
            backend,
            encryption_keys: Arc::new(RwLock::new(HashMap::new())),
        }
    }

    /// Get or create a server instance for the given database ID
    async fn get_or_create_server(&self, db_id: u64) -> Result<Arc<Server>, jsonrpsee::types::ErrorObjectOwned> {
        // Check if server already exists
        {
            let servers = self.servers.read().await;
            if let Some(server) = servers.get(&db_id) {
                return Ok(server.clone());
            }
        }

        // Check if database file exists
        let db_path = std::path::PathBuf::from(&self.base_dir).join(format!("{}.db", db_id));
        if !db_path.exists() {
            return Err(jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Database {} not found", db_id),
                None::<()>
            ));
        }

        // Create server instance with default options
        let db_option = DBOption {
            dir: self.base_dir.clone(),
            port: 0, // Not used for RPC-managed databases
            debug: false,
            encryption_key: None,
            encrypt: false,
            backend: self.backend.clone(),
        };

        let mut server = Server::new(db_option).await;

        // Set the selected database to the db_id for proper file naming
        server.selected_db = db_id;

        // Store the server
        let mut servers = self.servers.write().await;
        servers.insert(db_id, Arc::new(server.clone()));

        Ok(Arc::new(server))
    }

    /// Discover existing database files in the base directory
    async fn discover_databases(&self) -> Vec<u64> {
        let mut db_ids = Vec::new();

        if let Ok(entries) = std::fs::read_dir(&self.base_dir) {
            for entry in entries.flatten() {
                if let Ok(file_name) = entry.file_name().into_string() {
                    // Check if it's a database file (ends with .db)
                    if file_name.ends_with(".db") {
                        // Extract database ID from filename (e.g., "11.db" -> 11)
                        if let Some(id_str) = file_name.strip_suffix(".db") {
                            if let Ok(db_id) = id_str.parse::<u64>() {
                                db_ids.push(db_id);
                            }
                        }
                    }
                }
            }
        }

        db_ids
    }

    /// Get the next available database ID
    async fn get_next_db_id(&self, is_encrypted: bool) -> u64 {
        if is_encrypted {
            let mut id = self.next_encrypted_id.write().await;
            let current_id = *id;
            *id += 1;
            current_id
        } else {
            let mut id = self.next_unencrypted_id.write().await;
            let current_id = *id;
            *id += 1;
            current_id
        }
    }
/// Load database metadata from file (static version)
|
||||||
|
pub async fn load_meta_static(base_dir: &str, db_id: u64) -> Result<DatabaseMeta, jsonrpsee::types::ErrorObjectOwned> {
|
||||||
|
let meta_path = std::path::PathBuf::from(base_dir).join(format!("{}_meta.json", db_id));
|
||||||
|
|
||||||
|
// If meta file doesn't exist, return default
|
||||||
|
if !meta_path.exists() {
|
||||||
|
return Ok(DatabaseMeta {
|
||||||
|
public: true,
|
||||||
|
keys: HashMap::new(),
|
||||||
|
});
|
||||||
|
}
|
||||||
|
|
||||||
|
// Read file
|
||||||
|
let content = std::fs::read(&meta_path)
|
||||||
|
.map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
|
||||||
|
-32000,
|
||||||
|
format!("Failed to read meta file: {}", e),
|
||||||
|
None::<()>
|
||||||
|
))?;
|
||||||
|
|
||||||
|
let json_str = String::from_utf8(content)
|
||||||
|
.map_err(|_| jsonrpsee::types::ErrorObjectOwned::owned(
|
||||||
|
-32000,
|
||||||
|
"Invalid UTF-8 in meta file",
|
||||||
|
None::<()>
|
||||||
|
))?;
|
||||||
|
|
||||||
|
serde_json::from_str(&json_str)
|
||||||
|
.map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
|
||||||
|
-32000,
|
||||||
|
format!("Failed to parse meta JSON: {}", e),
|
||||||
|
None::<()>
|
||||||
|
))
|
||||||
|
}
|
||||||
|
|
||||||
|
    /// Load database metadata from file
    async fn load_meta(&self, db_id: u64) -> Result<DatabaseMeta, jsonrpsee::types::ErrorObjectOwned> {
        let meta_path = std::path::PathBuf::from(&self.base_dir).join(format!("{}_meta.json", db_id));

        // If meta file doesn't exist, create default
        if !meta_path.exists() {
            let default_meta = DatabaseMeta {
                public: true,
                keys: HashMap::new(),
            };
            self.save_meta(db_id, &default_meta).await?;
            return Ok(default_meta);
        }

        // Read and potentially decrypt
        let content = std::fs::read(&meta_path)
            .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Failed to read meta file: {}", e),
                None::<()>
            ))?;

        let json_str = if db_id >= 10 {
            // Encrypted database, decrypt meta
            if let Some(key) = self.encryption_keys.read().await.get(&db_id).and_then(|k| k.as_ref()) {
                use crate::crypto::CryptoFactory;
                let crypto = CryptoFactory::new(key.as_bytes());
                String::from_utf8(crypto.decrypt(&content)
                    .map_err(|_| jsonrpsee::types::ErrorObjectOwned::owned(
                        -32000,
                        "Failed to decrypt meta file",
                        None::<()>
                    ))?)
                    .map_err(|_| jsonrpsee::types::ErrorObjectOwned::owned(
                        -32000,
                        "Invalid UTF-8 in decrypted meta",
                        None::<()>
                    ))?
            } else {
                return Err(jsonrpsee::types::ErrorObjectOwned::owned(
                    -32000,
                    "Encryption key not found for encrypted database",
                    None::<()>
                ));
            }
        } else {
            String::from_utf8(content)
                .map_err(|_| jsonrpsee::types::ErrorObjectOwned::owned(
                    -32000,
                    "Invalid UTF-8 in meta file",
                    None::<()>
                ))?
        };

        serde_json::from_str(&json_str)
            .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Failed to parse meta JSON: {}", e),
                None::<()>
            ))
    }

    /// Save database metadata to file (static version)
    pub async fn save_meta_static(base_dir: &str, db_id: u64, meta: &DatabaseMeta) -> Result<(), jsonrpsee::types::ErrorObjectOwned> {
        let meta_path = std::path::PathBuf::from(base_dir).join(format!("{}_meta.json", db_id));

        let json_str = serde_json::to_string(meta)
            .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Failed to serialize meta: {}", e),
                None::<()>
            ))?;

        std::fs::write(&meta_path, json_str)
            .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Failed to write meta file: {}", e),
                None::<()>
            ))?;

        Ok(())
    }

    /// Save database metadata to file
    async fn save_meta(&self, db_id: u64, meta: &DatabaseMeta) -> Result<(), jsonrpsee::types::ErrorObjectOwned> {
        let meta_path = std::path::PathBuf::from(&self.base_dir).join(format!("{}_meta.json", db_id));

        let json_str = serde_json::to_string(meta)
            .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                format!("Failed to serialize meta: {}", e),
                None::<()>
            ))?;

        if db_id >= 10 {
            // Encrypted database, encrypt meta
            if let Some(key) = self.encryption_keys.read().await.get(&db_id).and_then(|k| k.as_ref()) {
                use crate::crypto::CryptoFactory;
                let crypto = CryptoFactory::new(key.as_bytes());
                let encrypted = crypto.encrypt(json_str.as_bytes());
                std::fs::write(&meta_path, encrypted)
                    .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                        -32000,
                        format!("Failed to write encrypted meta file: {}", e),
                        None::<()>
                    ))?;
            } else {
                return Err(jsonrpsee::types::ErrorObjectOwned::owned(
                    -32000,
                    "Encryption key not found for encrypted database",
                    None::<()>
                ));
            }
        } else {
            std::fs::write(&meta_path, json_str)
                .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                    -32000,
                    format!("Failed to write meta file: {}", e),
                    None::<()>
                ))?;
        }

        Ok(())
    }
}

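Databases with `db_id >= 10` follow the server's encrypted-range convention, so their meta files go through the same `CryptoFactory` as the data files. A round-trip sketch using only the calls seen above (illustrative, inside the crate):

```rust
use crate::crypto::CryptoFactory;

/// Illustrative check: meta JSON should survive an encrypt/decrypt round trip.
fn meta_roundtrip_ok(key: &str, meta_json: &str) -> bool {
    let crypto = CryptoFactory::new(key.as_bytes());
    let ciphertext = crypto.encrypt(meta_json.as_bytes());
    match crypto.decrypt(&ciphertext) {
        Ok(plaintext) => plaintext == meta_json.as_bytes(),
        Err(_) => false,
    }
}
```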
#[jsonrpsee::core::async_trait]
impl RpcServer for RpcServerImpl {
    async fn create_database(
        &self,
        backend: BackendType,
        config: DatabaseConfig,
        encryption_key: Option<String>,
    ) -> RpcResult<u64> {
        let db_id = self.get_next_db_id(encryption_key.is_some()).await;

        // Handle both Redb and Sled backends
        match backend {
            BackendType::Redb | BackendType::Sled => {
                // Create database directory
                let db_dir = if let Some(path) = &config.storage_path {
                    std::path::PathBuf::from(path)
                } else {
                    std::path::PathBuf::from(&self.base_dir).join(format!("rpc_db_{}", db_id))
                };

                // Ensure directory exists
                std::fs::create_dir_all(&db_dir)
                    .map_err(|e| jsonrpsee::types::ErrorObjectOwned::owned(
                        -32000,
                        format!("Failed to create directory: {}", e),
                        None::<()>
                    ))?;

                // Create DB options
                let encrypt = encryption_key.is_some();
                let option = DBOption {
                    dir: db_dir.to_string_lossy().to_string(),
                    port: 0, // Not used for RPC-managed databases
                    debug: false,
                    encryption_key: encryption_key.clone(),
                    encrypt,
                    backend: match backend {
                        BackendType::Redb => crate::options::BackendType::Redb,
                        BackendType::Sled => crate::options::BackendType::Sled,
                    },
                };

                // Create server instance
                let mut server = Server::new(option).await;

                // Set the selected database to the db_id for proper file naming
                server.selected_db = db_id;

                // Initialize the storage to create the database file
                let _ = server.current_storage();

                // Store the encryption key
                {
                    let mut keys = self.encryption_keys.write().await;
                    keys.insert(db_id, encryption_key.clone());
                }

                // Initialize meta file
                let meta = DatabaseMeta {
                    public: true,
                    keys: HashMap::new(),
                };
                self.save_meta(db_id, &meta).await?;

                // Store the server
                let mut servers = self.servers.write().await;
                servers.insert(db_id, Arc::new(server));

                Ok(db_id)
            }
        }
    }

    async fn set_encryption(&self, db_id: u64, _encryption_key: String) -> RpcResult<bool> {
        // Note: In a real implementation, we'd need to modify the existing database.
        // For now, return false as encryption can only be set during creation.
        let _servers = self.servers.read().await;
        // TODO: Implement encryption setting for existing databases
        Ok(false)
    }

    async fn list_databases(&self) -> RpcResult<Vec<DatabaseInfo>> {
        let db_ids = self.discover_databases().await;
        let mut result = Vec::new();

        for db_id in db_ids {
            // Try to get or create a server for this database
            if let Ok(server) = self.get_or_create_server(db_id).await {
                let backend = match server.option.backend {
                    crate::options::BackendType::Redb => BackendType::Redb,
                    crate::options::BackendType::Sled => BackendType::Sled,
                };

                let info = DatabaseInfo {
                    id: db_id,
                    name: None, // TODO: Store name in server metadata
                    backend,
                    encrypted: server.option.encrypt,
                    redis_version: Some("7.0".to_string()), // Default Redis compatibility
                    storage_path: Some(server.option.dir.clone()),
                    size_on_disk: None, // TODO: Calculate actual size
                    key_count: None, // TODO: Get key count from storage
                    created_at: std::time::SystemTime::now()
                        .duration_since(std::time::UNIX_EPOCH)
                        .unwrap()
                        .as_secs(),
                    last_access: None,
                };
                result.push(info);
            }
        }

        Ok(result)
    }

    async fn get_database_info(&self, db_id: u64) -> RpcResult<DatabaseInfo> {
        let server = self.get_or_create_server(db_id).await?;

        let backend = match server.option.backend {
            crate::options::BackendType::Redb => BackendType::Redb,
            crate::options::BackendType::Sled => BackendType::Sled,
        };

        Ok(DatabaseInfo {
            id: db_id,
            name: None,
            backend,
            encrypted: server.option.encrypt,
            redis_version: Some("7.0".to_string()),
            storage_path: Some(server.option.dir.clone()),
            size_on_disk: None,
            key_count: None,
            created_at: std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .unwrap()
                .as_secs(),
            last_access: None,
        })
    }

    async fn delete_database(&self, db_id: u64) -> RpcResult<bool> {
        let mut servers = self.servers.write().await;

        if let Some(_server) = servers.remove(&db_id) {
            // Clean up database files
            let db_path = std::path::PathBuf::from(&self.base_dir).join(format!("{}.db", db_id));
            if db_path.exists() {
                if db_path.is_dir() {
                    std::fs::remove_dir_all(&db_path).ok();
                } else {
                    std::fs::remove_file(&db_path).ok();
                }
            }
            Ok(true)
        } else {
            Ok(false)
        }
    }

    async fn get_server_stats(&self) -> RpcResult<HashMap<String, serde_json::Value>> {
        let db_ids = self.discover_databases().await;
        let mut stats = HashMap::new();

        stats.insert("total_databases".to_string(), serde_json::json!(db_ids.len()));
        stats.insert("uptime".to_string(), serde_json::json!(
            std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .unwrap()
                .as_secs()
        ));

        Ok(stats)
    }

    async fn add_access_key(&self, db_id: u64, key: String, permissions: String) -> RpcResult<bool> {
        let mut meta = self.load_meta(db_id).await?;

        let perms = match permissions.to_lowercase().as_str() {
            "read" => Permissions::Read,
            "readwrite" => Permissions::ReadWrite,
            _ => return Err(jsonrpsee::types::ErrorObjectOwned::owned(
                -32000,
                "Invalid permissions: use 'read' or 'readwrite'",
                None::<()>
            )),
        };

        let hash = hash_key(&key);
        let access_key = AccessKey {
            hash: hash.clone(),
            permissions: perms,
            created_at: std::time::SystemTime::now()
                .duration_since(std::time::UNIX_EPOCH)
                .unwrap()
                .as_secs(),
        };

        meta.keys.insert(hash, access_key);
        self.save_meta(db_id, &meta).await?;
        Ok(true)
    }

    async fn delete_access_key(&self, db_id: u64, key_hash: String) -> RpcResult<bool> {
        let mut meta = self.load_meta(db_id).await?;

        if meta.keys.remove(&key_hash).is_some() {
            // If no keys are left, make the database public again
            if meta.keys.is_empty() {
                meta.public = true;
            }
            self.save_meta(db_id, &meta).await?;
            Ok(true)
        } else {
            Ok(false)
        }
    }

    async fn list_access_keys(&self, db_id: u64) -> RpcResult<Vec<AccessKeyInfo>> {
        let meta = self.load_meta(db_id).await?;
        let keys: Vec<AccessKeyInfo> = meta.keys.values()
            .map(|k| AccessKeyInfo {
                hash: k.hash.clone(),
                permissions: k.permissions.clone(),
                created_at: k.created_at,
            })
            .collect();
        Ok(keys)
    }

    async fn set_database_public(&self, db_id: u64, public: bool) -> RpcResult<bool> {
        let mut meta = self.load_meta(db_id).await?;
        meta.public = public;
        self.save_meta(db_id, &meta).await?;
        Ok(true)
    }
}
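These management methods are what the generated `RpcClient` trait (imported by the tests below) exposes over HTTP and WebSocket. A minimal client sketch, assuming a server already listening on 127.0.0.1:8080 and the default jsonrpsee method naming:

```rust
use herodb::rpc::{RpcClient, BackendType, DatabaseConfig};
use jsonrpsee::http_client::HttpClientBuilder;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // The address is a placeholder; point it at a running RPC server.
    let client = HttpClientBuilder::default().build("http://127.0.0.1:8080")?;

    // Create an unencrypted database on the redb backend.
    let config = DatabaseConfig {
        name: Some("demo".to_string()),
        storage_path: None,
        max_size: None,
        redis_version: Some("7.0".to_string()),
    };
    let db_id = client.create_database(BackendType::Redb, config, None).await?;

    // Grant a read-only access key, then list what is registered.
    client.add_access_key(db_id, "s3cret".to_string(), "read".to_string()).await?;
    let keys = client.list_access_keys(db_id).await?;
    println!("db {} has {} access key(s)", db_id, keys.len());
    Ok(())
}
```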
49 src/rpc_server.rs (new file)
@@ -0,0 +1,49 @@
use std::net::SocketAddr;
use jsonrpsee::server::{ServerBuilder, ServerHandle};
use jsonrpsee::RpcModule;

use crate::rpc::{RpcServer, RpcServerImpl};

/// Start the RPC server on the specified address
pub async fn start_rpc_server(addr: SocketAddr, base_dir: String, backend: crate::options::BackendType) -> Result<ServerHandle, Box<dyn std::error::Error + Send + Sync>> {
    // Create the RPC server implementation
    let rpc_impl = RpcServerImpl::new(base_dir, backend);

    // Create the RPC module
    let mut module = RpcModule::new(());
    module.merge(RpcServer::into_rpc(rpc_impl))?;

    // Build the server with both HTTP and WebSocket support
    let server = ServerBuilder::default()
        .build(addr)
        .await?;

    // Start the server
    let handle = server.start(module);

    println!("RPC server started on {}", addr);

    Ok(handle)
}

#[cfg(test)]
mod tests {
    use super::*;
    use std::time::Duration;

    #[tokio::test]
    async fn test_rpc_server_startup() {
        let addr = "127.0.0.1:0".parse().unwrap(); // Use port 0 for auto-assignment
        let base_dir = "/tmp/test_rpc".to_string();
        let backend = crate::options::BackendType::Redb; // Default for test

        let handle = start_rpc_server(addr, base_dir, backend).await.unwrap();

        // Give the server a moment to start
        tokio::time::sleep(Duration::from_millis(100)).await;

        // Stop the server
        handle.stop().unwrap();
        handle.stopped().await;
    }
}
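A sketch of how a binary might wire this up, assuming `rpc_server` is exported as a public module of the `herodb` crate; the address and directory are placeholders:

```rust
use std::net::SocketAddr;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
    let addr: SocketAddr = "127.0.0.1:8080".parse()?;

    // Start the management RPC server alongside the RESP listener.
    let handle = herodb::rpc_server::start_rpc_server(
        addr,
        "/tmp/herodb".to_string(),
        herodb::options::BackendType::Redb,
    )
    .await?;

    // Block until the handle is stopped (e.g. from a shutdown path).
    handle.stopped().await;
    Ok(())
}
```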
src/server.rs
@@ -9,7 +9,6 @@ use std::sync::atomic::{AtomicU64, Ordering};
 use crate::cmd::Cmd;
 use crate::error::DBError;
-use crate::lance_store::LanceStore;
 use crate::options;
 use crate::protocol::Protocol;
 use crate::storage::Storage;
@@ -23,13 +22,11 @@ pub struct Server {
     pub client_name: Option<String>,
     pub selected_db: u64, // Changed from usize to u64
     pub queued_cmd: Option<Vec<(Cmd, Protocol)>>,
+    pub current_permissions: Option<crate::rpc::Permissions>,
 
     // BLPOP waiter registry: per (db_index, key) FIFO of waiters
     pub list_waiters: Arc<Mutex<HashMap<u64, HashMap<String, Vec<Waiter>>>>>,
     pub waiter_seq: Arc<AtomicU64>,
-
-    // Lance vector store
-    pub lance_store: Option<Arc<LanceStore>>,
 }
 
 pub struct Waiter {
@@ -46,36 +43,19 @@ pub enum PopSide {
 
 impl Server {
     pub async fn new(option: options::DBOption) -> Self {
-        // Initialize Lance store
-        let lance_data_dir = std::path::PathBuf::from(&option.dir).join("lance");
-        let lance_store = match LanceStore::new(lance_data_dir).await {
-            Ok(store) => Some(Arc::new(store)),
-            Err(e) => {
-                eprintln!("Warning: Failed to initialize Lance store: {}", e.0);
-                None
-            }
-        };
-
         Server {
             db_cache: Arc::new(std::sync::RwLock::new(HashMap::new())),
             option,
            client_name: None,
             selected_db: 0,
             queued_cmd: None,
+            current_permissions: None,
 
             list_waiters: Arc::new(Mutex::new(HashMap::new())),
             waiter_seq: Arc::new(AtomicU64::new(1)),
-            lance_store,
         }
     }
 
-    pub fn lance_store(&self) -> Result<Arc<LanceStore>, DBError> {
-        self.lance_store
-            .as_ref()
-            .cloned()
-            .ok_or_else(|| DBError("Lance store not initialized".to_string()))
-    }
-
     pub fn current_storage(&self) -> Result<Arc<dyn StorageBackend>, DBError> {
         let mut cache = self.db_cache.write().unwrap();
 
@@ -123,6 +103,16 @@ impl Server {
         self.option.encrypt && db_index >= 10
     }
 
+    /// Check if current permissions allow read operations
+    pub fn has_read_permission(&self) -> bool {
+        matches!(self.current_permissions, Some(crate::rpc::Permissions::Read) | Some(crate::rpc::Permissions::ReadWrite))
+    }
+
+    /// Check if current permissions allow write operations
+    pub fn has_write_permission(&self) -> bool {
+        matches!(self.current_permissions, Some(crate::rpc::Permissions::ReadWrite))
+    }
+
     // ----- BLPOP waiter helpers -----
 
     pub async fn register_waiter(&self, db_index: u64, key: &str, side: PopSide) -> (u64, oneshot::Receiver<(String, String)>) {
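With `current_permissions` populated from an authenticated access key, the command dispatcher can gate reads and writes on these helpers. A minimal sketch (inside the crate; the `is_write` classification is illustrative, not part of this diff):

```rust
// Reject a command when the connection's permissions don't cover it.
fn check_access(server: &Server, is_write: bool) -> Result<(), DBError> {
    if is_write && !server.has_write_permission() {
        return Err(DBError("write permission denied".to_string()));
    }
    if !server.has_read_permission() {
        return Err(DBError("read permission denied".to_string()));
    }
    Ok(())
}
```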
62 tests/rpc_tests.rs (new file)
@@ -0,0 +1,62 @@
use std::net::SocketAddr;
use jsonrpsee::http_client::HttpClientBuilder;
use jsonrpsee::core::client::ClientT;
use serde_json::json;

use herodb::rpc::{RpcClient, BackendType, DatabaseConfig};

#[tokio::test]
async fn test_rpc_server_basic() {
    // This test would require starting the RPC server in a separate thread.
    // For now, we just test that the types serialize correctly.
    let backend = BackendType::Redb;
    let config = DatabaseConfig {
        name: Some("test_db".to_string()),
        storage_path: Some("/tmp/test".to_string()),
        max_size: Some(1024 * 1024),
        redis_version: Some("7.0".to_string()),
    };

    let backend_json = serde_json::to_string(&backend).unwrap();
    let config_json = serde_json::to_string(&config).unwrap();

    assert_eq!(backend_json, "\"Redb\"");
    assert!(config_json.contains("test_db"));
}

#[tokio::test]
async fn test_database_config_serialization() {
    let config = DatabaseConfig {
        name: Some("my_db".to_string()),
        storage_path: None,
        max_size: Some(1000000),
        redis_version: Some("7.0".to_string()),
    };

    let json = serde_json::to_value(&config).unwrap();
    assert_eq!(json["name"], "my_db");
    assert_eq!(json["max_size"], 1000000);
    assert_eq!(json["redis_version"], "7.0");
}

#[tokio::test]
async fn test_backend_type_serialization() {
    // Test that both Redb and Sled backends serialize correctly
    let redb_backend = BackendType::Redb;
    let sled_backend = BackendType::Sled;

    let redb_json = serde_json::to_string(&redb_backend).unwrap();
    let sled_json = serde_json::to_string(&sled_backend).unwrap();

    assert_eq!(redb_json, "\"Redb\"");
    assert_eq!(sled_json, "\"Sled\"");

    // Test deserialization
    let redb_deserialized: BackendType = serde_json::from_str(&redb_json).unwrap();
    let sled_deserialized: BackendType = serde_json::from_str(&sled_json).unwrap();

    assert!(matches!(redb_deserialized, BackendType::Redb));
    assert!(matches!(sled_deserialized, BackendType::Sled));
}
@@ -501,11 +501,11 @@ async fn test_07_age_stateless_suite() {
     let mut s = connect(port).await;
 
     // GENENC -> [recipient, identity]
-    let gen = send_cmd(&mut s, &["AGE", "GENENC"]).await;
+    let genenc = send_cmd(&mut s, &["AGE", "GENENC"]).await;
     assert!(
-        gen.starts_with("*2\r\n$"),
+        genenc.starts_with("*2\r\n$"),
         "AGE GENENC should return array [recipient, identity], got:\n{}",
-        gen
+        genenc
     );
 
     // Parse simple RESP array of two bulk strings to extract keys
@@ -520,7 +520,7 @@ async fn test_07_age_stateless_suite() {
         let ident = lines.next().unwrap_or("").to_string();
         (recip, ident)
     }
-    let (recipient, identity) = parse_two_bulk_array(&gen);
+    let (recipient, identity) = parse_two_bulk_array(&genenc);
     assert!(
         recipient.starts_with("age1") && identity.starts_with("AGE-SECRET-KEY-1"),
         "Unexpected AGE key formats.\nrecipient: {}\nidentity: {}",