# LanceDB Text and Images: End-to-End Example

This guide walks through creating a Lance-backed database, ingesting two text documents and two images, searching both datasets, and cleaning up the datasets afterwards.

Prerequisites
- Build HeroDB and start the server with JSON-RPC enabled.
Commands:
```bash
cargo build --release
./target/release/herodb --dir /tmp/herodb --admin-secret mysecret --port 6379 --enable-rpc
```

We'll use:
- redis-cli for RESP commands against port 6379
- curl for JSON-RPC against port 8080, if desired
- Deterministic local embedders to avoid external dependencies: testhash (text, dim 64) and testimagehash (image, dim 512)

0) Create a Lance-backed database (JSON-RPC)
Request:
```json
{ "jsonrpc": "2.0", "id": 1, "method": "herodb_createDatabase", "params": ["Lance", { "name": "media-db", "storage_path": null, "max_size": null, "redis_version": null }, null] }
```
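The request above can be sent with curl. This is a minimal sketch, not HeroDB tooling: the function names and the HERODB_RPC_URL variable are hypothetical, and the URL path is an assumption, since this guide only states that JSON-RPC listens on port 8080. The payload builder is kept separate from the network call so the JSON can be inspected before anything is sent.

```bash
# Hypothetical helpers for this walkthrough; not part of HeroDB.

# Print the herodb_createDatabase request body for a Lance database.
# $1 = database name (assumed to need no JSON escaping).
create_db_payload() {
  printf '{ "jsonrpc": "2.0", "id": 1, "method": "herodb_createDatabase", "params": ["Lance", { "name": "%s", "storage_path": null, "max_size": null, "redis_version": null }, null] }\n' "$1"
}

# POST the payload to the JSON-RPC endpoint.
# The default URL is an assumption; override it with HERODB_RPC_URL.
create_db() {
  create_db_payload "$1" |
    curl -s -X POST -H 'Content-Type: application/json' \
      --data-binary @- "${HERODB_RPC_URL:-http://localhost:8080/}"
}

# Usage (requires a running server):
#   create_db media-db
```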
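The herodb_createDatabase response returns the new db_id (assumed to be 1 here). Select that database over RESP. As with Redis, SELECT applies only to the connection that issues it, and each redis-cli invocation opens a new connection, so run the remaining commands in one interactive redis-cli session or pass -n 1 to each call:
```bash
redis-cli -p 6379 SELECT 1
# → OK
```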
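Because every redis-cli invocation opens a fresh connection, a small wrapper can keep the database selection consistent across the remaining steps. This is a sketch only: rc, REDIS_CLI, HERODB_PORT and HERODB_DB are names chosen for this example, not HeroDB settings, and it assumes HeroDB honours the SELECT that redis-cli issues for its -n option.

```bash
# Hypothetical wrapper for this walkthrough; not part of HeroDB.
rc() {
  # -n makes redis-cli select the given database before running the command.
  "${REDIS_CLI:-redis-cli}" -p "${HERODB_PORT:-6379}" -n "${HERODB_DB:-1}" "$@"
}

# Usage (requires a running server):
#   rc LANCE.LIST
#   rc LANCE.INFO textset
```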

1) Configure embedding providers
We'll create two datasets with independent embedding configs:
- textset → provider testhash, dim 64
- imageset → provider testimagehash, dim 512

Text config:
```bash
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER testhash MODEL any PARAM dim 64
# → OK
```
Image config:
```bash
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET imageset PROVIDER testimagehash MODEL any PARAM dim 512
# → OK
```

2) Create datasets
```bash
redis-cli -p 6379 LANCE.CREATE textset DIM 64
# → OK
redis-cli -p 6379 LANCE.CREATE imageset DIM 512
# → OK
```
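The dataset DIM has to match the embedding dimension configured in step 1, so it can help to state the dimension once and derive both commands from it. The sketch below uses a hypothetical helper, lance_setup_cmds, which only prints the two commands from this guide so they can be reviewed before being run.

```bash
# Hypothetical helper: prints the CONFIG SET and CREATE commands for one
# dataset from a single dimension value, so the two cannot drift apart.
# $1 = dataset name, $2 = embedding provider, $3 = dimension
lance_setup_cmds() {
  printf 'redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET %s PROVIDER %s MODEL any PARAM dim %s\n' "$1" "$2" "$3"
  printf 'redis-cli -p 6379 LANCE.CREATE %s DIM %s\n' "$1" "$3"
}

# Print the commands for both datasets used in this guide.
lance_setup_cmds textset testhash 64
lance_setup_cmds imageset testimagehash 512

# To execute instead of print (requires a running server):
#   lance_setup_cmds textset testhash 64 | sh
```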

3) Ingest two text documents (server-side embedding)
```bash
redis-cli -p 6379 LANCE.STORE textset ID doc-1 TEXT "The quick brown fox jumps over the lazy dog" META title "Fox" category "animal"
# → OK
redis-cli -p 6379 LANCE.STORE textset ID doc-2 TEXT "A fast auburn fox vaulted a sleepy canine" META title "Paraphrase" category "animal"
# → OK
```

4) Ingest two images
An image can be supplied either as a URI (URI keyword) or as base64-encoded bytes (BYTES keyword).
Example using free placeholder images:
```bash
# Store via URI
redis-cli -p 6379 LANCE.STOREIMAGE imageset ID img-1 URI "https://picsum.photos/seed/1/256/256" META title "Seed1" group "demo"
# → OK
redis-cli -p 6379 LANCE.STOREIMAGE imageset ID img-2 URI "https://picsum.photos/seed/2/256/256" META title "Seed2" group "demo"
# → OK
```
If your environment blocks outbound HTTP, you can pass the image bytes directly instead:
```bash
# Example: base64-encode a local file (replace the path)
b64=$(base64 -w0 ./image1.png)
redis-cli -p 6379 LANCE.STOREIMAGE imageset ID img-b64-1 BYTES "$b64" META title "Local1" group "demo"
```
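Note that -w0 is a GNU coreutils option, and other base64 implementations (for example on macOS or BSD) may not accept it. A portable sketch is to strip the newlines afterwards instead, which is safe because base64 output never contains a meaningful newline. The helper name b64_file is hypothetical.

```bash
# Hypothetical helper: base64-encode a file onto a single line without
# relying on the GNU-only -w0 flag.
# $1 = path of the file to encode
b64_file() {
  base64 < "$1" | tr -d '\n'
}

# Usage (requires a running server and an existing ./image1.png):
#   b64=$(b64_file ./image1.png)
#   redis-cli -p 6379 LANCE.STOREIMAGE imageset ID img-b64-1 BYTES "$b64" META title "Local1" group "demo"
```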

5) Search text
```bash
# Top-2 nearest neighbors for a query
redis-cli -p 6379 LANCE.SEARCH textset K 2 QUERY "quick brown fox" RETURN 1 title
# → 1) [id, score, [k1,v1,...]]
```
With a filter (supports equality on schema or meta keys):
```bash
redis-cli -p 6379 LANCE.SEARCH textset K 2 QUERY "fox jumps" FILTER "category = 'animal'" RETURN 1 title
```

6) Search images
```bash
# Provide a URI as the query
redis-cli -p 6379 LANCE.SEARCHIMAGE imageset K 2 QUERYURI "https://picsum.photos/seed/1/256/256" RETURN 1 title

# Or provide base64 bytes as the query
# (-L makes curl follow any HTTP redirect the image host issues)
qb64=$(curl -sL https://picsum.photos/seed/3/256/256 | base64 -w0)
redis-cli -p 6379 LANCE.SEARCHIMAGE imageset K 2 QUERYBYTES "$qb64" RETURN 1 title
```

7) Inspect datasets
```bash
redis-cli -p 6379 LANCE.LIST
redis-cli -p 6379 LANCE.INFO textset
redis-cli -p 6379 LANCE.INFO imageset
```

8) Delete by id and drop datasets
```bash
# Delete one record
redis-cli -p 6379 LANCE.DEL textset doc-2
# → OK

# Drop entire datasets
redis-cli -p 6379 LANCE.DROP textset
redis-cli -p 6379 LANCE.DROP imageset
# → OK
```

Appendix: Using OpenAI embeddings instead of test providers
Text:
```bash
export OPENAI_API_KEY=sk-...
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small PARAM dim 512
redis-cli -p 6379 LANCE.CREATE textset DIM 512
```
Custom OpenAI-compatible endpoint:
```bash
redis-cli -p 6379 LANCE.EMBEDDING CONFIG SET textset PROVIDER openai MODEL text-embedding-3-small \
  PARAM endpoint http://localhost:8081/v1/embeddings \
  PARAM dim 512
```
Notes:
- Ensure dataset DIM matches the configured embedding dimension.
- Lance is only available for non-admin databases (db_id >= 1).
- On Lance DBs, only LANCE.* and basic control commands are allowed.