Tantivy Full-Text Backend (JSON-RPC)

This document explains how to use HeroDB's Tantivy-backed full-text search as a dedicated database backend and provides copy-pasteable JSON-RPC requests. Tantivy is available only for non-admin databases (db_id >= 1). Admin DB 0 always uses Redb/Sled and rejects FT operations.

Important characteristics:

  • Tantivy is a third backend alongside Redb and Sled. It provides search indexes only; there is no KV store backing it.
  • On Tantivy databases, Redis KV/list/hash commands are rejected; only FT commands and basic control (SELECT, CLIENT, INFO, etc.) are allowed.
  • FT JSON-RPC methods live in the "herodb" namespace and are named with an underscore separator: herodb_ftCreate, herodb_ftAdd, herodb_ftSearch, herodb_ftDel, herodb_ftInfo, herodb_ftDrop.

Reference to server implementation:

Notes on responses:

  • ftCreate/ftAdd/ftDel/ftDrop return a JSON boolean: true on success.
  • ftSearch/ftInfo return a JSON object with a single key "resp" containing a RESP-encoded string (wire format used by Redis). You can display or parse it on the client side as needed.
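
Since ftSearch and ftInfo hand back raw RESP text, a client has to decode it before it can use the hits. The following is a minimal sketch of a RESP2 decoder; the name parse_resp and the sample payload are illustrative assumptions, and the exact array layout HeroDB returns is not specified in this document.

```python
def _read_line(buf: bytes, pos: int):
    """Return the bytes up to the next CRLF and the position just after it."""
    end = buf.index(b"\r\n", pos)
    return buf[pos:end], end + 2


def _parse(buf: bytes, pos: int):
    """Decode one RESP2 value starting at pos; return (value, next_pos)."""
    prefix = buf[pos:pos + 1]
    line, pos = _read_line(buf, pos + 1)
    if prefix == b"+":                      # simple string
        return line.decode(), pos
    if prefix == b"-":                      # error reply
        raise ValueError(line.decode())
    if prefix == b":":                      # integer
        return int(line), pos
    if prefix == b"$":                      # bulk string, length in bytes
        length = int(line)
        if length == -1:
            return None, pos
        body = buf[pos:pos + length]
        return body.decode(), pos + length + 2
    if prefix == b"*":                      # array of nested values
        count = int(line)
        if count == -1:
            return None, pos
        items = []
        for _ in range(count):
            item, pos = _parse(buf, pos)
            items.append(item)
        return items, pos
    raise ValueError(f"unknown RESP type prefix: {prefix!r}")


def parse_resp(data: str):
    """Decode the string found under the "resp" key of a JSON-RPC result."""
    value, _ = _parse(data.encode("utf-8"), 0)
    return value


# Hypothetical payload, only to show the call shape.
sample = "*2\r\n:1\r\n$9\r\nproduct:1\r\n"
print(parse_resp(sample))
```

Bulk-string lengths are byte counts, which is why the sketch works on the UTF-8 encoded bytes rather than on the Python string.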

RESP usage (redis-cli):

  • For RESP clients, you must SELECT the Tantivy database first. SELECT now succeeds for Tantivy DBs without opening KV storage.
  • After SELECT, you can run FT.* commands within that DB context.

Example with redis-cli:

# Connect to server
redis-cli -p 6379

# Select Tantivy DB 1 (public by default)
SELECT 1
# → OK

# Create index
FT.CREATE product_catalog SCHEMA title TEXT description TEXT category TAG price NUMERIC rating NUMERIC location GEO
# → OK

# Add a document
FT.ADD product_catalog product:1 1.0 title "Wireless Bluetooth Headphones" description "Premium noise-canceling headphones with 30-hour battery life" category "electronics,audio" price 299.99 rating 4.5 location "37.7749,-122.4194"
# → OK

# Search
FT.SEARCH product_catalog wireless LIMIT 0 3
# → RESP array with hits

Storage layout (on disk):

  • Indices are stored per database under:
    • <base_dir>/search_indexes/<db_id>/<index_name>
  • Example: /tmp/test/search_indexes/1/product_catalog
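
As a small illustration of the layout above, the index directory can be derived from its three components. This is a sketch only; index_dir is a hypothetical helper name, not a HeroDB function.

```python
from pathlib import PurePosixPath


def index_dir(base_dir: str, db_id: int, index_name: str) -> PurePosixPath:
    """Build <base_dir>/search_indexes/<db_id>/<index_name>."""
    return PurePosixPath(base_dir) / "search_indexes" / str(db_id) / index_name


print(index_dir("/tmp/test", 1, "product_catalog"))
```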

  1. Create a new Tantivy database

Use herodb_createDatabase with backend "Tantivy". DB 0 cannot be Tantivy.

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "herodb_createDatabase",
  "params": [
    "Tantivy",
    { "name": "search-db", "storage_path": null, "max_size": null, "redis_version": null },
    null
  ]
}

The response contains the allocated db_id (>= 1). Use that id in the calls below.
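
The request above can also be assembled programmatically. The sketch below only builds the JSON-RPC 2.0 payload; how it is sent depends on the server's RPC transport, which this document does not specify. The helper names rpc_request and create_tantivy_db are assumptions for illustration.

```python
import itertools
import json

_ids = itertools.count(1)  # simple monotonically increasing request ids


def rpc_request(method: str, params: list) -> dict:
    """Wrap a method name and positional params in a JSON-RPC 2.0 envelope."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def create_tantivy_db(name: str) -> dict:
    """Build a herodb_createDatabase request for a Tantivy-backed database."""
    config = {
        "name": name,
        "storage_path": None,   # serialized as JSON null
        "max_size": None,
        "redis_version": None,
    }
    return rpc_request("herodb_createDatabase", ["Tantivy", config, None])


print(json.dumps(create_tantivy_db("search-db"), indent=2))
```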

  2. FT.CREATE: create an index with schema

Method: herodb_ftCreate → rust.fn ft_create()

Schema format is an array of tuples: [ [field_name, field_type, [options...] ], ... ]

  • Supported field types: "TEXT", "NUMERIC" (defaults to F64), "TAG", "GEO"
  • Supported options (subset): "WEIGHT", "SORTABLE", "NOINDEX", "SEPARATOR", "CASESENSITIVE"

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "herodb_ftCreate",
  "params": [
    1,
    "product_catalog",
    [
      ["title", "TEXT", ["SORTABLE"]],
      ["description", "TEXT", []],
      ["category", "TAG", ["SEPARATOR", ","]],
      ["price", "NUMERIC", ["SORTABLE"]],
      ["rating", "NUMERIC", []],
      ["location", "GEO", []]
    ]
  ]
}

Returns: true on success.
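
A client can catch schema mistakes before the request is sent. The following sketch builds the schema tuples and rejects unknown field types and DB 0; field and ft_create_params are hypothetical helper names, and the checks mirror only what this document states, not the server's full validation.

```python
FIELD_TYPES = {"TEXT", "NUMERIC", "TAG", "GEO"}


def field(name: str, field_type: str, *options: str) -> list:
    """Return one schema tuple: [field_name, field_type, [options...]]."""
    if field_type not in FIELD_TYPES:
        raise ValueError(f"unsupported field type: {field_type}")
    return [name, field_type, list(options)]


def ft_create_params(db_id: int, index_name: str, fields: list) -> list:
    """Return the positional params for herodb_ftCreate."""
    if db_id < 1:
        raise ValueError("FT operations are rejected on admin DB 0")
    return [db_id, index_name, fields]


schema = [
    field("title", "TEXT", "SORTABLE"),
    field("category", "TAG", "SEPARATOR", ","),
    field("price", "NUMERIC", "SORTABLE"),
    field("location", "GEO"),
]
print(ft_create_params(1, "product_catalog", schema))
```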

  3. FT.ADD: add or replace a document

Method: herodb_ftAdd → rust.fn ft_add()

The fields parameter is an object (map) of field_name → value; all values are sent as strings. GEO expects "lat,lon".

{
  "jsonrpc": "2.0",
  "id": 3,
  "method": "herodb_ftAdd",
  "params": [
    1,
    "product_catalog",
    "product:1",
    1.0,
    {
      "title": "Wireless Bluetooth Headphones",
      "description": "Premium noise-canceling headphones with 30-hour battery life",
      "category": "electronics,audio",
      "price": "299.99",
      "rating": "4.5",
      "location": "37.7749,-122.4194"
    }
  ]
}

Returns: true on success.
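
Because every field value travels as a string, application data usually needs a conversion step first. The sketch below is one possible approach, assuming tag lists are joined with the default "," separator and GEO values follow the "lat,lon" order stated above; to_ft_fields and geo are hypothetical helper names.

```python
def geo(lat: float, lon: float) -> str:
    """Format a coordinate pair as "lat,lon", rejecting out-of-range values."""
    if not -90.0 <= lat <= 90.0:
        raise ValueError(f"latitude out of range: {lat}")
    if not -180.0 <= lon <= 180.0:
        raise ValueError(f"longitude out of range: {lon}")
    return f"{lat},{lon}"


def to_ft_fields(doc: dict) -> dict:
    """Convert a dict of Python values into the all-strings map ftAdd expects."""
    fields = {}
    for name, value in doc.items():
        if isinstance(value, (list, tuple)):
            # Assumption: multi-valued tags use the default "," separator.
            fields[name] = ",".join(str(item) for item in value)
        else:
            fields[name] = str(value)
    return fields


print(to_ft_fields({
    "title": "Wireless Bluetooth Headphones",
    "category": ["electronics", "audio"],
    "price": 299.99,
    "location": geo(37.7749, -122.4194),
}))
```

The range check in geo is a cheap way to catch swapped coordinates, since a longitude such as -122.4194 is not a valid latitude.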

  4. FT.SEARCH: query an index

Method: herodb_ftSearch → rust.fn ft_search()

Parameters: (db_id, index_name, query, filters?, limit?, offset?, return_fields?)

  • filters: array of [field, value] pairs (Equals filter)
  • limit/offset: numbers (defaults: limit=10, offset=0)
  • return_fields: array of field names to include (optional)

Simple query:

{
  "jsonrpc": "2.0",
  "id": 4,
  "method": "herodb_ftSearch",
  "params": [1, "product_catalog", "wireless", null, 10, 0, null]
}

Pagination + filters + selected fields:

{
  "jsonrpc": "2.0",
  "id": 5,
  "method": "herodb_ftSearch",
  "params": [
    1,
    "product_catalog",
    "mouse",
    [["category", "electronics"]],
    5,
    0,
    ["title", "price", "rating"]
  ]
}

Response shape:

{
  "jsonrpc": "2.0",
  "id": 5,
  "result": { "resp": "*...RESP encoded array..." }
}
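
The positional parameter list is easy to get wrong, so a small wrapper can help. This sketch fills in the documented defaults (limit=10, offset=0) and derives the offset from a page number, assuming offset counts documents; ft_search_params and page_params are hypothetical helper names.

```python
def ft_search_params(db_id, index_name, query, filters=None,
                     limit=10, offset=0, return_fields=None) -> list:
    """Return the positional params for herodb_ftSearch."""
    # filters is given as a dict and sent as [field, value] pairs (Equals filter).
    pairs = [[name, value] for name, value in filters.items()] if filters else None
    return [db_id, index_name, query, pairs, limit, offset, return_fields]


def page_params(db_id, index_name, query, page, page_size=10, **kwargs) -> list:
    """Return params for a zero-based page of results."""
    return ft_search_params(db_id, index_name, query,
                            limit=page_size, offset=page * page_size, **kwargs)


print(ft_search_params(1, "product_catalog", "wireless"))
print(page_params(1, "product_catalog", "mouse", page=2, page_size=5,
                  filters={"category": "electronics"},
                  return_fields=["title", "price", "rating"]))
```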

  5. FT.INFO: index metadata

Method: herodb_ftInfo → rust.fn ft_info()

{
  "jsonrpc": "2.0",
  "id": 6,
  "method": "herodb_ftInfo",
  "params": [1, "product_catalog"]
}

Response shape:

{
  "jsonrpc": "2.0",
  "id": 6,
  "result": { "resp": "*...RESP encoded array with fields and counts..." }
}

  6. FT.DEL: delete by doc id

Method: herodb_ftDel → rust.fn ft_del()

{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "herodb_ftDel",
  "params": [1, "product_catalog", "product:1"]
}

Returns: true on success. Note: the current implementation logs the request and returns success; the physical delete may be a no-op until delete support is finalized in the engine.

  7. FT.DROP: drop an index

Method: herodb_ftDrop → rust.fn ft_drop()

{
  "jsonrpc": "2.0",
  "id": 8,
  "method": "herodb_ftDrop",
  "params": [1, "product_catalog"]
}

Returns: true on success.

Field types and options

  • TEXT: stored/indexed/tokenized text. "SORTABLE" marks it fast (stored + fast path in our wrapper).
  • NUMERIC: stored/indexed numeric; default precision F64. "SORTABLE" enables fast column.
  • TAG: exact matching terms. Options: "SEPARATOR" (default ","), "CASESENSITIVE" (default false).
  • GEO: "lat,lon" string; stored as two numeric fields internally.
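
To make the TAG options concrete, the sketch below shows how a tag string could be split into exact-match terms using "SEPARATOR" and "CASESENSITIVE". It illustrates the documented semantics only and is not the engine's code; in particular, trimming whitespace and lowercasing when case-insensitive are assumptions.

```python
def split_tags(value: str, separator: str = ",", case_sensitive: bool = False) -> list:
    """Split a TAG value into individual terms."""
    terms = [term.strip() for term in value.split(separator)]
    terms = [term for term in terms if term]          # drop empty segments
    if not case_sensitive:
        terms = [term.lower() for term in terms]      # assumed normalization
    return terms


print(split_tags("electronics,audio"))
print(split_tags("Electronics|Audio", separator="|", case_sensitive=True))
```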

Backend and permission gating

  • FT methods are rejected on DB 0.
  • FT methods require the database backend to be Tantivy; otherwise RPC returns an error.
  • Write-like FT methods (create/add/del/drop) follow the same permission model as Redis writes on selected databases.

Troubleshooting

  • "DB backend is not Tantivy": ensure the database was created with backend "Tantivy".
  • "FT not allowed on DB 0": use a non-admin database id (>= 1).
  • Empty search results: confirm that the queried fields are tokenized/indexed (TEXT) and that documents were added successfully.
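
On the client side, these failures arrive as JSON-RPC 2.0 error objects rather than results. The sketch below unwraps a response and maps the messages listed above to their fixes. The error code -32000 in the sample is a hypothetical placeholder, the exact message text is assumed to contain the strings shown in this section, and RpcError, unwrap and hint_for are illustrative names.

```python
class RpcError(Exception):
    """Raised when a JSON-RPC response carries an "error" member."""

    def __init__(self, code, message):
        super().__init__(f"JSON-RPC error {code}: {message}")
        self.code = code
        self.message = message


HINTS = {
    "DB backend is not Tantivy": 'create the database with backend "Tantivy"',
    "FT not allowed on DB 0": "use a non-admin database id (>= 1)",
}


def unwrap(response: dict):
    """Return the "result" member, or raise RpcError if the call failed."""
    if "error" in response:
        err = response["error"]
        raise RpcError(err.get("code"), err.get("message", ""))
    return response["result"]


def hint_for(message: str):
    """Return a troubleshooting hint for a known error message, else None."""
    for needle, hint in HINTS.items():
        if needle in message:
            return hint
    return None


# Hypothetical error response, only to show the flow.
failed = {"jsonrpc": "2.0", "id": 2,
          "error": {"code": -32000, "message": "DB backend is not Tantivy"}}
try:
    unwrap(failed)
except RpcError as exc:
    print(exc, "->", hint_for(exc.message))
```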

Related docs