Jan De Landtsheer b75c127a55
feat(rarakoon-node): add Collapser task for WAL compaction (WP-013.1)
Add Collapser struct with run_once() method that checks if the WAL has
grown beyond a configurable threshold past the last snapshot, and if so
creates a new head snapshot, archives completed segments, and prunes old
segments while respecting segments_to_keep.

Tests: collapser_triggers_on_threshold, collapser_no_op_below_threshold,
collapser_archives_old_segments, collapser_respects_segments_to_keep,
collapser_idempotent
2026-02-14 09:12:41 +01:00
.ralph feat(rarakoon-node): add batched entry transfer source side (WP-012.4) 2026-02-14 08:46:50 +01:00
crates feat(rarakoon-node): add Collapser task for WAL compaction (WP-013.1) 2026-02-14 09:12:41 +01:00
docs/adr Rename arakoon to rarakoon, add integration tests and crate docs 2026-02-13 18:03:31 +01:00
.gitignore Rename arakoon to rarakoon, add integration tests and crate docs 2026-02-13 18:03:31 +01:00
Cargo.lock test(integration): add turmoil_bounce_with_head for ADR-011 gate 2026-02-13 23:26:46 +01:00
Cargo.toml feat(head-db): add head database crate for on-disk KV snapshots 2026-02-13 23:15:43 +01:00
README.md Rename arakoon to rarakoon, add integration tests and crate docs 2026-02-13 18:03:31 +01:00

Rarakoon

Rarakoon is a Rust port of Arakoon, a distributed key-value store that favors consistency and implements the Multi-Paxos consensus protocol.

The original Arakoon is written in OCaml on top of Tokyo Cabinet. This port replaces OCaml/Lwt with Rust/Tokio, Tokyo Cabinet with an in-memory BTreeMap, and OCaml serialization with rkyv zero-copy deserialization.

Architecture

The workspace is split into 10 crates with the following dependency graph:

rarakoon-bin
  |
  v
rarakoon-node
  |
  +---> paxos-core          (pure Multi-Paxos FSM, no I/O)
  +---> kv-store             (BTreeMap KV with ACID transactions)
  +---> write-ahead-log      (segment WAL with CRC32C + zstd)
  +---> tcp-messenger        (async node-to-node TCP messaging)
  |       +---> connection-fsm   (pure TCP connection state machine)
  |       +---> leaky-buffer     (bounded async MPSC with drop semantics)
  |       +---> wire-codec       (rkyv length-prefixed framing)
  +---> rarakoon-client      (client wire protocol codec)

Crate             Purpose
----------------  --------------------------------------------------
leaky-buffer      Bounded async MPSC channel that drops on overflow
connection-fsm    Pure TCP connection state machine with backoff
wire-codec        rkyv-based length-prefixed frame codec
write-ahead-log   Segment-based WAL with CRC32C and zstd compression
kv-store          In-memory ordered KV store with transactions
paxos-core        Pure Multi-Paxos consensus FSM (no async, no I/O)
tcp-messenger     Async node-to-node TCP messaging layer
rarakoon-client   Client protocol codec (binary wire format)
rarakoon-node     Node driver wiring FSM to I/O
rarakoon-bin      CLI entry point
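
To make the "drops on overflow" behavior of leaky-buffer concrete, here is a minimal sketch of a bounded queue that rejects new items once it is full instead of blocking the producer. It is a synchronous, single-threaded stand-in built on the standard library only. The name LeakyQueue and its methods are hypothetical and are not the crate's actual API, which is an async MPSC channel.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a leaky buffer: a bounded FIFO queue whose
/// `push` never blocks. When the queue is full, the NEW item is dropped.
pub struct LeakyQueue<T> {
    items: VecDeque<T>,
    capacity: usize,
    dropped: u64,
}

impl<T> LeakyQueue<T> {
    /// Creates a queue that holds at most `capacity` items.
    pub fn new(capacity: usize) -> Self {
        Self {
            items: VecDeque::with_capacity(capacity),
            capacity,
            dropped: 0,
        }
    }

    /// Enqueues `item` if there is room and returns `true`.
    /// If the queue is full, the item is discarded, the drop counter is
    /// incremented, and `false` is returned. The caller is never blocked.
    pub fn push(&mut self, item: T) -> bool {
        if self.items.len() >= self.capacity {
            self.dropped += 1;
            return false;
        }
        self.items.push_back(item);
        true
    }

    /// Removes and returns the oldest item, if any.
    pub fn pop(&mut self) -> Option<T> {
        self.items.pop_front()
    }

    /// Number of items discarded because the queue was full.
    pub fn dropped(&self) -> u64 {
        self.dropped
    }
}
```

With a capacity of 2, a third push returns false and leaves the two queued items untouched, so a slow peer costs at most a bounded amount of memory and never stalls the sender.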

Build and Test

cargo build --release
cargo test --workspace
cargo clippy --workspace

Run

cargo run --release --bin rarakoon-bin -- \
    --config cluster.toml \
    --node node_0 \
    --log-format pretty

Configuration

Rarakoon uses TOML configuration files. Example:

[cluster]
id = "my-cluster"
master_mode = "elected"     # "elected" or "readonly"
lease_period_secs = 10

[[nodes]]
name = "node_0"
addresses = ["127.0.0.1:4010"]
client_port = 4000
home_dir = "/var/lib/rarakoon/node_0"
compressor = "zstd"

[[nodes]]
name = "node_1"
addresses = ["127.0.0.1:4011"]
client_port = 4001
home_dir = "/var/lib/rarakoon/node_1"
compressor = "zstd"

[[nodes]]
name = "node_2"
addresses = ["127.0.0.1:4012"]
client_port = 4002
home_dir = "/var/lib/rarakoon/node_2"
compressor = "zstd"

Each node exposes two ports:

  • messaging port (configured in addresses): inter-node Paxos communication
  • client port: client operations (Get, Set, Delete, etc.)

Key Design Decisions

  • Pure consensus core: paxos-core has zero dependencies on async runtimes, serialization libraries, or I/O. It is a pure step(context, state, event) -> (state, effects) function, making it easy to test and reason about.
  • Leaky back-pressure: Inter-node message queues drop new messages when full instead of blocking the sender, preventing cascading slowdowns in the cluster.
  • Zero-copy wire format: Node-to-node messages use rkyv for zero-copy deserialization, while the client protocol preserves Arakoon's original binary format for compatibility.
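
The step(context, state, event) -> (state, effects) shape from the first bullet can be illustrated with a small sketch. Everything below is hypothetical: the State, Event, Effect, and Context types are invented for illustration, and the logic is a heavily simplified election round, not the Multi-Paxos implementation in paxos-core. The point is only that the function performs no I/O and returns the work to be done as data.

```rust
/// Read-only inputs to a transition (hypothetical fields).
pub struct Context {
    pub cluster_size: usize,
    /// Ballot to use if this node starts an election.
    pub proposed_ballot: u64,
}

#[derive(Debug, Clone, PartialEq)]
pub enum State {
    Follower,
    Candidate { ballot: u64, promises: usize },
    Master { ballot: u64 },
}

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    ElectionTimeout,
    Promise { ballot: u64 },
}

/// Side effects are described as values; the caller (a node driver)
/// is responsible for actually performing them.
#[derive(Debug, Clone, PartialEq)]
pub enum Effect {
    BroadcastPrepare { ballot: u64 },
    StartLease { ballot: u64 },
}

/// Pure transition function: same inputs always yield the same outputs.
pub fn step(ctx: &Context, state: State, event: Event) -> (State, Vec<Effect>) {
    let quorum = ctx.cluster_size / 2 + 1;
    match (state, event) {
        // A follower that times out starts an election and counts its own vote.
        (State::Follower, Event::ElectionTimeout) => {
            let ballot = ctx.proposed_ballot;
            (
                State::Candidate { ballot, promises: 1 },
                vec![Effect::BroadcastPrepare { ballot }],
            )
        }
        // A promise for the current ballot moves the candidate toward quorum.
        (State::Candidate { ballot, promises }, Event::Promise { ballot: b }) if b == ballot => {
            let promises = promises + 1;
            if promises >= quorum {
                (State::Master { ballot }, vec![Effect::StartLease { ballot }])
            } else {
                (State::Candidate { ballot, promises }, Vec::new())
            }
        }
        // Anything else (for example a stale promise) leaves the state unchanged.
        (state, _) => (state, Vec::new()),
    }
}
```

Because the function is pure, a test can drive it with plain values: in a three-node cluster, an ElectionTimeout followed by one matching Promise yields Master and a StartLease effect, with no sockets, timers, or async runtime involved.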

License

Apache-2.0