myfs-hub: in-memory MapDB fallback (no sqlite_path) silently discards all uploaded blocks and reports false block_exists #39

Open
opened 2026-05-12 08:06:50 +00:00 by rawan · 1 comment
Member

Summary

When the server config has no sqlite_path, myfs-hub falls back to db::DBType::MapDB. MapDB is a stub: it stores nothing and returns placeholder values. The server still answers 201 Created to POST /api/v1/block and POST /api/v1/file, so operators get a server that accepts uploads and silently throws the data away — with no error anywhere.

Evidence

myfs-hub/src/server/mod.rs:

let db: Arc<db::DBType> = if let Some(sqlite_path) = &config.sqlite_path {
    ...
    Arc::new(db::DBType::SqlDB(db::sqlite::SqlDB::new(...).await))
} else {
    log::info!("Using in-memory MapDB database");
    Arc::new(db::DBType::MapDB(db::map::MapDB::new(&config.users.clone())))
};

myfs-hub/src/server/db/map.rs:

async fn block_exists(&self, ...) -> bool { /* TODO: */ true }
async fn store_block(&self, ...) -> Result<bool> { /* TODO: Implement block storage logic */ Ok(true) }
async fn get_block(&self, _hash: &str) -> Result<Option<Vec<u8>>> { /* TODO: */ Ok(None) }
async fn get_file_by_hash(&self, _hash: &str) -> Result<Option<File>> { /* TODO: */ Ok(None) }
async fn get_file_blocks_ordered(&self, _file_hash: &str) -> Result<Vec<(String, u64)>> { /* TODO: */ Ok(Vec::new()) }
async fn list_blocks(&self, ...) -> Result<(Vec<String>, u64)> { /* TODO: */ Ok((Vec::new(), 0)) }

Why this is bad

  • store_blockOk(true): upload "succeeds", data gone.
  • block_existstrue: clients that check-before-upload (e.g. myfs upload / verify_blocks_handler) conclude the block is already on the server and skip it — so even a later switch to a real DB won't have the data.
  • get_block / get_file_by_hashNone: every download 404s, with no hint that the server was misconfigured.
  • config.rs does not require sqlite_path, so this is the default for any config that omits it.

Suggested fix

Either:

  1. Remove the MapDB fallback and make sqlite_path mandatory in parse_config (fail fast with a clear config error); or
  2. Make MapDB a real in-memory store (back the block/file maps with HashMaps, implement store_block/get_block/block_exists/etc.) so it behaves correctly for tests and ephemeral use.

Option (1) is safest for production; (2) is nicer for tests. Whatever the choice, store_block must not return Ok(true) while doing nothing.

## Summary When the server config has no `sqlite_path`, `myfs-hub` falls back to `db::DBType::MapDB`. `MapDB` is a stub: it stores **nothing** and returns placeholder values. The server still answers `201 Created` to `POST /api/v1/block` and `POST /api/v1/file`, so operators get a server that accepts uploads and silently throws the data away — with no error anywhere. ## Evidence `myfs-hub/src/server/mod.rs`: ```rust let db: Arc<db::DBType> = if let Some(sqlite_path) = &config.sqlite_path { ... Arc::new(db::DBType::SqlDB(db::sqlite::SqlDB::new(...).await)) } else { log::info!("Using in-memory MapDB database"); Arc::new(db::DBType::MapDB(db::map::MapDB::new(&config.users.clone()))) }; ``` `myfs-hub/src/server/db/map.rs`: ```rust async fn block_exists(&self, ...) -> bool { /* TODO: */ true } async fn store_block(&self, ...) -> Result<bool> { /* TODO: Implement block storage logic */ Ok(true) } async fn get_block(&self, _hash: &str) -> Result<Option<Vec<u8>>> { /* TODO: */ Ok(None) } async fn get_file_by_hash(&self, _hash: &str) -> Result<Option<File>> { /* TODO: */ Ok(None) } async fn get_file_blocks_ordered(&self, _file_hash: &str) -> Result<Vec<(String, u64)>> { /* TODO: */ Ok(Vec::new()) } async fn list_blocks(&self, ...) -> Result<(Vec<String>, u64)> { /* TODO: */ Ok((Vec::new(), 0)) } ``` ## Why this is bad - `store_block` → `Ok(true)`: upload "succeeds", data gone. - `block_exists` → `true`: clients that check-before-upload (e.g. `myfs upload` / `verify_blocks_handler`) conclude the block is already on the server and skip it — so even a later switch to a real DB won't have the data. - `get_block` / `get_file_by_hash` → `None`: every download 404s, with no hint that the server was misconfigured. - `config.rs` does not require `sqlite_path`, so this is the default for any config that omits it. ## Suggested fix Either: 1. **Remove the `MapDB` fallback** and make `sqlite_path` mandatory in `parse_config` (fail fast with a clear config error); or 2. **Make `MapDB` a real in-memory store** (back the block/file maps with `HashMap`s, implement `store_block`/`get_block`/`block_exists`/etc.) so it behaves correctly for tests and ephemeral use. Option (1) is safest for production; (2) is nicer for tests. Whatever the choice, `store_block` must not return `Ok(true)` while doing nothing.
Author
Member

Spec: Implement MapDB as a real in-memory store

Problem

MapDB in myfs-hub/src/server/db/map.rs is a stub: every method either returns a
hardcoded placeholder or does nothing. This means a server configured without
sqlite_path silently discards uploaded blocks and pretends they exist
(block_exists returns true).

Solution

Replace the stub implementations with real HashMap-backed storage. The MapDB
struct gains three maps and a download counter. Thread safety is provided by
std::sync::Mutex — MapDB is already behind Arc<DBType> at the call site and
all methods are &self, so interior mutability is required.

MapDB struct (changed)

pub struct MapDB {
    users: HashMap<String, User>,
    blocks: Mutex<HashMap<String, Vec<u8>>>,
    block_meta: Mutex<HashMap<String, (String, u64, i64)>>,  // block_hash -> (file_hash, block_index, user_id)
    files: Mutex<HashMap<String, File>>,
    downloads: Mutex<HashMap<String, (u64, u64)>>,           // hash -> (download_count, total_bytes)
}
  • Mutex<HashMap<...>> for each mutable collection — fine granularity avoids a
    single big lock.
  • users stays immutable after construction (no user registration at runtime).

Method implementations

Method Implementation
block_exists Check block_meta for the given block_hash. Return true if present AND matches file_hash, block_index, user_id.
store_block Insert into blocks (key = block_hash, value = data). Insert into block_meta. Return Ok(true) on first insert, Ok(false) if already present.
get_block Look up hash in blocks. Return Ok(Some(data)) or Ok(None).
get_file_by_hash Look up hash in files. Return Ok(Some(file)) or Ok(None).
get_file_blocks_ordered Iterate block_meta filtered by file_hash, collect (block_hash, block_index) sorted by index.
list_blocks Iterate all blocks keys, apply pagination (page / per_page).
get_user_blocks Iterate block_meta filtered by user_id, apply pagination.
increment_block_downloads Upsert downloads[hash]: increment count, add data size.
get_block_downloads Return downloads.get(hash).cloned().unwrap_or((0, 0)).

Pagination helper

A private helper:

fn paginate<T>(items: &[T], page: u32, per_page: u32) -> (Vec<T>, u64)

Clips page and per_page to sensible bounds (page >= 1, per_page 1..100),
slices the sorted list, and returns the total count.

Files changed

Only one file: myfs-hub/src/server/db/map.rs.

No changes to config.rs, mod.rs, server/mod.rs, or any other file — the
MapDB constructor signature stays the same, and DBType::MapDB dispatches
through the existing trait impl.

Risk

None beyond the scope of MapDB itself — SqlDB and the rest of the codebase are
untouched. No change to the config schema or public API.

## Spec: Implement MapDB as a real in-memory store ### Problem `MapDB` in `myfs-hub/src/server/db/map.rs` is a stub: every method either returns a hardcoded placeholder or does nothing. This means a server configured without `sqlite_path` silently discards uploaded blocks and pretends they exist (`block_exists` returns `true`). ### Solution Replace the stub implementations with real `HashMap`-backed storage. The MapDB struct gains three maps and a download counter. Thread safety is provided by `std::sync::Mutex` — MapDB is already behind `Arc<DBType>` at the call site and all methods are `&self`, so interior mutability is required. ### MapDB struct (changed) ```rust pub struct MapDB { users: HashMap<String, User>, blocks: Mutex<HashMap<String, Vec<u8>>>, block_meta: Mutex<HashMap<String, (String, u64, i64)>>, // block_hash -> (file_hash, block_index, user_id) files: Mutex<HashMap<String, File>>, downloads: Mutex<HashMap<String, (u64, u64)>>, // hash -> (download_count, total_bytes) } ``` - `Mutex<HashMap<...>>` for each mutable collection — fine granularity avoids a single big lock. - `users` stays immutable after construction (no user registration at runtime). ### Method implementations | Method | Implementation | |---|---| | `block_exists` | Check `block_meta` for the given `block_hash`. Return `true` if present AND matches `file_hash`, `block_index`, `user_id`. | | `store_block` | Insert into `blocks` (key = block_hash, value = data). Insert into `block_meta`. Return `Ok(true)` on first insert, `Ok(false)` if already present. | | `get_block` | Look up `hash` in `blocks`. Return `Ok(Some(data))` or `Ok(None)`. | | `get_file_by_hash` | Look up `hash` in `files`. Return `Ok(Some(file))` or `Ok(None)`. | | `get_file_blocks_ordered` | Iterate `block_meta` filtered by `file_hash`, collect `(block_hash, block_index)` sorted by index. | | `list_blocks` | Iterate all `blocks` keys, apply pagination (`page` / `per_page`). | | `get_user_blocks` | Iterate `block_meta` filtered by `user_id`, apply pagination. | | `increment_block_downloads` | Upsert `downloads[hash]`: increment count, add data size. | | `get_block_downloads` | Return `downloads.get(hash).cloned().unwrap_or((0, 0))`. | ### Pagination helper A private helper: ```rust fn paginate<T>(items: &[T], page: u32, per_page: u32) -> (Vec<T>, u64) ``` Clips `page` and `per_page` to sensible bounds (page >= 1, per_page 1..100), slices the sorted list, and returns the total count. ### Files changed Only one file: `myfs-hub/src/server/db/map.rs`. No changes to `config.rs`, `mod.rs`, `server/mod.rs`, or any other file — the MapDB constructor signature stays the same, and `DBType::MapDB` dispatches through the existing trait impl. ### Risk None beyond the scope of MapDB itself — SqlDB and the rest of the codebase are untouched. No change to the config schema or public API.
Sign in to join this conversation.
No labels
No milestone
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
geomind_code/my_fs#39
No description provided.