[PERF] Shard RoutingTable to reduce lock contention #17

Closed
opened 2026-02-11 19:23:47 +00:00 by thabeta · 1 comment
Owner

Description

The global RwLock on the RoutingTable causes thread contention during high-frequency route updates.

Proposed Fix

Implement sharding by subnet prefix or hash to allow multiple workers to update segments of the table concurrently.

### Description The global `RwLock` on the `RoutingTable` causes thread contention during high-frequency route updates. ### Proposed Fix Implement sharding by subnet prefix or hash to allow multiple workers to update segments of the table concurrently.
Owner

The full architecture of the routing table is as follows:

The table (prefix trie) itself is a left-right datastructure. There are 2 copies of the trie. Internally there is a pointer to the readable copy, and a pointer to the writeable copy. Every time a read is done, this goes through the readable copy. When a new subnet is inserted/old one deleted, the read pointer moves to the other copy. Once all existing reads are done, the modification/write progresses and modifies the first copy (which now no longer has any readers). After the write has finished, the readers move back ot this copy and once no old reader exists on the second copy, the write is applied to said copy. This way reads can always progress even if a write is currently in progress.

The value associated with every subnet is held in an Arc, so we cheaply hand out copies to it. Modification to the route list of a the subnet are done through a closure, which is applied in a CAS fashion on the Arc. Essentially, on every modification to the list, a copy of the current list is made, which is then modified, and swapped with the original (CAS style). If the CAS op fails, a new copy of the current route list is obtained and the closure is applied again, until it succeeds.

This means that essentially al writes to the routing table are deterministic and not racing.

Furthermore, the handler for updates can spawn on multiple tasks (which should get distributed over multiple threads). Assignment to a task is based on the (hash of) the relevant subnet, meaning there is a deterministic mapping between subnets and the task that handles them. This generally avoids race conditions in the CAS operation (CAS should always succeed first try so there is no overhead of retries), while also allowing multiple updates to be handled fully in parallel (especially if the subnet is known to the peer, as the only actual lock taken is when a new subnet is inserted/existing subnet is removed).

For other data structures which map subnets to data, we generally use dashmap, which is internally sharded to allow for parallel writes most of the time

The full architecture of the routing table is as follows: The table (prefix trie) itself is a left-right datastructure. There are 2 copies of the trie. Internally there is a pointer to the readable copy, and a pointer to the writeable copy. Every time a read is done, this goes through the readable copy. When a new subnet is inserted/old one deleted, the read pointer moves to the other copy. Once all existing reads are done, the modification/write progresses and modifies the first copy (which now no longer has any readers). After the write has finished, the readers move back ot this copy and once no old reader exists on the second copy, the write is applied to said copy. This way reads can always progress even if a write is currently in progress. The value associated with every subnet is held in an Arc, so we cheaply hand out copies to it. Modification to the route list of a the subnet are done through a closure, which is applied in a CAS fashion on the Arc. Essentially, on every modification to the list, a copy of the current list is made, which is then modified, and swapped with the original (CAS style). If the CAS op fails, a new copy of the current route list is obtained and the closure is applied again, until it succeeds. This means that essentially al writes to the routing table are deterministic and not racing. Furthermore, the handler for updates can spawn on multiple tasks (which should get distributed over multiple threads). Assignment to a task is based on the (hash of) the relevant subnet, meaning there is a deterministic mapping between subnets and the task that handles them. This generally avoids race conditions in the CAS operation (CAS should always succeed first try so there is no overhead of retries), while also allowing multiple updates to be handled fully in parallel (especially if the subnet is known to the peer, as the only actual lock taken is when a new subnet is inserted/existing subnet is removed). For other data structures which map subnets to data, we generally use dashmap, which is internally sharded to allow for parallel writes most of the time
lee closed this issue 2026-03-20 11:33:53 +00:00
thabeta added this to the ACTIVE project 2026-03-23 14:27:06 +00:00
Sign in to join this conversation.
No labels
Urgent
No milestone
No project
No assignees
2 participants
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference
geomind_code/mycelium_network#17
No description provided.