public inbox for linux-block@vger.kernel.org
* [LSF/MM/BPF TOPIC] A block level, active-active replication solution
From: Haris Iqbal @ 2026-02-03 15:09 UTC (permalink / raw)
  To: lsf-pc, linux-block; +Cc: Jia Li

Hi,

We are working on a pair of kernel modules which would offer a new
replication solution in the Linux kernel. It would be a block level,
active-active replication solution for RDMA transport.

The existing block level replication solution in the Linux kernel is
DRBD, which is an active-passive solution. The data replication in
DRBD happens through 2 network hops.

One can already build an active-active solution by exporting block
devices over the network, through either NVMeOF or RNBD/RTRS, and then
creating a raid1 device on top of them. This provides single hop
replication, but synchronization during a degraded state goes through
2 hops.

The proposed solution would provide an active-active single hop
replication, and a single hop synchronization (directly between
storage nodes) in case of a degraded state.

The first kernel module is Reliable Multicast on top of RTRS (RMR),
which uses the existing RTRS kernel module in the RDMA subsystem. RMR
works in a client-server architecture, with the server module residing
on the storage nodes. RMR uses the RTRS transport ULP to guarantee
delivery of IO to a group of hosts, and also provides data recovery if
one host in the group misses some IOs. The data recovery is handled by
the RMR server module, directly between the storage nodes.
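
As a rough illustration of the flow described above (not the actual RMR
code), the following user-space C sketch models a group write from the
client and a later recovery that runs directly between two storage
nodes. Every name here (struct node, group_write, peer_resync, the
block and group sizes) is hypothetical, and keeping the "stale" flags
inside the node structure is a simplification made only for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sizes, chosen only to keep the model small. */
#define NR_NODES  2
#define NR_BLOCKS 8
#define BLOCK_SZ  16

struct node {
	bool online;
	bool stale[NR_BLOCKS];          /* blocks this node has missed */
	char data[NR_BLOCKS][BLOCK_SZ];
};

/*
 * Client side: send one full-block write to every node in the group,
 * one hop per node. Offline nodes miss the write and the block is
 * remembered as stale for them. Returns the number of nodes written.
 */
static int group_write(struct node *group, int blk, const char *buf)
{
	int acked = 0;

	assert(blk >= 0 && blk < NR_BLOCKS);
	for (int i = 0; i < NR_NODES; i++) {
		if (group[i].online) {
			memcpy(group[i].data[blk], buf, BLOCK_SZ);
			group[i].stale[blk] = false;
			acked++;
		} else {
			group[i].stale[blk] = true;
		}
	}
	return acked;
}

/*
 * Server side: copy every block that dst missed straight from a peer
 * that holds a good copy. The client is not in the data path, so the
 * recovery costs a single hop. Returns the number of blocks copied.
 */
static int peer_resync(struct node *dst, const struct node *src)
{
	int copied = 0;

	for (int b = 0; b < NR_BLOCKS; b++) {
		if (!dst->stale[b] || src->stale[b])
			continue;
		memcpy(dst->data[b], src->data[b], BLOCK_SZ);
		dst->stale[b] = false;
		copied++;
	}
	return copied;
}
```

In this model, a write issued while one node is offline reaches only
the other node; once the node is back, peer_resync() brings it up to
date without the data passing through the client again.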

The second one is BRMR, which is a network block device over RMR. BRMR
provides mirroring functionality and supports replacement of disks.

The proposed solution tracks dirty IOs through a dirty map, and has
internal mechanisms to prevent data corruption in case of crashes,
similar to the activity log in DRBD.
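
To make the dirty map idea concrete, here is a minimal user-space C
sketch of a chunk-granular bitmap. It is only an assumption about how
such a map might look: the chunk size, the map size, and all function
names are hypothetical, and it does not cover the crash-safety
mechanisms mentioned above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical geometry: 1 MiB chunks, 1024 chunks (a 1 GiB device). */
#define CHUNK_SHIFT 20u
#define NR_CHUNKS   1024u

struct dirty_map {
	uint64_t bits[NR_CHUNKS / 64];  /* one bit per chunk */
};

static void dirty_map_init(struct dirty_map *m)
{
	memset(m->bits, 0, sizeof(m->bits));
}

/* Mark every chunk touched by the byte range [off, off + len) dirty. */
static void dirty_map_mark(struct dirty_map *m, uint64_t off, uint64_t len)
{
	assert(len > 0);
	uint64_t first = off >> CHUNK_SHIFT;
	uint64_t last = (off + len - 1) >> CHUNK_SHIFT;

	assert(last < NR_CHUNKS);
	for (uint64_t c = first; c <= last; c++)
		m->bits[c / 64] |= UINT64_C(1) << (c % 64);
}

/* Clear one chunk, e.g. after it has been resynchronized. */
static void dirty_map_clear_chunk(struct dirty_map *m, uint64_t chunk)
{
	assert(chunk < NR_CHUNKS);
	m->bits[chunk / 64] &= ~(UINT64_C(1) << (chunk % 64));
}

static bool dirty_map_test(const struct dirty_map *m, uint64_t chunk)
{
	assert(chunk < NR_CHUNKS);
	return (m->bits[chunk / 64] >> (chunk % 64)) & 1;
}

/* Number of chunks that still need synchronization. */
static unsigned int dirty_map_count(const struct dirty_map *m)
{
	unsigned int n = 0;

	for (uint64_t c = 0; c < NR_CHUNKS; c++)
		n += dirty_map_test(m, c);
	return n;
}
```

Tracking at chunk granularity keeps the map small at the cost of
resynchronizing a whole chunk even when only part of it was written;
the right trade-off depends on the workload.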

We would like to present the idea and internal workings of the
solution, and also discuss design and some benchmarking results
(comparison with RAID1 over RNBD/NVMeOF devices, or DRBD) during
LSF/MM/BPF. We also want to get feedback, and potentially get more
people involved in the project.

(BRMR/RMR are in-development modules, and we plan to push them to
GitHub or somewhere else before the LSF/MM/BPF summit)

Regards
- Haris

