* Moving from qemu-pr-helper and libmpathpersist to <linux/pr.h>
@ 2026-01-27 18:47 Stefan Hajnoczi
  2026-01-27 19:45 ` Paolo Bonzini
  2026-01-27 21:06 ` Benjamin Marzinski
  0 siblings, 2 replies; 25+ messages in thread
From: Stefan Hajnoczi @ 2026-01-27 18:47 UTC
  To: Benjamin Marzinski, Paolo Bonzini
  Cc: qemu-block, Kevin Wolf, Hannes Reinecke, afaria, qemu-devel

Hi Benjamin and Paolo,
I would like to discuss changes to DM-Multipath and qemu-pr-helper to
handle SCSI Persistent Reservations in QEMU without privileged code.

SCSI Persistent Reservations support in QEMU is built on the
qemu-pr-helper daemon, which performs PERSISTENT RESERVE IN and
PERSISTENT RESERVE OUT commands on behalf of the guest. The
qemu-pr-helper process provides privilege separation: it holds the
CAP_SYS_RAWIO capability needed for ioctl(SG_IO) and the root
privileges needed by libmpathpersist, so the main QEMU process does
not need them.
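
For readers unfamiliar with the SG_IO path, here is a rough sketch of how
a PERSISTENT RESERVE IN (READ KEYS) command can be sent with ioctl(SG_IO).
It is not taken from qemu-pr-helper; the function and macro names are made
up for illustration, and the sense data is not decoded. The opcode and CDB
layout follow SPC as I understand it.

```c
/* Illustrative sketch only, not qemu-pr-helper code. */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <scsi/sg.h>

#define PR_IN_OPCODE    0x5E  /* PERSISTENT RESERVE IN */
#define PR_IN_READ_KEYS 0x00  /* service action: READ KEYS */
#define PR_IN_CDB_LEN   10

/* Fill a 10-byte PERSISTENT RESERVE IN CDB (hypothetical helper). */
void build_pr_in_cdb(uint8_t *cdb, uint8_t service_action, uint16_t alloc_len)
{
    assert(cdb != NULL);
    memset(cdb, 0, PR_IN_CDB_LEN);
    cdb[0] = PR_IN_OPCODE;
    cdb[1] = service_action & 0x1f;        /* low 5 bits of byte 1 */
    cdb[7] = (uint8_t)(alloc_len >> 8);    /* allocation length, big-endian */
    cdb[8] = (uint8_t)(alloc_len & 0xff);
}

/*
 * Send READ KEYS through SG_IO. Returns the raw ioctl() result:
 * 0 if the request was submitted, -1 with errno set otherwise.
 * A real caller would also inspect hdr.status and the sense buffer.
 */
int pr_in_read_keys(int fd, uint8_t *buf, uint16_t len)
{
    uint8_t cdb[PR_IN_CDB_LEN];
    uint8_t sense[32];
    struct sg_io_hdr hdr;

    assert(buf != NULL);
    build_pr_in_cdb(cdb, PR_IN_READ_KEYS, len);

    memset(&hdr, 0, sizeof(hdr));
    hdr.interface_id = 'S';
    hdr.dxfer_direction = SG_DXFER_FROM_DEV;
    hdr.cmd_len = sizeof(cdb);
    hdr.mx_sb_len = sizeof(sense);
    hdr.dxfer_len = len;
    hdr.dxferp = buf;
    hdr.cmdp = cdb;
    hdr.sbp = sense;
    hdr.timeout = 10000;                   /* milliseconds */

    return ioctl(fd, SG_IO, &hdr);
}
```

The caller has to build raw SCSI CDBs, which is why this path is both
SCSI-specific and tied to the CAP_SYS_RAWIO requirement described above.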

There are issues with the current approach:
- Privileged code is a security attack surface.
- A bunch of code is required for privilege separation and for management
  tools to set up qemu-pr-helper with access to multipathd.
- The interface is SCSI-specific and does not support NVMe.

Several of us have pondered a different approach that I will summarize
here. The <linux/pr.h> ioctl interface provides an alternative to
ioctl(SG_IO) without the CAP_SYS_RAWIO requirement. It supports both
SCSI and NVMe. Since privileges are not required, there would be no need
for the qemu-pr-helper daemon anymore.

The blocker is that <linux/pr.h> is not usable in multipath
environments. The Linux DM-Multipath driver has an incomplete ioctl
implementation that falls short of what libmpathpersist and multipathd
do in userspace. Kernel changes are necessary to fix this.

My suggestion is to implement <linux/pr.h> via upcalls from DM-Multipath
to multipathd. That way applications like QEMU can consistently use
<linux/pr.h> across block device types and no longer have to go through
the privileged libmpathpersist interface.

Once DM-Multipath support for <linux/pr.h> is functional, the main QEMU
process can directly invoke the ioctls. qemu-pr-helper will no longer be
needed, eliminating privileged code and simplifying the setup required
by management tools such as libvirt and KubeVirt.

The only loss in functionality that I have identified when switching to
<linux/pr.h> is that qemu-pr-helper supports SCSI TransportIDs for the
PERSISTENT RESERVE OUT command. This is not supported by
<linux/pr.h>, but I'm not sure how this even works today since the guest
sees a virtual SCSI bus and is unaware of the physical bus or HBA. So
maybe that was never used in the first place?

Does this plan sound good to you?

Benjamin: I can work on the DM-Multipath upcalls if you are busy.

Thanks,
Stefan


Thread overview: 25+ messages (last activity around 2026-02-10 14:30 UTC)
2026-01-27 18:47 Moving from qemu-pr-helper and libmpathpersist to <linux/pr.h> Stefan Hajnoczi
2026-01-27 19:45 ` Paolo Bonzini
2026-01-28 14:18   ` Stefan Hajnoczi
2026-01-28 15:30     ` Hannes Reinecke
2026-01-28 16:13       ` Stefan Hajnoczi
2026-01-27 21:06 ` Benjamin Marzinski
2026-02-03 15:09   ` Stefan Hajnoczi
2026-02-03 17:53     ` Benjamin Marzinski
2026-02-03 18:04       ` Stefan Hajnoczi
2026-02-04 13:19         ` Martin Wilck
2026-02-04 18:32           ` Stefan Hajnoczi
2026-02-04 23:57             ` Hannes Reinecke
2026-02-05  1:03               ` Benjamin Marzinski
2026-02-05 10:20                 ` Martin Wilck
2026-02-05 11:52             ` Martin Wilck
2026-02-05 12:01               ` Daniel P. Berrangé
2026-02-05 13:39                 ` Stefan Hajnoczi
2026-02-06  0:03                   ` Hannes Reinecke
2026-02-06 14:08                     ` Stefan Hajnoczi
2026-02-09 12:50                       ` Hannes Reinecke
2026-02-09 14:23                         ` Stefan Hajnoczi
2026-02-10 10:23                           ` Martin Wilck
2026-02-10 13:59                             ` Stefan Hajnoczi
2026-02-10 14:29                               ` Martin Wilck
2026-02-05 14:28               ` Stefan Hajnoczi