From: Stefan Hajnoczi <stefanha@redhat.com>
To: mr-083 <matthieu@minio.io>
Cc: qemu-devel@nongnu.org, qemu-block@nongnu.org, its@irrelevant.dk,
kbusch@kernel.org, mr-083 <matthieu@min.io>
Subject: Re: [PATCH v2] hw/nvme: add namespace hotplug support
Date: Fri, 10 Apr 2026 08:41:21 -0400
Message-ID: <20260410124121.GA485553@fedora>
In-Reply-To: <20260409213454.5186-1-matthieu@min.io>
On Thu, Apr 09, 2026 at 11:34:51PM +0200, mr-083 wrote:
> Add hotplug support for nvme-ns devices on the NvmeBus. This enables
> namespace-level hot-swap without removing the NVMe controller, matching
> the behavior of physical NVMe drives hot-swapped in the same PCIe slot.
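
If this relies purely on NVMe Asynchronous Event Notifications (AENs),
then it is not equivalent to swapping physical drives in the same PCIe
slot: a physical swap removes and re-adds the whole PCIe device,
controller included. Maybe adjust the wording to reflect that this is
NVMe-level namespace hotplug?
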
> Mark nvme-ns devices as hotpluggable and register the NvmeBus as a
> hotplug handler with proper plug and unplug callbacks:
>
>   - plug: attach namespace to all started controllers and send an
>     Asynchronous Event Notification (AEN) with NS_ATTR_CHANGED so
>     the guest kernel rescans namespaces and adds the block device
>   - unplug: detach from all controllers, send AEN, remove from
>     subsystem, then unrealize the device. The guest kernel rescans
>     and removes the block device.
>
> The plug handler skips controllers that haven't started yet
> (qs_created == false) to avoid interfering with boot-time namespace
> attachment in nvme_start_ctrl().
>
> Both the controller bus and subsystem bus are configured as hotplug
> handlers via qbus_set_bus_hotplug_handler() since nvme-ns devices
> may reparent to the subsystem bus during realize.
>
> Example hot-swap sequence using the NVMe subsystem model:
>
>   # Boot with: -device nvme-subsys,id=subsys0
>   #            -device nvme,id=ctrl0,subsys=subsys0
>   #            -device nvme-ns,id=ns0,drive=drv0,bus=ctrl0,nsid=1
>
>   device_del ns0    # guest receives AEN, removes /dev/nvme0n1
>   drive_del drv0
>   drive_add 0 file=disk.qcow2,format=qcow2,id=drv0,if=none
>   device_add nvme-ns,id=ns0,drive=drv0,bus=ctrl0,nsid=1
>                     # guest receives AEN, adds /dev/nvme0n1
>
> Tested with Linux 6.1 guest (NVMe driver processes AEN and rescans
> namespace list automatically).

Did you test with a Windows Server guest? If not, I can try that next
week in case there are any surprises.

>
> Signed-off-by: Matthieu Receveur <matthieu@min.io>
> ---
> hw/nvme/ctrl.c | 85 ++++++++++++++++++++++++++++++++++++++++++++++++
> hw/nvme/ns.c | 1 +
> hw/nvme/subsys.c | 2 ++
> 3 files changed, 88 insertions(+)
>
> diff --git a/hw/nvme/ctrl.c b/hw/nvme/ctrl.c
> index be6c7028cb..5502e4ea2b 100644
> --- a/hw/nvme/ctrl.c
> +++ b/hw/nvme/ctrl.c
> @@ -206,6 +206,7 @@
> #include "system/hostmem.h"
> #include "hw/pci/msix.h"
> #include "hw/pci/pcie_sriov.h"
> +#include "hw/core/qdev.h"
> #include "system/spdm-socket.h"
> #include "migration/vmstate.h"
>
> @@ -9293,6 +9294,7 @@ static void nvme_realize(PCIDevice *pci_dev, Error **errp)
> }
>
> qbus_init(&n->bus, sizeof(NvmeBus), TYPE_NVME_BUS, dev, dev->id);
> + qbus_set_bus_hotplug_handler(BUS(&n->bus));
>
> if (nvme_init_subsys(n, errp)) {
> return;
> @@ -9553,10 +9555,93 @@ static const TypeInfo nvme_info = {
> },
> };
>
> +static void nvme_ns_hot_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                             Error **errp)
> +{
> +    NvmeNamespace *ns = NVME_NS(dev);
> +    NvmeSubsystem *subsys = ns->subsys;
> +    uint32_t nsid = ns->params.nsid;
> +    int i;
> +
> +    /*
> +     * Attach to all started controllers and notify via AEN.
> +     * Skip controllers that haven't started yet (boot-time realize) —
> +     * nvme_start_ctrl() will attach namespaces during controller init.
> +     */
> +    for (i = 0; i < NVME_MAX_CONTROLLERS; i++) {
> +        NvmeCtrl *ctrl = nvme_subsys_ctrl(subsys, i);
> +        if (!ctrl || !ctrl->qs_created) {
> +            continue;
> +        }
> +
> +        if (nvme_csi_supported(ctrl, ns->csi) && !ns->params.detached) {
> +            nvme_attach_ns(ctrl, ns);
> +            nvme_update_dsm_limits(ctrl, ns);
> +
> +            if (!test_and_set_bit(nsid, ctrl->changed_nsids)) {
> +                nvme_enqueue_event(ctrl, NVME_AER_TYPE_NOTICE,
> +                                   NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED,
> +                                   NVME_LOG_CHANGED_NSLIST);
> +            }
> +        }
> +    }
> +}
> +
> +static void nvme_ns_hot_unplug(HotplugHandler *hotplug_dev, DeviceState *dev,
> +                               Error **errp)
> +{
> +    NvmeNamespace *ns = NVME_NS(dev);
> +    NvmeSubsystem *subsys = ns->subsys;
> +    uint32_t nsid = ns->params.nsid;
> +    int i;

While qdev_unrealize() -> nvme_ns_unrealize() -> nvme_ns_drain() ->
blk_drain() quiesces I/O requests at the end of this function, I wonder
whether it's safe to start removing the namespace before I/O has been
drained.

Did you test hot unplug while the namespace is under heavy I/O (e.g. an
fio job running inside the guest with lots of queued I/O requests)?

It might be necessary to stop I/O first before tearing down the
namespace.
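
To make the ordering concern concrete, here is a toy model in plain C.
It is not QEMU code: ToyIoNamespace and the toy_* helpers are made up
for illustration, and the "drain" is just a counter running down to
zero. It only sketches the quiesce, drain, then detach order under those
assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a namespace; NOT QEMU's NvmeNamespace. */
typedef struct ToyIoNamespace {
    bool attached;      /* visible to the (toy) controller */
    bool quiesced;      /* true once new submissions are refused */
    unsigned inflight;  /* requests submitted but not yet completed */
} ToyIoNamespace;

/* Submit one request; refused once the namespace is quiesced or detached. */
static bool toy_submit(ToyIoNamespace *ns)
{
    if (!ns->attached || ns->quiesced) {
        return false;
    }
    ns->inflight++;
    return true;
}

/* Complete every outstanding request (hypothetical stand-in for a drain). */
static void toy_drain(ToyIoNamespace *ns)
{
    while (ns->inflight > 0) {
        ns->inflight--;
    }
}

/*
 * Hypothetical unplug ordering: quiesce, drain, then detach.
 * Returns the number of requests still in flight at detach time,
 * which is 0 by construction with this ordering.
 */
static unsigned toy_unplug_drain_first(ToyIoNamespace *ns)
{
    unsigned inflight_at_detach;

    ns->quiesced = true;               /* 1. stop accepting new I/O */
    toy_drain(ns);                     /* 2. let queued I/O finish */
    inflight_at_detach = ns->inflight;
    ns->attached = false;              /* 3. only now detach */
    return inflight_at_detach;
}
```

In this model, submitting two requests and then calling
toy_unplug_drain_first() detaches with nothing in flight, and later
submissions are refused. Whether the real handler needs the same order
depends on what nvme_detach_ns() does with requests that are already
queued.
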
> +
> +    /*
> +     * Detach from all controllers and notify the guest via AEN.
> +     * Must happen before unrealize to avoid use-after-free when the
> +     * guest sends I/O to a freed namespace.
> +     */
> +    for (i = 0; i < NVME_MAX_CONTROLLERS; i++) {
> +        NvmeCtrl *ctrl = nvme_subsys_ctrl(subsys, i);
> +        if (!ctrl || !nvme_ns(ctrl, nsid)) {
> +            continue;
> +        }
> +
> +        nvme_detach_ns(ctrl, ns);
> +        nvme_update_dsm_limits(ctrl, NULL);
> +
> +        if (!test_and_set_bit(nsid, ctrl->changed_nsids)) {
> +            nvme_enqueue_event(ctrl, NVME_AER_TYPE_NOTICE,
> +                               NVME_AER_INFO_NOTICE_NS_ATTR_CHANGED,
> +                               NVME_LOG_CHANGED_NSLIST);
> +        }
> +    }
> +
> +    /* Remove from subsystem namespace list. */
> +    subsys->namespaces[nsid] = NULL;

The dual of this operation is done in nvme_ns_realize():

    subsys->namespaces[nsid] = ns;

Maybe nvme_ns_unrealize() should remove the namespace from the
subsystem for consistency? I guess the lack of removal was never an
issue before hot unplug, but now it would be nice to implement the
full lifecycle.
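
Roughly what I mean by a symmetric lifecycle, as a standalone sketch.
ToySubsystem, ToyNamespace, TOY_MAX_NAMESPACES and the toy_ns_* functions
are hypothetical names for illustration only, not the QEMU structures or
callbacks. The point is just that the code that registers the namespace
is paired with the code that unregisters it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_NAMESPACES 8  /* arbitrary limit for this sketch */

typedef struct ToyNamespace ToyNamespace;

/* Toy subsystem: a table of namespaces indexed by NSID (slot 0 unused). */
typedef struct ToySubsystem {
    ToyNamespace *namespaces[TOY_MAX_NAMESPACES + 1];
} ToySubsystem;

struct ToyNamespace {
    ToySubsystem *subsys;
    uint32_t nsid;
};

/* Register ns under nsid. Returns 0 on success, -1 if nsid is invalid or taken. */
static int toy_ns_realize(ToyNamespace *ns, ToySubsystem *subsys, uint32_t nsid)
{
    if (nsid == 0 || nsid > TOY_MAX_NAMESPACES || subsys->namespaces[nsid]) {
        return -1;
    }
    ns->subsys = subsys;
    ns->nsid = nsid;
    subsys->namespaces[nsid] = ns;
    return 0;
}

/* Exact inverse of toy_ns_realize(): clear the slot only if we still own it. */
static void toy_ns_unrealize(ToyNamespace *ns)
{
    if (ns->subsys && ns->subsys->namespaces[ns->nsid] == ns) {
        ns->subsys->namespaces[ns->nsid] = NULL;
    }
    ns->subsys = NULL;
}
```

With the removal living in the unrealize path, every way of tearing the
device down frees the NSID slot, so a later device_add with the same
nsid would not find a stale pointer.
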
> +
> +    /*
> +     * Unrealize: drain I/O, flush, cleanup structures, remove from QOM.
> +     * nvme_ns_unrealize() handles drain/shutdown/cleanup internally.
> +     */
> +    qdev_unrealize(dev);
> +}
> +
> +static void nvme_bus_class_init(ObjectClass *klass, const void *data)
> +{
> +    HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(klass);
> +    hc->plug = nvme_ns_hot_plug;
> +    hc->unplug = nvme_ns_hot_unplug;
> +}
> +
>  static const TypeInfo nvme_bus_info = {
>      .name = TYPE_NVME_BUS,
>      .parent = TYPE_BUS,
>      .instance_size = sizeof(NvmeBus),
> +    .class_init = nvme_bus_class_init,
> +    .interfaces = (const InterfaceInfo[]) {
> +        { TYPE_HOTPLUG_HANDLER },
> +        { }
> +    },
>  };
>
> static void nvme_register_types(void)
> diff --git a/hw/nvme/ns.c b/hw/nvme/ns.c
> index b0106eaa5c..eb628c0734 100644
> --- a/hw/nvme/ns.c
> +++ b/hw/nvme/ns.c
> @@ -937,6 +937,7 @@ static void nvme_ns_class_init(ObjectClass *oc, const void *data)
> dc->bus_type = TYPE_NVME_BUS;
> dc->realize = nvme_ns_realize;
> dc->unrealize = nvme_ns_unrealize;
> + dc->hotpluggable = true;
> device_class_set_props(dc, nvme_ns_props);
> dc->desc = "Virtual NVMe namespace";
> }
> diff --git a/hw/nvme/subsys.c b/hw/nvme/subsys.c
> index 777e1c620f..fa35055d3c 100644
> --- a/hw/nvme/subsys.c
> +++ b/hw/nvme/subsys.c
> @@ -9,6 +9,7 @@
> #include "qemu/osdep.h"
> #include "qemu/units.h"
> #include "qapi/error.h"
> +#include "hw/core/qdev.h"
>
> #include "nvme.h"
>
> @@ -205,6 +206,7 @@ static void nvme_subsys_realize(DeviceState *dev, Error **errp)
> NvmeSubsystem *subsys = NVME_SUBSYS(dev);
>
> qbus_init(&subsys->bus, sizeof(NvmeBus), TYPE_NVME_BUS, dev, dev->id);
> + qbus_set_bus_hotplug_handler(BUS(&subsys->bus));
>
> nvme_subsys_setup(subsys, errp);
> }
> --
> 2.53.0
>