qemu-devel.nongnu.org archive mirror
From: Klaus Jensen <its@irrelevant.dk>
To: Hannes Reinecke <hare@suse.de>
Cc: Damien Hedde <dhedde@kalrayinc.com>,
	qemu-block@nongnu.org, Keith Busch <kbusch@kernel.org>,
	qemu-devel <qemu-devel@nongnu.org>,
	Titouan Huard <thuard@kalrayinc.com>
Subject: Re: NVME hotplug support ?
Date: Wed, 24 Jan 2024 08:39:09 +0100	[thread overview]
Message-ID: <ZbC-nSxMTQ6RveHG@cormorant.local> (raw)
In-Reply-To: <499096d7-1b4d-471b-9abf-5b6f72bb7990@suse.de>

On Jan 23 13:40, Hannes Reinecke wrote:
> On 1/23/24 11:59, Damien Hedde wrote:
> > Hi all,
> > 
> > We are currently looking into hotplugging nvme devices and it is currently not possible:
> > When nvme was introduced 2 years ago, the feature was disabled.
> > > commit cc6fb6bc506e6c47ed604fcb7b7413dff0b7d845
> > > Author: Klaus Jensen
> > > Date:   Tue Jul 6 10:48:40 2021 +0200
> > > 
> > >     hw/nvme: mark nvme-subsys non-hotpluggable
> > >     We currently lack the infrastructure to handle subsystem hotplugging, so
> > >     disable it.
> > 
> > Does anyone know what's lacking, or have any tips/ideas on what we should develop to add the support?
> > 
> Problem is that the object model is messed up. In qemu namespaces are
> attached to controllers, which in turn are children of the PCI device.
> There are subsystems, but these just reference the controller.
> 
> So if you hotunplug the PCI device you detach/destroy the controller and
> detach the namespaces from the controller.
> But if you hotplug the PCI device again the NVMe controller will be attached
> to the PCI device, but the namespaces will still be detached.
> 
> Klaus said he was going to fix that, and I dimly remember some patches
> floating around. But apparently it never went anywhere.
> 
> Fundamental problem is that the NVMe hierarchy as per spec is incompatible
> with the qemu object model; qemu requires a strict
> tree model where every object has exactly _one_ parent.
> 

A little history might help to nuance this just a bit. And to defend the
current model ;)

When we added support for multiple namespaces we did not consider
subsystem support, so the namespaces would just be associated directly
with a parent controller (in QDev terms, the parent has a bus that the
namespace devices are attached to).

When we added subsystems, where namespaces may be attached to several
controllers, it became necessary to break the controller/namespace
parent/child relationship. The problem was that removing the controller
would take all the bus children with it, causing namespaces to be
removed from other controllers in the subsystem. We fixed this by
reparenting the namespaces to the subsystem device instead.
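
To illustrate the reparenting argument, here is a toy sketch in Python.
It is not QEMU code: the `Device` class and its `unplug` method are
hypothetical stand-ins for a single-parent device tree in which
removing a device also removes everything parented to it.

```python
class Device:
    """Toy stand-in for a device with exactly one parent (hypothetical)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.alive = True
        if parent is not None:
            parent.children.append(self)

    def unplug(self):
        """Remove this device and, recursively, everything parented to it."""
        for child in list(self.children):
            child.unplug()
        self.alive = False
        if self.parent is not None:
            self.parent.children.remove(self)


# Old layout: the namespace hangs off controller A's bus.
subsys_old = Device("subsys0")
ctrl_a_old = Device("ctrl-a", parent=subsys_old)
ctrl_b_old = Device("ctrl-b", parent=subsys_old)
ns_old = Device("ns1", parent=ctrl_a_old)
ctrl_a_old.unplug()
old_layout_ns_survives = ns_old.alive

# Current layout: the namespace is parented to the subsystem instead.
subsys_new = Device("subsys0")
ctrl_a_new = Device("ctrl-a", parent=subsys_new)
ctrl_b_new = Device("ctrl-b", parent=subsys_new)
ns_new = Device("ns1", parent=subsys_new)
ctrl_a_new.unplug()
new_layout_ns_survives = ns_new.alive
```

In the first layout the namespace disappears together with controller
A, even though controller B is still present; in the second it stays
with the subsystem.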

I think this model fits the NVMe hierarchy as well as possible.
Controllers and namespaces are considered children of the subsystem (as
they are in NVMe).

Now, the claim that namespaces are not re-attached is only partly true.
If the namespaces are 'shared=on', they will be automatically attached
to any new controller attached to the subsystem. However, if they are
private, that is not the case. In NVMe, a private namespace just
means a namespace that can only be attached to a single controller at a
time. It is not entirely unlikely that you have a private namespace that
you then reassign to controller B when controller A is removed. Now,
what we could do is track the last controller identifier that a private
namespace was attached to, and if the same controller identifier is
added to the subsystem, we could reattach the private namespace.
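
A minimal sketch of that idea, as a toy model in Python rather than
actual QEMU code. The `Namespace` and `Subsystem` classes, the
`last_cntlid` field and all method names are assumptions made up for
illustration; only the shared/private behaviour described above is
taken from this mail.

```python
class Namespace:
    def __init__(self, nsid, shared):
        self.nsid = nsid
        self.shared = shared
        self.attached = set()     # controller ids currently attached
        self.last_cntlid = None   # hypothetical: last owner of a private ns


class Subsystem:
    def __init__(self):
        self.namespaces = []
        self.controllers = set()

    def add_namespace(self, ns):
        self.namespaces.append(ns)

    def attach(self, ns, cntlid):
        """Explicitly attach a namespace to a controller."""
        if not ns.shared and ns.attached:
            raise ValueError("private namespace is already attached")
        ns.attached.add(cntlid)

    def add_controller(self, cntlid):
        self.controllers.add(cntlid)
        for ns in self.namespaces:
            if ns.shared:
                # Shared namespaces attach to any new controller.
                ns.attached.add(cntlid)
            elif not ns.attached and ns.last_cntlid == cntlid:
                # Proposed: reattach a private namespace only if the same
                # controller identifier comes back and it is still free.
                ns.attached.add(cntlid)

    def remove_controller(self, cntlid):
        self.controllers.discard(cntlid)
        for ns in self.namespaces:
            if cntlid in ns.attached:
                ns.attached.discard(cntlid)
                if not ns.shared:
                    ns.last_cntlid = cntlid


subsys = Subsystem()
shared_ns = Namespace(1, shared=True)
private_ns = Namespace(2, shared=False)
subsys.add_namespace(shared_ns)
subsys.add_namespace(private_ns)

subsys.add_controller(0)
subsys.attach(private_ns, 0)
subsys.remove_controller(0)   # hot-unplug controller 0
subsys.add_controller(1)      # a different controller: no reattach
attached_after_other = set(private_ns.attached)
subsys.add_controller(0)      # same identifier returns: reattach
attached_after_same = set(private_ns.attached)
```

Because the reattach is skipped whenever the private namespace is
already attached elsewhere, reassigning it to controller B after
controller A is removed keeps working as before.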

However, broadly, I think the current model does a pretty good job in
supporting experimentation with hotplug, multipath and failover
configurations.
