From: Klaus Jensen <k.jensen@samsung.com>
To: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org, qemu-devel@nongnu.org,
"Markus Armbruster" <armbru@redhat.com>,
"Keith Busch" <kbusch@kernel.org>,
"Hanna Reitz" <hreitz@redhat.com>,
"Stefan Hajnoczi" <stefanha@redhat.com>,
"Klaus Jensen" <its@irrelevant.dk>,
"Philippe Mathieu-Daudé" <philmd@redhat.com>
Subject: Re: [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects
Date: Fri, 17 Sep 2021 10:02:38 +0200 [thread overview]
Message-ID: <YURLnv5wq5XMWzHg@apples.localdomain> (raw)
In-Reply-To: <YURGB/6SfQUMu992@redhat.com>
On Sep 17 09:38, Kevin Wolf wrote:
> Am 17.09.2021 um 08:21 hat Klaus Jensen geschrieben:
> > On Sep 16 18:30, Klaus Jensen wrote:
> > > On Sep 16 14:41, Kevin Wolf wrote:
> > > > Am 14.09.2021 um 22:37 hat Klaus Jensen geschrieben:
> > > > > From: Klaus Jensen <k.jensen@samsung.com>
> > > > >
> > > > > Hi,
> > > > >
> > > > > This is an attempt at adressing a bunch of issues that have presented
> > > > > themselves since we added subsystem support. It's been brewing for a
> > > > > while now.
> > > > >
> > > > > Fundamentally, I've come to the conclusion that modeling namespaces and
> > > > > subsystems as "devices" is wrong. They should have been user-creatable
> > > > > objects. We've run into multiple issues with wrt. hotplugging due to how
> > > > > namespaces hook up to the controller with a bus. The bus-based design
> > > > > made a lot of sense when we didn't have subsystem support and it follows
> > > > > the design of hw/scsi. But, the problem here is that the bus-based
> > > > > design dictates a one parent relationship, and with shared namespaces,
> > > > > that is just not true. If the namespaces are considered to have a single
> > > > > parent, that parent is the subsystem, not any specific controller.
> > > > >
> > > > > This series adds a set of experimental user-creatable objects:
> > > > >
> > > > > -object x-nvme-subsystem
> > > > > -object x-nvme-ns-nvm
> > > > > -object x-nvme-ns-zoned
> > > > >
> > > > > It also adds a new controller device (-device x-nvme-ctrl) that supports
> > > > > these new objects (and gets rid of a bunch of deprecated and confusing
> > > > > parameters). This new approach has a bunch of benefits (other than just
> > > > > fixing the hotplugging issues properly) - we also get support for some
> > > > > nice introspection through some new dynamic properties:
> > > > >
> > > > > (qemu) qom-get /machine/peripheral/nvme-ctrl-1 attached-namespaces
> > > > > [
> > > > > "/objects/nvm-1",
> > > > > "/objects/zns-1"
> > > > > ]
> > > > >
> > > > > (qemu) qom-list /objects/zns-1
> > > > > type (string)
> > > > > subsys (link<x-nvme-subsystem>)
> > > > > nsid (uint32)
> > > > > uuid (string)
> > > > > attached-ctrls (str)
> > > > > eui64 (string)
> > > > > blockdev (string)
> > > > > pi-first (bool)
> > > > > pi-type (NvmeProtInfoType)
> > > > > extended-lba (bool)
> > > > > metadata-size (uint16)
> > > > > lba-size (size)
> > > > > zone-descriptor-extension-size (size)
> > > > > zone-cross-read (bool)
> > > > > zone-max-open (uint32)
> > > > > zone-capacity (size)
> > > > > zone-size (size)
> > > > > zone-max-active (uint32)
> > > > >
> > > > > (qemu) qom-get /objects/zns-1 pi-type
> > > > > "none"
> > > > >
> > > > > (qemu) qom-get /objects/zns-1 eui64
> > > > > "52:54:00:17:67:a0:40:15"
> > > > >
> > > > > (qemu) qom-get /objects/zns-1 zone-capacity
> > > > > 12582912
> > > > >
> > > > > Currently, there are no shortcuts, so you have to define the full
> > > > > topology to get it up and running. Notice that the topology is explicit
> > > > > (the 'subsys' and 'attached-ctrls' links). There are no 'nvme-bus'
> > > > > anymore.
> > > > >
> > > > > -object x-nvme-subsystem,id=subsys0,subnqn=foo
> > > > > -device x-nvme-ctrl,id=nvme-ctrl-0,serial=foo,subsys=subsys0
> > > > > -device x-nvme-ctrl,id=nvme-ctrl-1,serial=bar,subsys=subsys0
> > > > > -drive id=nvm-1,file=nvm-1.img,format=raw,if=none,discard=unmap
> > > > > -object x-nvme-ns-nvm,id=nvm-1,blockdev=nvm-1,nsid=1,subsys=subsys0,attached-ctrls=nvme-ctrl-1
> > > > > -drive id=nvm-2,file=nvm-2.img,format=raw,if=none,discard=unmap
> > > > > -object x-nvme-ns-nvm,id=nvm-2,blockdev=nvm-2,nsid=2,subsys=subsys0,attached-ctrls=nvme-ctrl-0
> > > >
> > > > I may be wrong here, but my first gut feeling when seeing this was that
> > > > referencing the controller device in the namespace object feels
> > > > backwards. Usually, we have objects that are created independently and
> > > > then the devices reference them.
> > > >
> > > > Your need to use a machine_done notifier is probably related to that,
> > > > too, because it goes against the normal initialisation order, so you
> > > > have to wait. Error handling also isn't really possible in the notifier
> > > > any more, so this series seems to just print something to stderr, but
> > > > ignore the error otherwise.
> > > >
> > > > Did you consider passing a list of namespaces to the controller device
> > > > instead?
> > > >
> > > > I guess a problem that you have with both ways is that support for
> > > > list options isn't great in QemuOpts, which is still used both for
> > > > -object and -device in the system emulator...
> > >
> > > Heh. Exactly. The ability to better support lists with -object through
> > > QAPI is why I did it like this...
>
> I see. I really need to continue with the QAPIfication work, in the hope
> that devices will have the same level of support for lists in the
> future...
>
> QOM really has usable support for lists, it's just the outer layer that
> doesn't support them well. I wonder how hard (or how stupid) it would be
> to support JSON syntax without QAPIfying things first.
>
> > > Having the list of namespaces on the controller is preferable. I'll see
> > > what I can come up with.
> >
> > There is also the issue that the x-nvme-ns-nvm -object needs a blockdev
> > - and the ordering is also a problem here. That also requires the
> > machine done notifier.
>
> True, initialisation order can be a mess because QEMU tries to be clever
> and figure it out automatically. But I guess for this part you could
> just have an exception in object_create_early() to switch it to late
> initialisation when blockdevs are already there.
Oh! That's neat! Did not know about that, thanks!
>
> Even when returning false in object_create_early(), this is still before
> device creation, so the question of whether to move the list to the
> device remains relevant.
>
Yeah, but I think we can honestly get rid of the full blown flexibility
and just have all namespaces be attached to all controllers initially.
Should QEMU get better support for lists on qdevs, then we can add it.
I'm leaning towards that.
--
k
prev parent reply other threads:[~2021-09-17 8:08 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-14 20:37 [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 01/13] hw/nvme: move dif/pi prototypes into dif.h Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 02/13] hw/nvme: move zns helpers and types into zoned.h Klaus Jensen
2021-09-16 16:06 ` Keith Busch
2021-09-16 16:31 ` Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 03/13] hw/nvme: move zoned namespace members to separate struct Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 04/13] hw/nvme: move nvm " Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 05/13] hw/nvme: move BlockBackend to NvmeNamespaceNvm Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 06/13] nvme: add structured type for nguid Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 07/13] hw/nvme: hoist qdev state from namespace Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 08/13] hw/nvme: hoist qdev state from controller Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 09/13] hw/nvme: add experimental device x-nvme-ctrl Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 10/13] hw/nvme: add experimental object x-nvme-subsystem Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 11/13] hw/nvme: add experimental abstract object x-nvme-ns Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 12/13] hw/nvme: add experimental objects x-nvme-ns-{nvm, zoned} Klaus Jensen
2021-09-14 20:37 ` [PATCH RFC 13/13] hw/nvme: add attached-namespaces prop Klaus Jensen
2021-09-16 12:41 ` [PATCH RFC 00/13] hw/nvme: experimental user-creatable objects Kevin Wolf
2021-09-16 16:30 ` Klaus Jensen
2021-09-17 6:21 ` Klaus Jensen
2021-09-17 7:38 ` Kevin Wolf
2021-09-17 8:02 ` Klaus Jensen [this message]
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YURLnv5wq5XMWzHg@apples.localdomain \
--to=k.jensen@samsung.com \
--cc=armbru@redhat.com \
--cc=hreitz@redhat.com \
--cc=its@irrelevant.dk \
--cc=kbusch@kernel.org \
--cc=kwolf@redhat.com \
--cc=philmd@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).