From: Kevin Wolf <kwolf@redhat.com>
To: Jonathan Derrick <jonathan.derrick@linux.dev>
Cc: qemu-devel@nongnu.org,
	Michael Kropaczek <michael.kropaczek@solidigm.com>,
	qemu-block@nongnu.org, Keith Busch <kbusch@kernel.org>,
	Klaus Jensen <its@irrelevant.dk>, Hanna Reitz <hreitz@redhat.com>,
	pkrempa@redhat.com, armbru@redhat.com
Subject: Re: [PATCH v4 0/2] hw/nvme: Support for Namespaces Management from guest OS
Date: Mon, 9 Jan 2023 14:29:15 +0100
Message-ID: <Y7wWq/joPxKqHFfl@redhat.com>
In-Reply-To: <20221228194141.118-1-jonathan.derrick@linux.dev>

On 28.12.2022 at 20:41, Jonathan Derrick wrote:
> Here is the approach:
> The nvme device will get a new parameter:
>  - auto-ns-path, which specifies the path to the storage area where the
>    back-end images and necessary config files are stored.
> 
> The virtual devices representing namespaces will be created dynamically during
> the Qemu running session following issuance of nvme create-ns and delete-ns
> commands from the guest OS. QOM classes and instances will be created utilizing
> existing configuration scheme used during Qemu's start-up. Back-end image files
> will be neither created nor deleted during Qemu's startup or running session.
> Instead a set of back-end image files and relevant config will be created by
> qemu-img tool with createns sub-command prior to Qemu's session.
> Required parameters are: -S serial number which must match serial parameter of
> qemu-system-xx -device nvme command line specification, -C total capacity, and
> optional -N that will set a maximal limit on number of allowed
> namespaces (default 256) which will be followed by path name pointing to
> storage location corresponding to auto-ns-path of qemu-system-xx -device nvme
> parameter.
> 
> Those created back-end image files will be pre-loaded during Qemu's start-up
> and then during running Qemu's session will be associated or disassociated with
> QOM namespaces virtual instances, as a result of issuing nvme create-ns or
> delete-ns commands. The associated back-end image file for relevant namespace
> will be re-sized as follows: delete-ns command will truncate image file to the
> size of 0, whereas create-ns command will re-size the image file to the size
> provided by nvme create-ns command parameters. Truncating/re-sizing is a result
> of blk_truncate() API which utilizes co-routines and should not block Qemu main
> thread while scheduling AIO operations. It is assured that all settings will
> be retained across Qemu's start-ups and shutdowns. The implementation makes it
> possible to combine the existing "Additional Namespace" implementation with the
> new "Managed Namespaces". Those will coexist with obvious restrictions: both
> share the same NsId space, "static" namespaces cannot be deleted, and if an
> NsId specified on Qemu's command line conflicts with one previously created by
> nvme create-ns (and retained), Qemu will abort at start-up.
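
For readers following the thread, the resize semantics described in the quoted text can be modelled roughly as below. This is only a sketch under assumptions, not code from the patch: NsPool and the ns_pool_* functions are hypothetical names, the sizes live in memory instead of going through blk_truncate(), and both the lowest-free-slot NSID policy and the check against the total capacity are guesses about how -C and -N would be enforced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NS_POOL_MAX 256  /* default namespace limit mentioned in the quoted text */

/* Hypothetical model: one entry per pre-created back-end image file. */
typedef struct NsPool {
    uint64_t total_capacity;           /* assumed to come from -C */
    uint32_t max_ns;                   /* assumed to come from -N */
    uint64_t image_size[NS_POOL_MAX];  /* 0 means "not associated" */
} NsPool;

static void ns_pool_init(NsPool *p, uint64_t total_capacity, uint32_t max_ns)
{
    assert(max_ns >= 1 && max_ns <= NS_POOL_MAX);
    p->total_capacity = total_capacity;
    p->max_ns = max_ns;
    for (size_t i = 0; i < NS_POOL_MAX; i++) {
        p->image_size[i] = 0;
    }
}

/* Sum of the sizes of all currently associated images. */
static uint64_t ns_pool_used(const NsPool *p)
{
    uint64_t used = 0;
    for (uint32_t i = 0; i < p->max_ns; i++) {
        used += p->image_size[i];
    }
    return used;
}

/*
 * create-ns: grow the first empty image to the requested size.
 * Returns the 1-based NSID, or 0 if no slot or capacity is left.
 */
static uint32_t ns_pool_create(NsPool *p, uint64_t size)
{
    if (size == 0 || size > p->total_capacity - ns_pool_used(p)) {
        return 0;
    }
    for (uint32_t i = 0; i < p->max_ns; i++) {
        if (p->image_size[i] == 0) {
            p->image_size[i] = size;  /* real code would resize the image */
            return i + 1;
        }
    }
    return 0;
}

/*
 * delete-ns: truncate the image back to size 0.
 * Returns 0 on success, -1 if the NSID is invalid or not in use.
 */
static int ns_pool_delete(NsPool *p, uint32_t nsid)
{
    if (nsid == 0 || nsid > p->max_ns || p->image_size[nsid - 1] == 0) {
        return -1;
    }
    p->image_size[nsid - 1] = 0;
    return 0;
}
```

In this model the image files themselves are never created or removed at run time; only their sizes change, which matches the description above.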

This looks like a valid approach for a proof of concept, but from a
backend perspective, I'm concerned that this approach might be too
limiting and we won't have a good path forward.

For example, how can we integrate this with snapshots? You expect a
specific filename for the image, but taking an external snapshot means
creating an overlay image with a different name.

How do we migrate storage like this? If the management tool (probably
libvirt) knows about all the namespace images and the config file (!),
it can possibly migrate them individually, but note that while a mirror
job is active, images can't be resized any more.

What if we don't want to use a directory on the local filesystem to
store the images, but some network protocol?

It seems to me that we should define proper block layer APIs for
handling namespaces, and then we can have your implementation as one
possible image driver that supports these APIs, for which we can accept
these limitations for now. At least this would already avoid having
backend logic in the device implementation, and allow us to replace it
with something better later without having to change the design of the
device emulation code.
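
As a rough illustration of that separation, the device could talk to an ops table and leave the storage details to whichever driver sits behind it. Everything in this sketch is hypothetical: NsBackend, NsBackendOps, the toy_* driver and the dev_* helpers are made-up names, not existing QEMU APIs, and the toy driver keeps sizes in memory where a real one would manage image files or a namespace-aware image format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interface; none of these names exist in QEMU. */
typedef struct NsBackend NsBackend;

typedef struct NsBackendOps {
    int (*create)(NsBackend *be, uint32_t nsid, uint64_t size);
    int (*destroy)(NsBackend *be, uint32_t nsid);
    int (*get_size)(NsBackend *be, uint32_t nsid, uint64_t *size);
} NsBackendOps;

struct NsBackend {
    const NsBackendOps *ops;
    void *opaque;  /* driver-private state */
};

/* Toy driver standing in for a "directory of image files" backend. */
#define TOY_MAX_NS 8

typedef struct ToyDirState {
    uint64_t size[TOY_MAX_NS];  /* 0 means the namespace does not exist */
} ToyDirState;

static int toy_create(NsBackend *be, uint32_t nsid, uint64_t size)
{
    ToyDirState *s = be->opaque;
    if (nsid == 0 || nsid > TOY_MAX_NS || size == 0 || s->size[nsid - 1] != 0) {
        return -1;
    }
    s->size[nsid - 1] = size;
    return 0;
}

static int toy_destroy(NsBackend *be, uint32_t nsid)
{
    ToyDirState *s = be->opaque;
    if (nsid == 0 || nsid > TOY_MAX_NS || s->size[nsid - 1] == 0) {
        return -1;
    }
    s->size[nsid - 1] = 0;
    return 0;
}

static int toy_get_size(NsBackend *be, uint32_t nsid, uint64_t *size)
{
    ToyDirState *s = be->opaque;
    if (nsid == 0 || nsid > TOY_MAX_NS || s->size[nsid - 1] == 0) {
        return -1;
    }
    *size = s->size[nsid - 1];
    return 0;
}

static const NsBackendOps toy_dir_ops = {
    .create = toy_create,
    .destroy = toy_destroy,
    .get_size = toy_get_size,
};

/* Device side: knows only the ops table, never file names or directories. */
static int dev_create_ns(NsBackend *be, uint32_t nsid, uint64_t size)
{
    assert(be && be->ops && be->ops->create);
    return be->ops->create(be, nsid, size);
}

static int dev_delete_ns(NsBackend *be, uint32_t nsid)
{
    assert(be && be->ops && be->ops->destroy);
    return be->ops->destroy(be, nsid);
}
```

With a split like this, swapping the toy driver for one that stores all namespaces in a single image would not require touching the dev_* helpers.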

Eventually, I think, if we want to have dynamic namespaces properly
supported, they need to be a feature on the image format level, so that
you could keep all namespaces in a single qcow2 file.

Kevin


