From: "Richard W.M. Jones" <rjones@redhat.com>
To: Eric Blake <eblake@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>,
vsementsov@virtuozzo.com, qemu-devel@nongnu.org,
"open list:Network Block Dev..." <qemu-block@nongnu.org>,
Max Reitz <mreitz@redhat.com>
Subject: Re: [Qemu-devel] [PATCH] nbd: Advertise multi-conn for shared read-only connections
Date: Thu, 15 Aug 2019 21:00:07 +0100 [thread overview]
Message-ID: <20190815200007.GJ3888@redhat.com> (raw)
In-Reply-To: <20190815185024.7010-1-eblake@redhat.com>
On Thu, Aug 15, 2019 at 01:50:24PM -0500, Eric Blake wrote:
> The NBD specification defines NBD_FLAG_CAN_MULTI_CONN, which can be
> advertised when the server promises cache consistency between
> simultaneous clients (basically, rules that determine what FUA and
> flush from one client are able to guarantee for reads from another
> client). When we don't permit simultaneous clients (such as qemu-nbd
> without -e), the bit makes no sense; and for writable images, we
> probably have a lot more work before we can declare that actions from
> one client are cache-consistent with actions from another. But for
> read-only images, where flush isn't changing any data, we might as
> well advertise multi-conn support. What's more, advertisement of the
> bit makes it easier for clients to determine if 'qemu-nbd -e' was in
> use, where a second connection will succeed rather than hang until the
> first client goes away.
>
> This patch affects qemu as server in advertising the bit. We may want
> to consider patches to qemu as client to attempt parallel connections
> for higher throughput by spreading the load over those connections
> when a server advertises multi-conn, but for now sticking to one
> connection per nbd:// BDS is okay.
>
> See also: https://bugzilla.redhat.com/1708300
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> docs/interop/nbd.txt | 1 +
> include/block/nbd.h | 2 +-
> blockdev-nbd.c | 2 +-
> nbd/server.c | 4 +++-
> qemu-nbd.c | 2 +-
> 5 files changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/docs/interop/nbd.txt b/docs/interop/nbd.txt
> index fc64473e02b2..6dfec7f47647 100644
> --- a/docs/interop/nbd.txt
> +++ b/docs/interop/nbd.txt
> @@ -53,3 +53,4 @@ the operation of that feature.
> * 2.12: NBD_CMD_BLOCK_STATUS for "base:allocation"
> * 3.0: NBD_OPT_STARTTLS with TLS Pre-Shared Keys (PSK),
> NBD_CMD_BLOCK_STATUS for "qemu:dirty-bitmap:", NBD_CMD_CACHE
> +* 4.2: NBD_FLAG_CAN_MULTI_CONN for sharable read-only exports
> diff --git a/include/block/nbd.h b/include/block/nbd.h
> index 7b36d672f046..991fd52a5134 100644
> --- a/include/block/nbd.h
> +++ b/include/block/nbd.h
> @@ -326,7 +326,7 @@ typedef struct NBDClient NBDClient;
>
> NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
> uint64_t size, const char *name, const char *desc,
> - const char *bitmap, uint16_t nbdflags,
> + const char *bitmap, uint16_t nbdflags, bool shared,
> void (*close)(NBDExport *), bool writethrough,
> BlockBackend *on_eject_blk, Error **errp);
> void nbd_export_close(NBDExport *exp);
> diff --git a/blockdev-nbd.c b/blockdev-nbd.c
> index 66eebab31875..e5d228771292 100644
> --- a/blockdev-nbd.c
> +++ b/blockdev-nbd.c
> @@ -189,7 +189,7 @@ void qmp_nbd_server_add(const char *device, bool has_name, const char *name,
> }
>
> exp = nbd_export_new(bs, 0, len, name, NULL, bitmap,
> - writable ? 0 : NBD_FLAG_READ_ONLY,
> + writable ? 0 : NBD_FLAG_READ_ONLY, true,
> NULL, false, on_eject_blk, errp);
> if (!exp) {
> return;
> diff --git a/nbd/server.c b/nbd/server.c
> index a2cf085f7635..a602d85070ff 100644
> --- a/nbd/server.c
> +++ b/nbd/server.c
> @@ -1460,7 +1460,7 @@ static void nbd_eject_notifier(Notifier *n, void *data)
>
> NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
> uint64_t size, const char *name, const char *desc,
> - const char *bitmap, uint16_t nbdflags,
> + const char *bitmap, uint16_t nbdflags, bool shared,
> void (*close)(NBDExport *), bool writethrough,
> BlockBackend *on_eject_blk, Error **errp)
> {
> @@ -1486,6 +1486,8 @@ NBDExport *nbd_export_new(BlockDriverState *bs, uint64_t dev_offset,
> perm = BLK_PERM_CONSISTENT_READ;
> if ((nbdflags & NBD_FLAG_READ_ONLY) == 0) {
> perm |= BLK_PERM_WRITE;
> + } else if (shared) {
> + nbdflags |= NBD_FLAG_CAN_MULTI_CONN;
> }
> blk = blk_new(bdrv_get_aio_context(bs), perm,
> BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 049645491dab..55f5ceaf5c92 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -1173,7 +1173,7 @@ int main(int argc, char **argv)
> }
>
> export = nbd_export_new(bs, dev_offset, fd_size, export_name,
> - export_description, bitmap, nbdflags,
> + export_description, bitmap, nbdflags, shared > 1,
> nbd_export_closed, writethrough, NULL,
> &error_fatal);
>
Multi-conn is a no-brainer. For nbdkit it more than doubled
throughput:
https://github.com/libguestfs/nbdkit/commit/910a220aa454b410c44731e8d965e92244b536f5
Those results are for loopback mounts of a file located on /dev/shm
and served by nbdkit file plugin, and I would imagine that without the
loop mounting / filesystem overhead the results could be even better.
For read-only connections where the server can handle more than one
connection (-e) it ought to be safe. You have to tell the client how
many connections the server may accept, but that's a limitation of the
current protocol.
So yes ACK, patch makes sense.
Worth noting that fio has NBD support so you can test NBD servers
directly these days:
https://github.com/axboe/fio/commit/d643a1e29d31bf974a613866819dde241c928b6d
https://github.com/axboe/fio/blob/master/examples/nbd.fio#L5
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-df lists disk usage of guests without needing to install any
software inside the virtual machine. Supports Linux and Windows.
http://people.redhat.com/~rjones/virt-df/
next prev parent reply other threads:[~2019-08-15 20:01 UTC|newest]
Thread overview: 13+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-08-15 18:50 [Qemu-devel] [PATCH] nbd: Advertise multi-conn for shared read-only connections Eric Blake
2019-08-15 20:00 ` Richard W.M. Jones [this message]
2019-08-15 21:45 ` [Qemu-devel] [Qemu-block] " John Snow
2019-08-15 21:54 ` Eric Blake
2019-08-15 22:02 ` John Snow
2019-08-15 22:36 ` [Qemu-devel] " no-reply
2019-08-16 10:23 ` Vladimir Sementsov-Ogievskiy
2019-08-16 10:47 ` Vladimir Sementsov-Ogievskiy
2019-08-17 14:30 ` Eric Blake
2019-08-18 1:31 ` [Qemu-devel] [Qemu-block] " Nir Soffer
2019-08-19 18:04 ` Eric Blake
2019-08-20 21:19 ` Nir Soffer
2019-08-20 9:07 ` [Qemu-devel] " Vladimir Sementsov-Ogievskiy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20190815200007.GJ3888@redhat.com \
--to=rjones@redhat.com \
--cc=eblake@redhat.com \
--cc=kwolf@redhat.com \
--cc=mreitz@redhat.com \
--cc=qemu-block@nongnu.org \
--cc=qemu-devel@nongnu.org \
--cc=vsementsov@virtuozzo.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.