qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: qemu-devel@nongnu.org
Cc: qemu-block@nongnu.org, alex@alex.org.uk,
	Paolo Bonzini <pbonzini@redhat.com>,
	Kevin Wolf <kwolf@redhat.com>, Max Reitz <mreitz@redhat.com>
Subject: [Qemu-devel] [PATCH v3 43/44] nbd: Implement NBD_OPT_BLOCK_SIZE on server
Date: Fri, 22 Apr 2016 17:40:51 -0600	[thread overview]
Message-ID: <1461368452-10389-44-git-send-email-eblake@redhat.com> (raw)
In-Reply-To: <1461368452-10389-1-git-send-email-eblake@redhat.com>

The upstream NBD Protocol has defined a new extension to allow
the server to advertise block sizes to the client, as well as
a way for the client to inform the server that it intends to
obey block sizes.

Thanks to a recent fix, our minimum transfer size is always
1 (the block layer takes care of read-modify-write on our
behalf), although if we wanted down the road, we could
advertise a minimum of 512 based on our usage patterns to a
client that is willing to honor block sizes.  Meanwhile,
advertising our absolute maximum transfer size of 32M will
help newer clients avoid EINVAL failures.

We do not reject clients for using the older NBD_OPT_EXPORT_NAME;
we are no worse off for those clients than we used to be. But
for clients new enough to use NBD_OPT_GO, we require the client
to first send us NBD_OPT_BLOCK_SIZE to prove they know about
sizing restraints, by failing with NBD_REP_ERR_BLOCK_SIZE_REQD.
All existing released qemu clients (whether old-style or new, at
least by the end of this series) already honor our limits, and
will still connect; so at most, this would reject a new non-qemu
client that doesn't intend to obey limits (and that client could
still use NBD_OPT_EXPORT_NAME to bypass our rejection).

Signed-off-by: Eric Blake <eblake@redhat.com>
---
 include/block/nbd.h |  2 ++
 nbd/nbd-internal.h  |  1 +
 nbd/server.c        | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+)

diff --git a/include/block/nbd.h b/include/block/nbd.h
index 1072d9e..a5c68df 100644
--- a/include/block/nbd.h
+++ b/include/block/nbd.h
@@ -96,11 +96,13 @@ typedef struct nbd_reply nbd_reply;
 #define NBD_REP_ERR_TLS_REQD    NBD_REP_ERR(5)  /* TLS required */
 #define NBD_REP_ERR_UNKNOWN     NBD_REP_ERR(6)  /* Export unknown */
 #define NBD_REP_ERR_SHUTDOWN    NBD_REP_ERR(7)  /* Server shutting down */
+#define NBD_REP_ERR_BLOCK_SIZE_REQD NBD_REP_ERR(8) /* Missing OPT_BLOCK_SIZE */

 /* Info types, used during NBD_REP_INFO */
 #define NBD_INFO_EXPORT         0
 #define NBD_INFO_NAME           1
 #define NBD_INFO_DESCRIPTION    2
+#define NBD_INFO_BLOCK_SIZE     3

 /* Request flags, sent from client to server during transmission phase */
 #define NBD_CMD_FLAG_FUA        (1 << 0) /* 'force unit access' during write */
diff --git a/nbd/nbd-internal.h b/nbd/nbd-internal.h
index 1935102..1354182 100644
--- a/nbd/nbd-internal.h
+++ b/nbd/nbd-internal.h
@@ -85,6 +85,7 @@
 #define NBD_OPT_STARTTLS        (5)
 #define NBD_OPT_INFO            (6)
 #define NBD_OPT_GO              (7)
+#define NBD_OPT_BLOCK_SIZE      (9)

 /* NBD errors are based on errno numbers, so there is a 1:1 mapping,
  * but only a limited set of errno values is specified in the protocol.
diff --git a/nbd/server.c b/nbd/server.c
index 563afb2..86d1e2d 100644
--- a/nbd/server.c
+++ b/nbd/server.c
@@ -83,6 +83,7 @@ struct NBDClient {
     void (*close)(NBDClient *client);

     bool no_zeroes;
+    bool block_size;
     NBDExport *exp;
     QCryptoTLSCreds *tlscreds;
     char *tlsaclname;
@@ -346,6 +347,7 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length,
     uint16_t type;
     uint64_t size;
     uint16_t flags;
+    uint32_t block;

     /* Client sends:
         [20 ..  xx]   export name (length bytes)
@@ -391,6 +393,57 @@ static int nbd_negotiate_handle_info(NBDClient *client, uint32_t length,
     }

     rc = nbd_negotiate_send_rep_len(client->ioc, NBD_REP_INFO, opt,
+                                    sizeof(type) + 3 * sizeof(block));
+    if (rc < 0) {
+        return rc;
+    }
+
+    type = cpu_to_be16(NBD_INFO_BLOCK_SIZE);
+    if (nbd_negotiate_write(client->ioc, &type, sizeof(type)) !=
+        sizeof(type)) {
+        LOG("write failed");
+        return -EIO;
+    }
+    /* minimum - Always 1, because we use blk_pread().
+     * TODO: Advertise 512 if guest used NBD_OPT_BLOCK_SIZE? */
+    block = cpu_to_be32(1);
+    if (nbd_negotiate_write(client->ioc, &block, sizeof(block)) !=
+        sizeof(block)) {
+        LOG("write failed");
+        return -EIO;
+    }
+    /* preferred - At least 4096, but larger as appropriate. */
+    block = blk_get_opt_transfer_length(exp->blk) * BDRV_SECTOR_SIZE;
+    block = cpu_to_be32(MAX(4096, block));
+    if (nbd_negotiate_write(client->ioc, &block, sizeof(block)) !=
+        sizeof(block)) {
+        LOG("write failed");
+        return -EIO;
+    }
+    /* maximum - At most 32M, but smaller as appropriate. */
+    block = blk_get_max_transfer_length(exp->blk);
+    if (block && block < NBD_MAX_BUFFER_SIZE / BDRV_SECTOR_SIZE) {
+        block *= BDRV_SECTOR_SIZE;
+    } else {
+        block = NBD_MAX_BUFFER_SIZE;
+    }
+    block = cpu_to_be32(block);
+    if (nbd_negotiate_write(client->ioc, &block, sizeof(block)) !=
+        sizeof(block)) {
+        LOG("write failed");
+        return -EIO;
+    }
+
+    if (!client->block_size) {
+        /* The client is new enough to use NBD_OPT_GO, but forgot to
+         * tell us that it plans to obey block sizes. Since we fail
+         * hard on oversize requests, it's better to reject such a
+         * client up front.  */
+        return nbd_negotiate_send_rep(client->ioc, NBD_REP_ERR_BLOCK_SIZE_REQD,
+                                      opt);
+    }
+
+    rc = nbd_negotiate_send_rep_len(client->ioc, NBD_REP_INFO, opt,
                                     sizeof(type) + sizeof(size) +
                                     sizeof(flags));
     if (rc < 0) {
@@ -630,6 +683,15 @@ static int nbd_negotiate_options(NBDClient *client, uint16_t myflags)
                 }
                 break;

+            case NBD_OPT_BLOCK_SIZE:
+                client->block_size = true;
+                ret = nbd_negotiate_send_rep(client->ioc, NBD_REP_ACK,
+                                             clientflags);
+                if (ret < 0) {
+                    return ret;
+                }
+                break;
+
             case NBD_OPT_STARTTLS:
                 if (nbd_negotiate_drop_sync(client->ioc, length) != length) {
                     return -EIO;
-- 
2.5.5

  parent reply	other threads:[~2016-04-22 23:41 UTC|newest]

Thread overview: 67+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-22 23:40 [Qemu-devel] [PATCH v3 00/44] NBD protocol additions Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 01/44] nbd: More debug typo fixes, use correct formats Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 02/44] nbd: Quit server after any write error Eric Blake
2016-04-25  9:21   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 03/44] nbd: Improve server handling of bogus commands Eric Blake
2016-04-25  9:29   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 04/44] nbd: Reject unknown request flags Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 05/44] nbd: Group all Linux-specific ioctl code in one place Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 06/44] nbd: Clean up ioctl handling of qemu-nbd -c Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 07/44] nbd: Limit nbdflags to 16 bits Eric Blake
2016-04-25  9:24   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 08/44] nbd: Add qemu-nbd -D for human-readable description Eric Blake
2016-04-25  9:25   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 09/44] block: Allow BDRV_REQ_FUA through blk_pwrite() Eric Blake
2016-04-23  8:12   ` Denis V. Lunev
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 10/44] fdc: Switch to byte-based block access Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 11/44] nand: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 12/44] onenand: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 13/44] pflash: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 14/44] sd: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 15/44] m25p80: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 16/44] atapi: " Eric Blake
2016-04-25 21:36   ` John Snow
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 17/44] nbd: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 18/44] qemu-img: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 19/44] qemu-io: " Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 20/44] block: Switch blk_read_unthrottled() to byte interface Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 21/44] block: Switch blk_write_zeroes() " Eric Blake
2016-04-23  8:12   ` Denis V. Lunev
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 22/44] block: Kill blk_write(), blk_read() Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 23/44] qemu-io: Add missing option documentation Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 24/44] qemu-io: Add 'write -f' to test FUA flag Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 25/44] qemu-io: Add 'open -u' to set BDRV_O_UNMAP after the fact Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 26/44] qemu-io: Add 'write -z -u' to test MAY_UNMAP flag Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 27/44] nbd: Use BDRV_REQ_FUA for better FUA where supported Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 28/44] nbd: Detect servers that send unexpected error values Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 29/44] nbd: Avoid magic number for NBD max name size Eric Blake
2016-04-25  9:32   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 30/44] nbd: Treat flags vs. command type as separate fields Eric Blake
2016-04-25  9:34   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 31/44] nbd: Share common reply-sending code in server Eric Blake
2016-04-25  9:34   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 32/44] nbd: Share common option-sending code in client Eric Blake
2016-04-25  9:37   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 33/44] nbd: Let client skip portions of server reply Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 34/44] nbd: Less allocation during NBD_OPT_LIST Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 35/44] nbd: Support shorter handshake Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 36/44] nbd: Improve handling of shutdown requests Eric Blake
2016-04-25  9:47   ` Alex Bligh
2016-04-25 19:20     ` Eric Blake
2016-04-25 19:40       ` Alex Bligh
2016-04-25 19:48         ` Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 37/44] nbd: Create struct for tracking export info Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 38/44] block: Add blk_get_opt_transfer_length() Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 39/44] nbd: Implement NBD_OPT_GO on server Eric Blake
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 40/44] nbd: Implement NBD_OPT_GO on client Eric Blake
2016-04-25 10:31   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 41/44] nbd: Implement NBD_CMD_WRITE_ZEROES on server Eric Blake
2016-04-23  9:00   ` Pavel Borzenkov
2016-04-25 12:11   ` Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 42/44] nbd: Implement NBD_CMD_WRITE_ZEROES on client Eric Blake
2016-04-25 12:12   ` Alex Bligh
2016-04-22 23:40 ` Eric Blake [this message]
2016-04-25 12:16   ` [Qemu-devel] [PATCH v3 43/44] nbd: Implement NBD_OPT_BLOCK_SIZE on server Alex Bligh
2016-04-22 23:40 ` [Qemu-devel] [PATCH v3 44/44] nbd: Implement NBD_OPT_BLOCK_SIZE on client Eric Blake
2016-04-25 12:19   ` Alex Bligh
2016-04-25 19:16     ` Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1461368452-10389-44-git-send-email-eblake@redhat.com \
    --to=eblake@redhat.com \
    --cc=alex@alex.org.uk \
    --cc=kwolf@redhat.com \
    --cc=mreitz@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=qemu-block@nongnu.org \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).