qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Pavel Borzenkov <pborzenkov@virtuozzo.com>
To: Wouter Verhelst <w@uter.be>
Cc: nbd-general@lists.sourceforge.net, Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	"Denis V. Lunev" <den@openvz.org>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension
Date: Thu, 24 Mar 2016 10:57:06 +0300	[thread overview]
Message-ID: <20160324075706.GA24831@phobos.sw.ru> (raw)
In-Reply-To: <20160323172116.GA2467@grep.be>

On Wed, Mar 23, 2016 at 06:21:16PM +0100, Wouter Verhelst wrote:
> Hi,
> 
> On Wed, Mar 23, 2016 at 05:16:01PM +0300, Denis V. Lunev wrote:
> > From: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> > 
> > There exist some cases when a client knows that the data it is going to
> > write is all zeroes. Such cases include mirroring or backing up a device
> > implemented by a sparse file.
> > 
> > With current NBD command set, the client has to issue NBD_CMD_WRITE
> > command with zeroed payload and transfer these zero bytes through the
> > wire. The server has to write the data onto disk, effectively denying
> > the sparseness.
> > 
> > To remedy this, the patch adds WRITE_ZEROES extension with one new
> > NBD_CMD_WRITE_ZEROES command.
> > 
> > Signed-off-by: Pavel Borzenkov <pborzenkov@virtuozzo.com>
> > Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Wouter Verhelst <w@uter.be>
> > CC: Paolo Bonzini <pbonzini@redhat.com>
> > CC: Kevin Wolf <kwolf@redhat.com>
> > CC: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  doc/proto.md | 44 ++++++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 44 insertions(+)
> > 
> > diff --git a/doc/proto.md b/doc/proto.md
> > index 463ef8a..cda213c 100644
> > --- a/doc/proto.md
> > +++ b/doc/proto.md
> > @@ -241,6 +241,8 @@ immediately after the global flags field in oldstyle negotiation:
> >    schedule I/O accesses as for a rotational medium
> >  - bit 5, `NBD_FLAG_SEND_TRIM`; should be set to 1 if the server supports
> >    `NBD_CMD_TRIM` commands
> > +- bit 6, `NBD_FLAG_SEND_WRITE_ZEROES`; should be set to 1 if the server
> > +  supports `NBD_CMD_WRITE_ZEROES` commands
> >  
> >  ##### Client flags
> >  
> > @@ -471,6 +473,10 @@ The following request types exist:
> >      about the contents of the export affected by this command, until
> >      overwriting it again with `NBD_CMD_WRITE`.
> >  
> > +* `NBD_CMD_WRITE_ZEROES` (6)
> > +
> > +    Defined by the experimental `WRITE_ZEROES` extension; see below.
> > +
> >  * Other requests
> >  
> >      Some third-party implementations may require additional protocol
> > @@ -594,6 +600,44 @@ option reply type.
> >        message if they do not also send it as a reply to the
> >        `NBD_OPT_SELECT` message.
> >  
> > +### `WRITE_ZEROES` extension
> > +
> > +There exist some cases when a client knows that the data it is going to write
> > +is all zeroes. Such cases include mirroring or backing up a device implemented
> > +by a sparse file. With current NBD command set, the client has to issue
> > +`NBD_CMD_WRITE` command with zeroed payload and transfer these zero bytes
> > +through the wire. The server has to write the data onto disk, effectively
> > +denying the sparseness.
> > +
> > +To remedy this, a `WRITE_ZEROES` extension is envisioned. This extension adds
> > +one new command with two command flags.
> > +
> > +* `NBD_CMD_WRITE_ZEROES` (6)
> > +
> > +    A write request with no payload. Length and offset define the location
> > +    and amount of data to be zeroed.
> > +
> > +    The server MUST zero out the data on disk, and then send the reply
> > +    message. The server MAY send the reply message before the data has
> > +    reached permanent storage.
> > +
> > +    If the `NBD_FLAG_SEND_FUA` flag ("Force Unit Access") was set in the
> > +    export flags field, the client MAY set the flag `NBD_CMD_FLAG_FUA` (bit 0)
> > +    in the command flags field. If this flag was set, the server MUST NOT send
> > +    the reply until it has ensured that the newly-zeroed data has reached
> > +    permanent storage.
> > +
> > +    If the flag `NBD_CMD_FLAG_MAY_TRIM` (bit 1) was set by the client in the
> > +    command flags field, the server MAY use trimming to zero out the area,
> > +    but it MUST ensure that the data reads back as zero.
> > +
> > +    If an error occurs, the server SHOULD set the appropriate error code
> > +    in the error field. The server MAY then close the connection.
> > +
> > +The server SHOULD return `ENOSPC` if it receives a write zeroes request
> > +including one or more sectors beyond the size of the device. It SHOULD
> > +return `EPERM` if it receives a write zeroes request on a read-only export.
> > +
> 
> So, the semantics of your proposed WRITE_ZEROES are exactly the same as
> the WRITE command, except that no payload is sent?
> 
> In that case, I think it's slightly more sensible if we don't add a new
> command, but instead just have an NBD_CMD_FLAG_ZEROES added to the WRITE
> command instead. After all, they're going to be (mostly) the same
> anyway.
> 
> Did you propose a separate command for a specific reason that I'm
> missing (or forgetting), or is that just an oversight?

No, there is no specific reason. Looks like NBD_CMD_FLAG_ZEROES fits the
spec and implementations nicely. So I'll rewrite the extension and add
the flag instead of the whole command.

> 
> -- 
> < ron> I mean, the main *practical* problem with C++, is there's like a dozen
>        people in the world who think they really understand all of its rules,
>        and pretty much all of them are just lying to themselves too.
>  -- #debian-devel, OFTC, 2016-02-12
> 
> ------------------------------------------------------------------------------
> Transform Data into Opportunity.
> Accelerate data analysis in your applications with
> Intel Data Analytics Acceleration Library.
> Click to learn more.
> http://pubads.g.doubleclick.net/gampad/clk?id=278785351&iu=/4140
> _______________________________________________
> Nbd-general mailing list
> Nbd-general@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nbd-general

  reply	other threads:[~2016-03-24  8:31 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Denis V. Lunev
2016-03-23 15:14   ` Eric Blake
2016-03-23 17:40     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24  7:16     ` [Qemu-devel] " Pavel Borzenkov
2016-03-24  7:36       ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21   ` Wouter Verhelst
2016-03-24  7:57     ` Pavel Borzenkov [this message]
2016-03-24  8:26       ` Wouter Verhelst
2016-03-24 11:35         ` Pavel Borzenkov
2016-03-24 11:37         ` Paolo Bonzini
2016-03-24 12:31           ` Wouter Verhelst
2016-03-24 14:53         ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27   ` Eric Blake
2016-03-24 12:30     ` Pavel Borzenkov
2016-03-24 15:04       ` Eric Blake
2016-03-24 16:36         ` Pavel Borzenkov
2016-03-23 17:58   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14     ` Kevin Wolf
2016-03-24  8:25       ` Pavel Borzenkov
2016-03-24  8:41         ` Wouter Verhelst
2016-03-24 11:36           ` Pavel Borzenkov
2016-03-24 12:32             ` Wouter Verhelst
2016-03-24  8:43     ` Pavel Borzenkov
2016-03-24  9:33       ` Wouter Verhelst
2016-03-24 10:32         ` Alex Bligh
2016-03-24 11:58           ` Paolo Bonzini
2016-03-24 12:17             ` Alex Bligh
2016-03-24 12:32               ` Paolo Bonzini
2016-03-24 13:31                 ` Alex Bligh
2016-03-24 13:32                   ` Paolo Bonzini
2016-03-24 11:55     ` Paolo Bonzini
2016-03-24 12:43       ` Wouter Verhelst
2016-03-24 15:25       ` Eric Blake
2016-03-24 15:33         ` Paolo Bonzini
2016-03-24 15:53           ` Wouter Verhelst
2016-03-24 16:04             ` Eric Blake
2016-03-24 16:07               ` Kevin Wolf
2016-03-24 16:47                 ` Wouter Verhelst
2016-03-29  9:38                   ` Kevin Wolf
2016-03-29  9:53                     ` Wouter Verhelst
2016-03-29 10:25                     ` Paolo Bonzini
2016-03-24 22:08   ` [Qemu-devel] " Eric Blake
2016-03-25  8:49     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-25  9:01       ` Alex Bligh
2016-03-28 15:58       ` Eric Blake
2016-04-04 10:32         ` Markus Pargmann
2016-04-04 10:18       ` Markus Pargmann
2016-04-04 16:54         ` Eric Blake
2016-04-04 22:17         ` Wouter Verhelst
2016-04-04 16:40   ` [Qemu-devel] " Eric Blake
2016-04-04 20:16   ` Denis V. Lunev
2016-04-04 20:36     ` [Qemu-devel] [Nbd] " Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160324075706.GA24831@phobos.sw.ru \
    --to=pborzenkov@virtuozzo.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).