qemu-devel.nongnu.org archive mirror
From: Eric Blake <eblake@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>, Wouter Verhelst <w@uter.be>,
	"Denis V. Lunev" <den@openvz.org>
Cc: nbd-general@lists.sourceforge.net, Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 09:25:27 -0600
Message-ID: <56F406E7.4010207@redhat.com>
In-Reply-To: <56F3D5C7.9070007@redhat.com>

On 03/24/2016 05:55 AM, Paolo Bonzini wrote:
>> As Eric noted, please expand LBA at least once.
> 
> Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS).

Yes, avoiding the term LBA and using BLOCK everywhere also nicely solves
the problem of introducing yet more terminology.

> 
>>> +      - 32 bits, length of parameter data that follow (unsigned)
>>> +      - zero or more LBA status descriptors, each having the following
>>> +        structure:
>>> +
>>> +        * 64 bits, offset (unsigned)
>>> +        * 32 bits, length (unsigned)
>>> +        * 16 bits, status (unsigned)
>>> +
>>> +    unless an error condition has occurred.
>>> +
> 
> Can we just return one descriptor?  That would simplify the protocol a bit.

As in, the return is exactly one descriptor, consisting of:

* 32 bits, length (unsigned): must be > 0, <= the client's length
* 16 bits, status (unsigned): status of that block

Of course, it means more traffic. The nice part about returning an array
of descriptors is that I can learn the status of 1G of the file in just
one client call, even if the file alternates between extent statuses
every 512 bytes. Returning only a single descriptor at a time means I'd
have to make 2M client calls (1G / 512) to learn the same pattern of
allocation. Fortunately, in the common case, allocation patterns tend
not to be that fragmented.
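
To put numbers on that trade-off, here is a rough Python sketch of the
single-descriptor reply and the call count it implies. The names
(DESCRIPTOR, pack_descriptor, calls_needed) are made up for illustration
and are not part of any proposal; the only things taken from the thread
are the 32-bit length and 16-bit status fields, and I am assuming the
usual NBD network byte order.

```python
import struct

# Assumed wire layout for the one-descriptor reply discussed above:
# 32-bit unsigned length, then 16-bit unsigned status, big-endian.
DESCRIPTOR = struct.Struct(">IH")


def pack_descriptor(length, status, requested):
    """Encode one descriptor; length must be > 0 and <= the client's length."""
    if not 0 < length <= requested:
        raise ValueError("length must be > 0 and <= the client's length")
    return DESCRIPTOR.pack(length, status)


def calls_needed(region_bytes, extent_bytes):
    """One reply covers one extent, so the client needs one call per extent."""
    return -(-region_bytes // extent_bytes)  # ceiling division
```

With these definitions, calls_needed(1 << 30, 512) gives the 2M figure
for the worst-case alternating pattern.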

On the other hand, returning only one descriptor at a time (for possibly
less length than the client requested) may be easier when using
lseek(SEEK_DATA/HOLE) as the mechanism for determining the bounds of
each extent, since the server only has to search once per command,
instead of dynamically constructing the entire reply.
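
As a sketch of what that server-side lookup might look like, the Python
below finds the single extent starting at the requested offset. The
function name one_extent is hypothetical, and I am assuming a Linux host
where os.SEEK_DATA and os.SEEK_HOLE are available, a caller that has
already checked the offset lies within the file, and that a hole maps to
whatever "not data" status the protocol ends up with. It needs at most
two lseek calls per command.

```python
import errno
import os


def one_extent(fd, offset, length):
    """Return (extent_length, is_hole) for the extent that starts at offset.

    extent_length may be shorter than the requested length.
    """
    end = offset + length
    try:
        data_start = os.lseek(fd, offset, os.SEEK_DATA)
    except OSError as exc:
        if exc.errno != errno.ENXIO:
            raise
        # ENXIO: no data at or after offset, so the rest is a hole.
        return length, True
    if data_start > offset:
        # offset sits in a hole that ends where the next data begins.
        return min(data_start, end) - offset, True
    # offset sits in data; the run ends where the next hole begins
    # (end of file counts as an implicit hole).
    hole_start = os.lseek(fd, offset, os.SEEK_HOLE)
    return min(hole_start, end) - offset, False
```

Building an array of descriptors instead would mean looping this until
the requested length is covered, which is the dynamic reply construction
mentioned above.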

I don't have any strong opinions on which would be better, but it is
definitely food for thought.

> 
> However, let's make these bits, so that
> 
> NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device
> NBD_STATE_ZERO (0x2), LBA extent will read as zeroes

Should we flip the sense and call this NBD_STATE_UNALLOCATED (0 means
allocated, 1 means not present), so that an overall status of 0 is a
safe default?  (That is, it should always be safe to state a sector is
allocated when it is not, and always safe to state a sector is not known
to read as zeroes even if that happens to be its contents - all that we
lose by reporting this safe default state is that the client will be
unable to optimize for skipping holes).
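
A small Python sketch of the flipped-sense idea, to show why 0 works as
the safe default. NBD_STATE_UNALLOCATED and NBD_STATE_ZERO are the names
from this thread, but the bit values and the helper functions are only
assumptions for illustration.

```python
# Assumed bit values; only the names come from the discussion.
NBD_STATE_UNALLOCATED = 0x1  # set: extent is not present on the device
NBD_STATE_ZERO = 0x2         # set: extent is known to read as zeroes


def describe(status):
    """Return (allocated, known_zero) for a status word."""
    allocated = not (status & NBD_STATE_UNALLOCATED)
    known_zero = bool(status & NBD_STATE_ZERO)
    return allocated, known_zero


def can_skip_read(status):
    """A client may skip reading only when the server asserts zeroes."""
    return bool(status & NBD_STATE_ZERO)
```

A server that knows nothing can answer 0 for everything: describe(0)
reports allocated and not known to be zero, so the client just reads all
the data and loses nothing but the optimization.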

>> Either the spec should define what it means for a block to be in a dirty
>> state, or it should not talk about it.
> 
> Here is my attempt:
> 
>     This command is meant to operate in tandem with other (non-NBD)
>     channels to the server.  Generally, a "dirty" block is a block that
>     has been written to by someone, but the exact meaning of "has been
>     written" is left to the implementation.  For example, a virtual
>     machine monitor could provide a (non-NBD) command to start tracking
>     blocks written by the virtual machine.  A backup client then can
>     connect to an NBD server provided by the virtual machine monitor
>     and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual

s/only/to only/

>     machine has changed.

s/changed/changed since it started tracking/

> 
>     An implementation that doesn't track the "dirtiness" state of blocks
>     MUST either fail this command with EINVAL, or mark all blocks as
>     dirty in the descriptor that it returns.

Is it feasible to return zero/allocated/dirty status all at the same
time, or do we want to strictly require two different modes of
operation?  That is, if we are returning zero and allocated as two bits,
can we also return a third bit for dirty/clean?  Should we flip the
sense of the bit, where 0 means dirty and 1 means clean, again so that a
server can always return a status of 0 as the safe default?
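
If we did go with a single status word, one possible shape is sketched
below in Python. This is only an illustration of the question, not a
proposal: the CLEAN bit, its value, and the names BlockStatus and
needs_backup are all hypothetical.

```python
import enum


class BlockStatus(enum.IntFlag):
    # All values are assumptions; 0 is the safe default in every bit:
    # allocated, not known to be zero, dirty.
    UNALLOCATED = 0x1
    ZERO = 0x2
    CLEAN = 0x4


def needs_backup(status):
    """A backup client must copy every extent not explicitly marked clean."""
    return not (BlockStatus(status) & BlockStatus.CLEAN)
```

A server that does not track dirtiness never sets CLEAN, so
needs_backup(0) is true for every extent and the client falls back to
copying everything.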

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


