From: Wouter Verhelst <w@uter.be>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: nbd-general@lists.sourceforge.net,
"Denis V. Lunev" <den@openvz.org>,
Stefan Hajnoczi <stefanha@redhat.com>,
qemu-devel@nongnu.org, Kevin Wolf <kwolf@redhat.com>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Thu, 24 Mar 2016 13:43:37 +0100 [thread overview]
Message-ID: <20160324124337.GH2870@grep.be> (raw)
In-Reply-To: <56F3D5C7.9070007@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 4593 bytes --]
Hi Paolo,
On Thu, Mar 24, 2016 at 12:55:51PM +0100, Paolo Bonzini wrote:
>
>
> On 23/03/2016 18:58, Wouter Verhelst wrote:
> >> +To provide such class of information, `GET_LBA_STATUS` extension adds new
> >> +`NBD_CMD_GET_LBA_STATUS` command which returns a list of LBA ranges with
> >> +their respective states.
> >> +
> >> +* `NBD_CMD_GET_LBA_STATUS` (7)
> >> +
> >> + An LBA range status query request. Length and offset define the range
> >> + of interest. The server MUST reply with a reply header, followed
> >> + immediately by the following data:
> >
> > As Eric noted, please expand LBA at least once.
>
> Let's just use "block" (e.g. NBD_CMD_GET_BLOCK_STATUS).
That works too :-)
[...]
> > Also, this may end up being a fairly expensive call for the server to
> > process. Is it really useful?
>
> It's always okay for the server to omit NBD_STATE_ZERO, but it can be
> useful if the state is known for some reason. For example, file system
> holes are always zero, but unallocated blocks on a block device are not
> always zero.
>
> However, let's make these bits, so that
>
> NBD_STATE_ALLOCATED (0x1), LBA extent is present on the block device
> NBD_STATE_ZERO (0x2), LBA extent will read as zeroes
>
> and you can have NBD_STATE_ALLOCATED|NBD_STATE_ZERO. File systems do
> have the concept of unwritten extents which would be represented like
> that. The API to access the information (the FIEMAP ioctl) is broken,
> but perhaps in the future a non-broken API could be added---for example
> SEEK_ZERO and SEEK_NONZERO values for lseek's "whence" argument.
Okay, that works for me.
> >> + - `NBD_STATE_DEALLOCATED` (0x2), LBA extent is not present on the
> >> + block device. A client MUST NOT make any assumptions about the
> >> + contents of the extent.
> >> +
> >> + 2. Block dirtiness state
> >> +
> >> + Upon receiving an `NBD_CMD_GET_LBA_STATUS` command with command flags
> >> + field set to `NBD_FLAG_GET_DIRTY` (0x1), the server MUST return
> >> + the dirtiness status of the device. The following dirtiness states
> >> + are defined for the command:
> >> +
> >> + - `NBD_STATE_DIRTY` (0x0), LBA extent is dirty;
> >> + - `NBD_STATE_CLEAN` (0x1), LBA extent is clean.
> >> +
> >> + Generic NBD client implementation without knowledge of a particular NBD
> >> + server operation MUST NOT make any assumption on the meaning of the
> >> + NBD_STATE_DIRTY or NBD_STATE_CLEAN states.
> >
> > That makes it a useless call. A server can read /dev/random to decide
> > whether to send STATE_DIRTY or STATE_CLEAN, and still be compliant with
> > this spec.
> >
> > Either the spec should define what it means for a block to be in a dirty
> > state, or it should not talk about it.
>
> Here is my attempt:
>
> This command is meant to operate in tandem with other (non-NBD)
> channels to the server. Generally, a "dirty" block is a block that
> has been written to by someone, but the exact meaning of "has been
> written" is left to the implementation. For example, a virtual
> machine monitor could provide a (non-NBD) command to start tracking
> blocks written by the virtual machine. A backup client then can
> connect to an NBD server provided by the virtual machine monitor
> and use NBD_CMD_GET_BLOCK_STATUS only read blocks that the virtual
^ to
> machine has changed.
>
> An implementation that doesn't track the "dirtiness" state of blocks
> MUST either fail this command with EINVAL, or mark all blocks as
> dirty in the descriptor that it returns.
That seems saner, yes -- and I'm starting to understand what the
rationale is for this protocol message :-)
I suppose I could also implement that in nbd-server to send out
information about changed blocks if the copy-on-write option has been
switched on.
It might also be possible to add an in-protocol message to start
tracking changes (e.g., a "create checkpoint" message), but I'm not sure
if that's worth it (and it could massively complicate the NBD state
machine and protocol; not sure whether that's worth it)
At any rate, your suggestion does alleviate my concerns.
--
< ron> I mean, the main *practical* problem with C++, is there's like a dozen
people in the world who think they really understand all of its rules,
and pretty much all of them are just lying to themselves too.
-- #debian-devel, OFTC, 2016-02-12
[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 819 bytes --]
next prev parent reply other threads:[~2016-03-24 15:54 UTC|newest]
Thread overview: 54+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Denis V. Lunev
2016-03-23 15:14 ` Eric Blake
2016-03-23 17:40 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24 7:16 ` [Qemu-devel] " Pavel Borzenkov
2016-03-24 7:36 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21 ` Wouter Verhelst
2016-03-24 7:57 ` Pavel Borzenkov
2016-03-24 8:26 ` Wouter Verhelst
2016-03-24 11:35 ` Pavel Borzenkov
2016-03-24 11:37 ` Paolo Bonzini
2016-03-24 12:31 ` Wouter Verhelst
2016-03-24 14:53 ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27 ` Eric Blake
2016-03-24 12:30 ` Pavel Borzenkov
2016-03-24 15:04 ` Eric Blake
2016-03-24 16:36 ` Pavel Borzenkov
2016-03-23 17:58 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14 ` Kevin Wolf
2016-03-24 8:25 ` Pavel Borzenkov
2016-03-24 8:41 ` Wouter Verhelst
2016-03-24 11:36 ` Pavel Borzenkov
2016-03-24 12:32 ` Wouter Verhelst
2016-03-24 8:43 ` Pavel Borzenkov
2016-03-24 9:33 ` Wouter Verhelst
2016-03-24 10:32 ` Alex Bligh
2016-03-24 11:58 ` Paolo Bonzini
2016-03-24 12:17 ` Alex Bligh
2016-03-24 12:32 ` Paolo Bonzini
2016-03-24 13:31 ` Alex Bligh
2016-03-24 13:32 ` Paolo Bonzini
2016-03-24 11:55 ` Paolo Bonzini
2016-03-24 12:43 ` Wouter Verhelst [this message]
2016-03-24 15:25 ` Eric Blake
2016-03-24 15:33 ` Paolo Bonzini
2016-03-24 15:53 ` Wouter Verhelst
2016-03-24 16:04 ` Eric Blake
2016-03-24 16:07 ` Kevin Wolf
2016-03-24 16:47 ` Wouter Verhelst
2016-03-29 9:38 ` Kevin Wolf
2016-03-29 9:53 ` Wouter Verhelst
2016-03-29 10:25 ` Paolo Bonzini
2016-03-24 22:08 ` [Qemu-devel] " Eric Blake
2016-03-25 8:49 ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-25 9:01 ` Alex Bligh
2016-03-28 15:58 ` Eric Blake
2016-04-04 10:32 ` Markus Pargmann
2016-04-04 10:18 ` Markus Pargmann
2016-04-04 16:54 ` Eric Blake
2016-04-04 22:17 ` Wouter Verhelst
2016-04-04 16:40 ` [Qemu-devel] " Eric Blake
2016-04-04 20:16 ` Denis V. Lunev
2016-04-04 20:36 ` [Qemu-devel] [Nbd] " Eric Blake
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160324124337.GH2870@grep.be \
--to=w@uter.be \
--cc=den@openvz.org \
--cc=kwolf@redhat.com \
--cc=nbd-general@lists.sourceforge.net \
--cc=pbonzini@redhat.com \
--cc=qemu-devel@nongnu.org \
--cc=stefanha@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).