qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Eric Blake <eblake@redhat.com>
To: Wouter Verhelst <w@uter.be>
Cc: nbd-general@lists.sourceforge.net, Kevin Wolf <kwolf@redhat.com>,
	qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Markus Pargmann <mpa@pengutronix.de>,
	"Denis V. Lunev" <den@openvz.org>
Subject: Re: [Qemu-devel] [Nbd] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension
Date: Mon, 28 Mar 2016 09:58:52 -0600	[thread overview]
Message-ID: <56F954BC.5010606@redhat.com> (raw)
In-Reply-To: <20160325084929.GA2671@grep.be>

[-- Attachment #1: Type: text/plain, Size: 4659 bytes --]

On 03/25/2016 02:49 AM, Wouter Verhelst wrote:

>> You may also want to add a rule that for all future extensions, any
>> command that requires data in the server response (other than the server
>> response to NBD_CMD_READ) must include a 32-bit length as the first
>> field of its data payload, so that the server reply is always structured.
> 
> Right.
> 
>> Hmm, I still think it would be hard to write a wireshark dissector for
>> server replies, since there is no indication whether a given reply will
>> include data or not.
> 
> Well, a wireshark dissector already exists. However, it is very old; it
> doesn't support the NBD_CMD_TRIM or NBD_CMD_FLUSH commands, it doesn't
> support the NBD_CMD_FLAG_FUA option to NBD_CMD_WRITE, and it doesn't
> deal with negotiation at all. It was written when newstyle negotiation
> didn't exist yet, and scanning for INIT_PASSWD at the time was too hard,
> according to the guy who wrote it (don't remember who that was).

And I've never written a wireshark dissector myself, to know how easy or
hard it would be to extend that work.

> 
>> The _client_ knows (well, any well-written client
>> that uses a different value for 'handle' for every command sent to the
>> server), because it can read the returned 'handle' field to know what
>> command it matches to and therefore what data to expect; but a
>> third-party observer seeing _just_ the server messages has no idea which
>> server responses should have payload.
> 
> It can if it knows enough about the protocol, but granted, that makes it
> harder for us to extend the protocol without breaking the dissector.
> 
>> Scanning the stream and assuming that a new reply starts each time
>> NBD_REPLY_MAGIC is encountered is a close approximation, but hits
>> false positives if the data being returned for NBD_CMD_READ contains
>> the same magic number as part of its contents.
> 
> Indeed.

One benefit of TCP: we can rely on packet boundaries (whether or not
fragmentation is also happening); I'm not sure if UDP shares the same
benefits (then again, I'm not even sure if UDP is usable for the NBD
protocol, since we definitely rely on in-order delivery of packets: a
read command can definitely return more bytes than even a jumbo frame
can contain, even though we do allow out-of-order delivery of replies).
 So if the server always sends each reply as the start of its own packet
(rather than coalescing multiple replies into a single network packet),
and the dissector only looks for magic numbers at the start of a packet,
then any server packet not starting with the magic number must be data
payload, and you could potentially even avoid the false positives by
choosing to break data replies into packets by adjusting packet lengths
by one byte any time the next data chunk would otherwise happen to start
with the same two bytes as the magic number.  But I haven't actually
tested any of this, to know if packet coalescing goes on, and I
certainly don't think it is worth complicating the server to avoid false
positive magic number detection by splitting read payloads across packet
boundaries, just for the benefit of wire-sniffing.

> I like this. However, before committing to it, I would like to see
> Markus' feedback on that (explicit Cc added, even though he's on the
> list, too).
> 
> We'd also need a way to communicate the ability to speak this protocol
> from the kernel to userspace before telling the server that the client
> understands something which maybe its kernel doesn't.

proto.md already documents that ioctl()s exist for the user space to
inform the kernel about options sent by the server prior to kicking off
transmission phase, and that the NBD protocol itself does not go into
full detail about available ioctl()s (the kernel/userspace coordination
does not affect the protocol).  It seems fairly straightforward for the
kernel to supply another ioctl() that userspace can use to learn whether
it is allowed or forbidden from advertising structured reply status
during the handshake phase (including the case where the ioctl() is not
present being treated as the client must not enable structured replies).

> 
> Markus?

Markus is on vacation for a bit, so we'll just have to wait for a reply.

> 
>> I could write up a negotiation of global flags for structured reply
>> lengths as an extension proposal, if you think it is worth it.
> 
> I think it is worth it...

Okay, I'll give it a shot, in a separate thread.

-- 
Eric Blake   eblake redhat com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org


[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 604 bytes --]

  parent reply	other threads:[~2016-03-28 15:59 UTC|newest]

Thread overview: 54+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-03-23 14:16 [Qemu-devel] [PATCH 0/2] NBD protocol extensions: WRITE_ZEROES and GET_LBA_STATUS Denis V. Lunev
2016-03-23 14:16 ` [Qemu-devel] [PATCH 1/2] NBD proto: add WRITE_ZEROES extension Denis V. Lunev
2016-03-23 15:14   ` Eric Blake
2016-03-23 17:40     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-24  7:16     ` [Qemu-devel] " Pavel Borzenkov
2016-03-24  7:36       ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 17:21   ` Wouter Verhelst
2016-03-24  7:57     ` Pavel Borzenkov
2016-03-24  8:26       ` Wouter Verhelst
2016-03-24 11:35         ` Pavel Borzenkov
2016-03-24 11:37         ` Paolo Bonzini
2016-03-24 12:31           ` Wouter Verhelst
2016-03-24 14:53         ` Eric Blake
2016-03-23 14:16 ` [Qemu-devel] [PATCH 2/2] NBD proto: add GET_LBA_STATUS extension Denis V. Lunev
2016-03-23 16:27   ` Eric Blake
2016-03-24 12:30     ` Pavel Borzenkov
2016-03-24 15:04       ` Eric Blake
2016-03-24 16:36         ` Pavel Borzenkov
2016-03-23 17:58   ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-23 18:14     ` Kevin Wolf
2016-03-24  8:25       ` Pavel Borzenkov
2016-03-24  8:41         ` Wouter Verhelst
2016-03-24 11:36           ` Pavel Borzenkov
2016-03-24 12:32             ` Wouter Verhelst
2016-03-24  8:43     ` Pavel Borzenkov
2016-03-24  9:33       ` Wouter Verhelst
2016-03-24 10:32         ` Alex Bligh
2016-03-24 11:58           ` Paolo Bonzini
2016-03-24 12:17             ` Alex Bligh
2016-03-24 12:32               ` Paolo Bonzini
2016-03-24 13:31                 ` Alex Bligh
2016-03-24 13:32                   ` Paolo Bonzini
2016-03-24 11:55     ` Paolo Bonzini
2016-03-24 12:43       ` Wouter Verhelst
2016-03-24 15:25       ` Eric Blake
2016-03-24 15:33         ` Paolo Bonzini
2016-03-24 15:53           ` Wouter Verhelst
2016-03-24 16:04             ` Eric Blake
2016-03-24 16:07               ` Kevin Wolf
2016-03-24 16:47                 ` Wouter Verhelst
2016-03-29  9:38                   ` Kevin Wolf
2016-03-29  9:53                     ` Wouter Verhelst
2016-03-29 10:25                     ` Paolo Bonzini
2016-03-24 22:08   ` [Qemu-devel] " Eric Blake
2016-03-25  8:49     ` [Qemu-devel] [Nbd] " Wouter Verhelst
2016-03-25  9:01       ` Alex Bligh
2016-03-28 15:58       ` Eric Blake [this message]
2016-04-04 10:32         ` Markus Pargmann
2016-04-04 10:18       ` Markus Pargmann
2016-04-04 16:54         ` Eric Blake
2016-04-04 22:17         ` Wouter Verhelst
2016-04-04 16:40   ` [Qemu-devel] " Eric Blake
2016-04-04 20:16   ` Denis V. Lunev
2016-04-04 20:36     ` [Qemu-devel] [Nbd] " Eric Blake

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=56F954BC.5010606@redhat.com \
    --to=eblake@redhat.com \
    --cc=den@openvz.org \
    --cc=kwolf@redhat.com \
    --cc=mpa@pengutronix.de \
    --cc=nbd-general@lists.sourceforge.net \
    --cc=pbonzini@redhat.com \
    --cc=qemu-devel@nongnu.org \
    --cc=stefanha@redhat.com \
    --cc=w@uter.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).