[Qemu-devel] RFC: Let NBD client request read-only mode


* [Qemu-devel] RFC: Let NBD client request read-only mode
@ 2017-11-29 14:57 Eric Blake
  2017-11-30 15:32 ` Wouter Verhelst
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Blake @ 2017-11-29 14:57 UTC
  To: nbd list; +Cc: Qemu-devel@nongnu.org, qemu block

Right now, only the server can choose whether an export is read-only.  A
client can always treat an export as read-only by not sending any
writes, but a server has no guarantee that a client will behave that
way, and must assume that any client of an export for which it did not
advertise NBD_FLAG_READ_ONLY will modify that export.  Therefore, if the
server does not want to permit simultaneous modifications to the
underlying data, it has the choice of either permitting only one client
at a time, or supporting multiple connections but forcing all
subsequent connections to see the NBD_FLAG_READ_ONLY bit on the export
that is already in use by the first connection (note that this is racy:
whoever connects first is the only one that can get write permissions,
even if the first connected client doesn't want to write).

However, at least qemu has a case where it would be nice to permit a 
parallel known-read-only client from the same server that is (or will 
be) handling a read-write client; and what's more, to make it so that 
the read-only client can win the race of being the first connection 
without penalizing the actual read-write connection (see 
https://bugzilla.redhat.com/show_bug.cgi?id=1518543).  I don't see any 
way to accomplish this with oldstyle negotiation (but that doesn't 
matter these days); but with newstyle negotiation, there are at least 
two possible implementations:

Idea 1: the server advertises a new global bit NBD_FLAG_NO_WRITE (ideas 
for a better name?) in its 16-bit handshake flags; if the client replies 
with the same bit set (documentation-wise, we'd name the client reply 
NBD_FLAG_C_NO_WRITE), then the server knows that the client promises to 
be a read-only connection.

Idea 2: we add a new option, NBD_OPT_READ_ONLY.  If the client sends 
this option, and the server replies with NBD_REP_ACK, then the server 
knows that the client promises to be a read-only connection.
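
As a rough illustration of idea 2, the exchange could look like the
sketch below.  The magic numbers and the NBD_REP_* values are the ones
the NBD spec already defines for newstyle option haggling; the option
number given to NBD_OPT_READ_ONLY is purely hypothetical (nothing has
been assigned), and the helper names are made up for this example.

```python
import struct

NBD_OPTS_MAGIC = 0x49484156454F5054   # "IHAVEOPT", per the NBD spec
NBD_REP_MAGIC = 0x0003E889045565A9    # option reply magic, per the NBD spec
NBD_REP_ACK = 1
NBD_REP_ERR_UNSUP = (1 << 31) | 1

# Hypothetical value: no option number has been assigned to this proposal.
NBD_OPT_READ_ONLY = 0x7FFF0001


def build_option_request(option, payload=b""):
    """Encode a newstyle option request: magic, option, length, data."""
    return struct.pack(">QII", NBD_OPTS_MAGIC, option, len(payload)) + payload


def parse_option_reply(header):
    """Decode the fixed 20-byte option reply header."""
    magic, option, reply_type, length = struct.unpack(">QIII", header)
    if magic != NBD_REP_MAGIC:
        raise ValueError("bad option reply magic")
    return option, reply_type, length


def server_accepted_read_only(header):
    """True only if the server answered NBD_OPT_READ_ONLY with NBD_REP_ACK."""
    option, reply_type, _length = parse_option_reply(header)
    return option == NBD_OPT_READ_ONLY and reply_type == NBD_REP_ACK
```

A server that does not know the option would answer with
NBD_REP_ERR_UNSUP, in which case server_accepted_read_only() returns
False and the client falls back to plain read-write negotiation.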

With either idea, once the server knows the client's intent to be a 
read-only client, the server SHOULD set NBD_FLAG_READ_ONLY on all 
(further) information sent for any export (whether from 
NBD_OPT_EXPORT_NAME, NBD_OPT_INFO, or NBD_OPT_GO) and treat any export 
as read-only for the current client, even if that export is in parallel 
use by another read-write client, and the client MUST NOT send 
NBD_CMD_WRITE, NBD_CMD_TRIM, NBD_CMD_WRITE_ZEROES, or any other command 
that requires a writable connection (the NBD_CMD_RESIZE extension comes 
to mind).
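
A minimal sketch of the server side of that rule, under some
assumptions: the flag and command numbers are the ones in the current
NBD spec, the function names are invented for this example, and
answering a forbidden command with NBD_EPERM is one reasonable choice
rather than something this proposal mandates.  NBD_CMD_RESIZE is left
out of the set because it only exists in an experimental extension.

```python
# Transmission flags, per the NBD spec.
NBD_FLAG_HAS_FLAGS = 1 << 0
NBD_FLAG_READ_ONLY = 1 << 1

# Command types, per the NBD spec.
NBD_CMD_READ = 0
NBD_CMD_WRITE = 1
NBD_CMD_TRIM = 4
NBD_CMD_WRITE_ZEROES = 6

NBD_EPERM = 1

# Commands that need a writable connection (extensions may add more).
WRITE_COMMANDS = frozenset({NBD_CMD_WRITE, NBD_CMD_TRIM, NBD_CMD_WRITE_ZEROES})


def transmission_flags(export_flags, client_is_read_only):
    """Flags to advertise to this client for a given export."""
    flags = export_flags | NBD_FLAG_HAS_FLAGS
    if client_is_read_only:
        flags |= NBD_FLAG_READ_ONLY
    return flags


def check_command(cmd_type, client_is_read_only):
    """Return 0 if the command may proceed, NBD_EPERM otherwise."""
    if client_is_read_only and cmd_type in WRITE_COMMANDS:
        return NBD_EPERM
    return 0
```

The point of keeping client_is_read_only per connection is that the
same export can stay writable for a parallel read-write client.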

A client that wants to be read-only, but which does not see server 
support (in idea 1, the server did not advertise the bit; in idea 2, the 
server replies with NBD_REP_ERR_UNSUP), does not have to do anything 
special (it is always possible to do just reads to a read-write 
connection, and the server may still set NBD_FLAG_READ_ONLY even without 
supporting the extension of permitting a client-side request).  But such 
a client may, if it wants to be nice to potential parallel writers on 
the same export, decide to disconnect quickly (with NBD_OPT_ABORT or 
NBD_CMD_DISC as appropriate) rather than tie up a read-write connection.
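
The fallback described above could be sketched as follows for the
NBD_OPT_ variant.  The Action enum, choose_action() and the be_nice
knob are hypothetical names for this example only; the NBD_REP_* and
NBD_FLAG_READ_ONLY values are the ones from the NBD spec.

```python
from enum import Enum

NBD_REP_ACK = 1
NBD_REP_ERR_UNSUP = (1 << 31) | 1
NBD_FLAG_READ_ONLY = 1 << 1  # transmission flag, per the NBD spec


class Action(Enum):
    PROCEED = "proceed"  # carry on and simply never send writes
    ABORT = "abort"      # NBD_OPT_ABORT, or NBD_CMD_DISC after NBD_OPT_GO


def choose_action(reply_type, transmission_flags, be_nice):
    """Decide what a would-be read-only client does next.

    reply_type is the server's answer to NBD_OPT_READ_ONLY,
    transmission_flags are the flags later reported for the export,
    and be_nice says whether the client prefers to get out of the way
    of potential writers when the server could not confirm read-only.
    """
    if reply_type == NBD_REP_ACK:
        return Action.PROCEED
    # Server lacks the extension, but may still have marked the export
    # read-only on its own.
    if transmission_flags & NBD_FLAG_READ_ONLY:
        return Action.PROCEED
    return Action.ABORT if be_nice else Action.PROCEED
```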

I don't know which idea is more palatable.  The global handshake flags
are a 16-bit bitmask, so there are only 2^4 = 16 of them, of which only
14 remain unassigned; whereas we have almost 2^32 potential NBD_OPT_
values.  On the other hand, using a global handshake flag means the
server never shows any export as writable to that client; while with
the NBD_OPT_ solution, a client can get different results for the
sequence NBD_OPT_INFO, NBD_OPT_READ_ONLY, NBD_OPT_INFO.  There's also
the question with idea 2 of whether permitting NBD_OPT_READ_ONLY prior
to NBD_OPT_STARTTLS would make sense (is there any case where the set
of TLS authentication to be performed can involve looser requirements
for a known-read-only client?), whereas using a global bit makes the
sequence of required NBD_OPT_* a bit less stateful.
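
To make the bit budget of idea 1 concrete, here is a small sketch.
NBD_FLAG_FIXED_NEWSTYLE and NBD_FLAG_NO_ZEROES are the two handshake
flags the spec defines today; placing NBD_FLAG_NO_WRITE and
NBD_FLAG_C_NO_WRITE at bit 2 is an assumption made only for this
example, as are the helper names.

```python
HANDSHAKE_FLAG_BITS = 16

# Existing server handshake flags, per the NBD spec.
NBD_FLAG_FIXED_NEWSTYLE = 1 << 0
NBD_FLAG_NO_ZEROES = 1 << 1

# Hypothetical bit positions for the proposed flag and its client echo.
NBD_FLAG_NO_WRITE = 1 << 2
NBD_FLAG_C_NO_WRITE = 1 << 2


def unassigned_bits(assigned_mask):
    """Number of handshake flag bits still free in the 16-bit mask."""
    return HANDSHAKE_FLAG_BITS - bin(assigned_mask & 0xFFFF).count("1")


def client_promised_read_only(server_flags, client_flags):
    """Idea 1: the promise holds only if both sides set the bit."""
    return bool(server_flags & NBD_FLAG_NO_WRITE) and bool(
        client_flags & NBD_FLAG_C_NO_WRITE
    )
```

With only the two existing flags assigned, unassigned_bits() reports
14; adopting idea 1 would bring that down to 13.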

Does the idea sound reasonable enough to propose wording to add it to 
the NBD spec and an implementation in qemu?  Which of the two ideas is 
preferred for letting the client inform the server of its intent?

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] RFC: Let NBD client request read-only mode
  2017-11-29 14:57 [Qemu-devel] RFC: Let NBD client request read-only mode Eric Blake
@ 2017-11-30 15:32 ` Wouter Verhelst
  2017-11-30 16:00   ` Eric Blake
  0 siblings, 1 reply; 4+ messages in thread
From: Wouter Verhelst @ 2017-11-30 15:32 UTC
  To: Eric Blake; +Cc: nbd list, Qemu-devel@nongnu.org, qemu block

On Wed, Nov 29, 2017 at 08:57:20AM -0600, Eric Blake wrote:
> Right now, only the server can choose whether an export is read-only.  A
> client can always treat an export as read-only by not sending any writes,
> but a server has no guarantee that a client will behave that way, and must
> assume that any client of an export for which it did not advertise
> NBD_FLAG_READ_ONLY will modify that export.  Therefore, if the server does
> not want to permit simultaneous modifications to the underlying data, it
> has the choice of either permitting only one client at a time, or
> supporting multiple connections but forcing all subsequent connections to
> see the NBD_FLAG_READ_ONLY bit on the export that is already in use by the
> first connection (note that this is racy: whoever connects first is the
> only one that can get write permissions, even if the first connected
> client doesn't want to write).
> 
> However, at least qemu has a case where it would be nice to permit a
> parallel known-read-only client from the same server that is (or will be)
> handling a read-write client; and what's more, to make it so that the
> read-only client can win the race of being the first connection without
> penalizing the actual read-write connection (see
> https://bugzilla.redhat.com/show_bug.cgi?id=1518543).

Right, I can see the dilemma.

(A possible workaround would be for the server to have two versions of the
same export, one marked read-only and one not, but that is a bit ugly.)

> I don't see any way to accomplish this with oldstyle negotiation (but that
> doesn't matter these days); but with newstyle negotiation, there are at least
> two possible implementations:
> 
> Idea 1: the server advertises a new global bit NBD_FLAG_NO_WRITE (ideas for
> a better name?) in its 16-bit handshake flags; if the client replies with
> the same bit set (documentation-wise, we'd name the client reply
> NBD_FLAG_C_NO_WRITE), then the server knows that the client promises to be a
> read-only connection.

I'd rather not burn a global bit for this.

> Idea 2: we add a new option, NBD_OPT_READ_ONLY.  If the client sends this
> option, and the server replies with NBD_REP_ACK, then the server knows that
> the client promises to be a read-only connection.
> 
> With either idea, once the server knows the client's intent to be a
> read-only client, the server SHOULD set NBD_FLAG_READ_ONLY on all (further)
> information sent for any export (whether from NBD_OPT_EXPORT_NAME,
> NBD_OPT_INFO, or NBD_OPT_GO) and treat any export as read-only for the
> current client, even if that export is in parallel use by another read-write
> client, and the client MUST NOT send NBD_CMD_WRITE, NBD_CMD_TRIM,
> NBD_CMD_WRITE_ZEROES, or any other command that requires a writable
> connection (the NBD_CMD_RESIZE extension comes to mind).

Right.

> A client that wants to be read-only, but which does not see server support
> (in idea 1, the server did not advertise the bit; in idea 2, the server
> replies with NBD_REP_ERR_UNSUP), does not have to do anything special (it is
> always possible to do just reads to a read-write connection, and the server
> may still set NBD_FLAG_READ_ONLY even without supporting the extension of
> permitting a client-side request).  But such a client may, if it wants to be
> nice to potential parallel writers on the same export, decide to disconnect
> quickly (with NBD_OPT_ABORT or NBD_CMD_DISC as appropriate) rather than tie
> up a read-write connection.

Indeed.

> I don't know which idea is more palatable.  The global handshake flags are
> a 16-bit bitmask, so there are only 2^4 = 16 of them, of which only 14
> remain unassigned; whereas we have almost 2^32 potential NBD_OPT_ values.
> On the other hand, using a global handshake flag means the server never
> shows any export as writable to that client; while with the NBD_OPT_
> solution, a client can get different results for the sequence
> NBD_OPT_INFO, NBD_OPT_READ_ONLY, NBD_OPT_INFO.

It might also be a good idea to add another data item to the
NBD_OPT_INFO response which tells the client that it will be the
only writer, but that there may be other readers.

That way, if a client sees that data item, it could go "oh, but I don't
need to write -- here's an NBD_OPT_READ_ONLY for you".

> There's also the question with option 2 of whether permitting
> NBD_OPT_READ_ONLY prior to NBD_OPT_STARTTLS would make sense (is there any
> case where the set of TLS authentication to be performed can involve looser
> requirements for a known-read-only client?),

Yes, but if a server wants to allow writing to a device based on whether
a client is authenticated or not, then all it needs to do is to set the
read-only flag based on whether that client is authenticated or not.

It might still be useful to signal to a client somehow that it could get
more rights if it provided authentication credentials of some sort, but
that is not entirely related to whether or not a client declares that it
will write to the device.

> where using a global bit makes the sequence of required NBD_OPT_* a
> bit less stateful.
> 
> Does the idea sound reasonable enough to propose wording to add it to the
> NBD spec and an implementation in qemu?  Which of the two ideas is preferred
> for letting the client inform the server of its intent?

I think it sounds reasonable enough, yes; but I also think there are a
few other related situations that might be relevant enough to warrant
thinking about more. I gave a few examples above, but maybe there are
more? Dunno.

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
     Hacklab


* Re: [Qemu-devel] RFC: Let NBD client request read-only mode
  2017-11-30 15:32 ` Wouter Verhelst
@ 2017-11-30 16:00   ` Eric Blake
  2017-11-30 17:43     ` Wouter Verhelst
  0 siblings, 1 reply; 4+ messages in thread
From: Eric Blake @ 2017-11-30 16:00 UTC
  To: Wouter Verhelst; +Cc: nbd list, Qemu-devel@nongnu.org, qemu block

On 11/30/2017 09:32 AM, Wouter Verhelst wrote:

>> A client that wants to be read-only, but which does not see server support
>> (in idea 1, the server did not advertise the bit; in idea 2, the server
>> replies with NBD_REP_ERR_UNSUP), does not have to do anything special (it is
>> always possible to do just reads to a read-write connection, and the server
>> may still set NBD_FLAG_READ_ONLY even without supporting the extension of
>> permitting a client-side request).  But such a client may, if it wants to be
>> nice to potential parallel writers on the same export, decide to disconnect
>> quickly (with NBD_OPT_ABORT or NBD_CMD_DISC as appropriate) rather than tie
>> up a read-write connection.
> 
> Indeed.
> 
>> I don't know which idea is more palatable.  The global handshake flags are
>> a 16-bit bitmask, so there are only 2^4 = 16 of them, of which only 14
>> remain unassigned; whereas we have almost 2^32 potential NBD_OPT_ values.
>> On the other hand, using a global handshake flag means the server never
>> shows any export as writable to that client; while with the NBD_OPT_
>> solution, a client can get different results for the sequence
>> NBD_OPT_INFO, NBD_OPT_READ_ONLY, NBD_OPT_INFO.
> 
> It might additionally also be a good idea to add another data item to
> the NBD_OPT_INFO response which tells the client that it will be the
> only writer, but that there may be other readers.
> 
> That way, if a client sees that data item, it could go "oh, but I don't
> need to write -- here's an NBD_OPT_READ_ONLY for you".

There's also the question of inconsistent reads.  Normally, if a client
is the only client of an export, it can cache data rather than having to
re-read it from the server; whereas if there is another writer in
parallel, the client SHOULD read data again rather than relying on the
cache (since the other writer may have changed data in the meantime).
So maybe we want a way for the server to report whether reads may be
inconsistent, or even to fail NBD_OPT_GO with an error unless the client
has indicated (via NBD_OPT_READ_ONLY or some other way) that it is aware
of the potential for volatile/inconsistent reads.
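
The caching concern can be sketched like this.  The CachingReader
class, the fetch callback and the reads_may_be_inconsistent flag are
all hypothetical; in particular, no such indication exists in the NBD
protocol today, so the flag stands in for whatever signal the server
might eventually provide.

```python
class CachingReader:
    """Read-through cache that is bypassed when reads may be volatile."""

    def __init__(self, fetch, reads_may_be_inconsistent):
        self._fetch = fetch                        # callable(offset, length) -> bytes
        self._volatile = reads_may_be_inconsistent
        self._cache = {}

    def read(self, offset, length):
        key = (offset, length)
        if not self._volatile and key in self._cache:
            return self._cache[key]
        data = self._fetch(offset, length)
        if not self._volatile:
            self._cache[key] = data
        return data
```

With reads_may_be_inconsistent set, every read() goes back to the
server; without it, a repeated read of the same range is served from
the local cache.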

> I think it sounds reasonable enough, yes; but I also think there are a
> few other related situations that might be relevant enough to warrant
> thinking about more. I gave a few examples above, but maybe there are
> more? Dunno.

Okay, sounds like there is enough to discuss that I should write it up
as a patch, and the patch may need a new extension branch rather than
going straight into mainline; and I'll stick with idea 2
(NBD_OPT_READ_ONLY) rather than burning a global handshake bit.  I hope
to give the documentation patch a shot in the next few days.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


* Re: [Qemu-devel] RFC: Let NBD client request read-only mode
  2017-11-30 16:00   ` Eric Blake
@ 2017-11-30 17:43     ` Wouter Verhelst
  0 siblings, 0 replies; 4+ messages in thread
From: Wouter Verhelst @ 2017-11-30 17:43 UTC
  To: Eric Blake; +Cc: nbd list, Qemu-devel@nongnu.org, qemu block

On Thu, Nov 30, 2017 at 10:00:46AM -0600, Eric Blake wrote:
> On 11/30/2017 09:32 AM, Wouter Verhelst wrote:
> > > A client that wants to be read-only, but which does not see server support
> > > (in idea 1, the server did not advertise the bit; in idea 2, the server
> > > replies with NBD_REP_ERR_UNSUP), does not have to do anything special (it is
> > > always possible to do just reads to a read-write connection, and the server
> > > may still set NBD_FLAG_READ_ONLY even without supporting the extension of
> > > permitting a client-side request).  But such a client may, if it wants to be
> > > nice to potential parallel writers on the same export, decide to disconnect
> > > quickly (with NBD_OPT_ABORT or NBD_CMD_DISC as appropriate) rather than tie
> > > up a read-write connection.
> > 
> > Indeed.
> > 
> > > I don't know which idea is more palatable.  The global handshake flags are
> > > a 16-bit bitmask, so there are only 2^4 = 16 of them, of which only 14
> > > remain unassigned; whereas we have almost 2^32 potential NBD_OPT_ values.
> > > On the other hand, using a global handshake flag means the server never
> > > shows any export as writable to that client; while with the NBD_OPT_
> > > solution, a client can get different results for the sequence
> > > NBD_OPT_INFO, NBD_OPT_READ_ONLY, NBD_OPT_INFO.
> > 
> > It might additionally also be a good idea to add another data item to
> > the NBD_OPT_INFO response which tells the client that it will be the
> > only writer, but that there may be other readers.
> > 
> > That way, if a client sees that data item, it could go "oh, but I don't
> > need to write -- here's an NBD_OPT_READ_ONLY for you".
> 
> There's also the question of inconsistent reads - normally, if a client is
> the only reader, the client can cache data rather than having to re-read it
> from the server; whereas if there is another writer in parallel, the client
> SHOULD read data again rather than relying on the cache (since the other
> writer may have changed data in the meantime) - so maybe having a way for
> the server to report whether reads may be inconsistent, or even give an
> error to NBD_OPT_GO unless the client requests (via NBD_OPT_READ_ONLY or
> some other way) that the client is aware of the potential for
> volatile/inconsistent reads.

I'm wary of doing this.

We can never guarantee 100% that there is no writer (some other process
might write to the backend behind the server's back and it would have no
way of knowing that).

Adding a message "there is another writer" would imply that the absence
of that message means "there is no other writer", which would then be
incorrect. As such, I don't think this is something we can properly cope
with at the NBD level.

With good arguments I could be convinced otherwise, however :-)

> > I think it sounds reasonable enough, yes; but I also think there are a
> > few other related situations that might be relevant enough to warrant
> > thinking about more. I gave a few examples above, but maybe there are
> > more? Dunno.
> 
> Okay, sounds like it warrants enough potential for conversation that I
> should write it up as a patch, and the patch may need a new extension
> branch rather than going straight into mainline;

Yeah, probably best to do that.

> and I'll stick with idea 2 (NBD_OPT_READ_ONLY) rather than burning a
> global handshake bit. I hope to give the documentation patch a shot in
> the next few days.

Thanks.

-- 
Could you people please use IRC like normal people?!?

  -- Amaya Rodrigo Sastre, trying to quiet down the buzz in the DebConf 2008
     Hacklab

