From: Nicholas Thomas <nick@bytemark.co.uk>
To: Jamie Lokier <jamie@shareable.org>
Cc: Kevin Wolf <kwolf@redhat.com>,
pbonzini@redhat.com, qemu-devel@nongnu.org
Subject: Re: [Qemu-devel] [PATCH 1/3] nbd: Only try to send flush/discard commands if connected to the NBD server
Date: Wed, 24 Oct 2012 13:16:34 +0100
Message-ID: <1351080994.16343.293.camel@eboracum.office.bytemark.co.uk>
In-Reply-To: <20121023150225.GB17080@jl-vm1.vm.bytemark.co.uk>

On Tue, 2012-10-23 at 16:02 +0100, Jamie Lokier wrote:
> Nicholas Thomas wrote:
> > On Tue, 2012-10-23 at 12:33 +0200, Kevin Wolf wrote:
> > > Am 22.10.2012 13:09, schrieb nick@bytemark.co.uk:
> > > >
> > > > This is unlikely to come up now, but is a necessary prerequisite for reconnection
> > > > behaviour.
> > > >
> > > > Signed-off-by: Nick Thomas <nick@bytemark.co.uk>
> > > > ---
> > > > block/nbd.c | 13 +++++++++++--
> > > > 1 files changed, 11 insertions(+), 2 deletions(-)
> > >
> > > What's the real requirement here? Silently ignoring a flush and
> > > returning success for it feels wrong. Why is it correct?
> > >
> > > Kevin
> >
> > I just needed to avoid socket operations while s->sock == -1, and
> > extending the existing case of "can't do the command, so pretend I did
> > it" to "can't do the command right now, so pretend..." seemed like an
> > easy way out.
>
> Hi Nicholas,
>
> Ignoring a flush is another way of saying "corrupt my data" in some
> circumstances. We have options in QEMU already to say whether flushes
> are ignored on normal discs, but if someone's chosen the "I really
> care about my database/filesystem" option, and verified that their NBD
> setup really performs them (in normal circumstances), silently
> dropping flushes from time to time isn't nice.
>
> I would much rather the guest is forced to wait until reconnection and
> then get a successful flush, if the problem is just that the server
> was down briefly. Or, if that is too hard, that the flush is

OK, that's certainly doable. Revised patches hopefully showing up this
week or (at a push) next.

> Since the I/O _order_ before, and sometimes after, flush, is important
> for data integrity, this needs to be maintained when I/Os are queued in
> the disconnected state -- including those which were inflight at the
> time disconnect was detected and then retried on reconnect.

Hmm, discussing this on IRC I was told that it wasn't necessary to
preserve order - although I forget the fine detail. Depending on the
implementation of qemu's coroutine mutexes, operations may not actually
be performed in order right now - it's not too easy to work out what's
happening.
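
For reference, if ordering does turn out to matter, a FIFO replay queue
is one way to get it. The following is a minimal Python sketch, not QEMU
code: OrderedReplayQueue and its methods are hypothetical names, and the
wire is modelled as a plain list. Requests that were in flight when the
disconnect was detected go back to the front of the queue, so they are
resent before anything submitted later.

```python
from collections import deque


class OrderedReplayQueue:
    """Hypothetical helper: holds requests while disconnected and
    replays them in submission order on reconnect."""

    def __init__(self):
        self.connected = False
        self.pending = deque()  # requests waiting for a connection
        self.sent = []          # everything written to the (modelled) wire

    def submit(self, request):
        # Send immediately only if nothing older is still waiting.
        if self.connected and not self.pending:
            self.sent.append(request)
        else:
            self.pending.append(request)

    def on_disconnect(self, inflight):
        # Unacknowledged requests return to the FRONT of the queue,
        # keeping their original relative order.
        self.connected = False
        self.pending.extendleft(reversed(inflight))

    def on_reconnect(self):
        self.connected = True
        while self.pending:
            self.sent.append(self.pending.popleft())


q = OrderedReplayQueue()
q.on_reconnect()               # initial connection
q.submit("write A")            # sent, but never acknowledged
q.on_disconnect(["write A"])   # connection lost with "write A" in flight
q.submit("write B")            # queued while disconnected
q.submit("flush")              # queued behind the writes it must follow
q.on_reconnect()
print(q.sent)
```

The first "write A" in the log is the attempt that was lost with the
connection; after reconnect the retried write, the later write and the
flush go out in their original order.
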

I've also just noticed that flush & discard don't take the send_mutex
before writing to the socket. That can't be intentional, surely? Paolo?
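
To illustrate why a missing send lock could matter, here is a small
model using Python's asyncio. It is a sketch under the assumption that a
send can yield part-way through a request; it is not the block/nbd.c
code. asyncio.Lock stands in for the coroutine mutex, and FakeSocket and
send_request are hypothetical names.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for the NBD socket: records chunks in
    the order they reach the wire."""

    def __init__(self):
        self.wire = []

    async def send_part(self, chunk):
        self.wire.append(chunk)
        await asyncio.sleep(0)  # yield, as a send may when the socket would block


async def send_request(sock, parts, lock=None):
    """Send every part of one request, optionally while holding a lock."""
    async def do_send():
        for part in parts:
            await sock.send_part(part)

    if lock is None:
        await do_send()
    else:
        async with lock:
            await do_send()


async def run(use_lock):
    sock = FakeSocket()
    lock = asyncio.Lock() if use_lock else None
    await asyncio.gather(
        send_request(sock, ["write:hdr", "write:data"], lock),
        send_request(sock, ["flush:hdr"], lock),
    )
    return sock.wire


unlocked = asyncio.run(run(use_lock=False))
locked = asyncio.run(run(use_lock=True))
print("without lock:", unlocked)
print("with lock:   ", locked)
```

Without the lock the flush header lands between the write header and its
payload, which a server would misparse. With the lock each request
reaches the wire as one unit.
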

I've got a proof-of-concept that makes flush, discard, readv_1 and
writev_1 call a common nbd_co_handle_request function. This retries I/Os
when a connection goes away and comes back, and lets me make this patch
unnecessary. It doesn't preserve I/O ordering, but it won't be too hard
to add that if it is actually necessary.
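
The shape of such a common request path can be sketched as follows. This
is a Python model, not the proof-of-concept itself: Connection, transact
and handle_request are hypothetical names, and the failure is simulated
with a counter. The point is only that the caller does not see a result
until the request has really been carried out on a live connection.

```python
import asyncio


class Connection:
    """Hypothetical model of the client's connection state."""

    def __init__(self):
        self.connected = asyncio.Event()
        self.fail_next = 0     # number of upcoming requests that will fail
        self.completed = []    # requests the server actually carried out

    async def transact(self, request):
        if self.fail_next > 0:
            self.fail_next -= 1
            self.connected.clear()          # mark the connection as gone
            raise ConnectionError("server went away")
        self.completed.append(request)
        return 0


async def handle_request(conn, request):
    """Common path for flush, discard, read and write:
    wait for a connection, try the request, retry if it drops."""
    while True:
        await conn.connected.wait()
        try:
            return await conn.transact(request)
        except ConnectionError:
            continue  # go back and wait for the reconnect


async def reconnect_later(conn, delay=0.01):
    await asyncio.sleep(delay)
    conn.connected.set()


async def main():
    conn = Connection()
    conn.connected.set()
    conn.fail_next = 1  # the first attempt hits a dead connection
    reconnector = asyncio.create_task(reconnect_later(conn))
    result = await handle_request(conn, "flush")
    await reconnector
    return result, conn.completed


result, completed = asyncio.run(main())
print(result, completed)
```

Like the proof-of-concept described above, this sketch makes no attempt
to preserve ordering between concurrent requests.
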
> Ignoring a discard is not too bad. However, if discard is retried,
> then I/O order is important in relation to those as well.
>
> > In the Bytemark case, the NBD server always opens the file O_SYNC, so
> > nbd_co_flush could check in_flight == 0 and return 0/1 based on that;
> > but I'd be surprised if that's true for all NBD servers. Should we be
> > returning 1 here for both "not supported" and "can't do it right now",
> > instead?
>
> When the server is opening the file O_SYNC, wouldn't it make sense to
> tell QEMU -- and the guest -- that there's no need to send flushes at
> all, as it's equivalent to a disk with no write-cache (or disabled)?

Probably; our NBD server still implements the old-style handshakes, as
far as I recall, and doesn't actually implement the flush or discard
commands as a result.
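
For context, the "not supported" case comes down to the export flags the
server sends during negotiation. A minimal Python sketch: the flag
values are the ones defined by the NBD protocol, while client_action is
a hypothetical helper and not a QEMU function.

```python
# Export flag bits as defined by the NBD protocol.
NBD_FLAG_HAS_FLAGS = 1 << 0
NBD_FLAG_SEND_FLUSH = 1 << 2
NBD_FLAG_SEND_TRIM = 1 << 5

# Which flag the server must advertise before a command may be sent.
REQUIRED_FLAG = {
    "flush": NBD_FLAG_SEND_FLUSH,
    "discard": NBD_FLAG_SEND_TRIM,
}


def client_action(export_flags, command):
    """Hypothetical helper: return "send" if the server advertised
    support for the command, otherwise "skip"."""
    if export_flags & REQUIRED_FLAG[command]:
        return "send"
    return "skip"


# A server that advertises neither flush nor discard.
plain_server = NBD_FLAG_HAS_FLAGS
# A server that advertises flush only.
flush_server = NBD_FLAG_HAS_FLAGS | NBD_FLAG_SEND_FLUSH

for flags in (plain_server, flush_server):
    print(client_action(flags, "flush"), client_action(flags, "discard"))
```

A server that never sets these bits is never sent the commands, which is
a different situation from a supported command that cannot be sent right
now because the connection is down.
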