From: Stefan Hajnoczi <stefanha@redhat.com>
To: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>, "Denis V. Lunev" <den@openvz.org>,
	qemu-devel@nongnu.org,
	Raushaniya Maksudova <rmaksudova@virtuozzo.com>
Subject: Re: [Qemu-devel] [PATCH RFC 0/5] disk deadlines
Date: Tue, 8 Sep 2015 11:22:17 +0100
Message-ID: <20150908102217.GA12263@stefanha-thinkpad.redhat.com>
In-Reply-To: <55EEAB55.2070908@redhat.com>

On Tue, Sep 08, 2015 at 11:33:09AM +0200, Paolo Bonzini wrote:
> 
> 
> On 08/09/2015 10:00, Denis V. Lunev wrote:
> > How does the given solution work?
> > 
> > If the disk-deadlines option is enabled for a drive, the completion time
> > of this drive's requests is monitored. The method is as follows (from here
> > on, assume that this option is enabled).
> > 
> > Every drive has its own red-black tree for keeping its requests.
> > The expiration time of the request is the key, and the cookie (the request
> > ID) is the corresponding node. Assume that every request has 8 seconds to
> > complete. If a request is not completed in time for some reason (a server
> > crash or something else), the timer of this drive fires and the
> > corresponding callback requests that the Virtual Machine (VM) be stopped.
> > 
> > The VM remains stopped until all requests from the disk which caused the
> > VM to stop are completed. Furthermore, if there are other disks with
> > 'disk-deadlines=on' whose requests are waiting to be completed, the VM is
> > not started: it waits for completion of all "late" requests from all disks.
> > 
> > Furthermore, all requests which caused the VM to stop (or those that just
> > were not completed in time) can be printed using the "info disk-deadlines"
> > QEMU monitor command as follows:
> 
> This topic has come up several times in the past.
> 
> I agree that the current behavior is not great, but I am not sure that
> timeouts are safe.  For example, how is disk-deadlines=on different from
> NFS soft mounts?  The NFS man page says
> 
>      NB: A so-called "soft" timeout can cause silent data corruption in
>      certain cases.  As such, use the soft option only when client
>      responsiveness is more important than data integrity.  Using NFS
>      over TCP or increasing the value of the retrans option may
>      mitigate some of the risks of using the soft option.
> 
> Note how it only says "mitigate", not solve.

The risky part of "soft" mounts is probably that the client doesn't know
whether or not the request completed, so it doesn't know the state of
the data on the server after a write request.  This is the classic
lost-acknowledgement problem in distributed systems (closer to the Two
Generals' Problem than to Byzantine fault tolerance).

This patch series pauses the guest like rerror=stop.  Therefore it's
different from NFS "soft" mounts, which are like rerror=report.

Guests running without this patch series may suffer from the NFS "soft"
mounts problem when they time out and give up on the I/O request just as
it actually completes on the server, leaving the data in a different
state than expected.

This patch series solves that problem by pausing the guest.  Action can
be taken on the host to bring storage back and resume (similar to
ENOSPC).

In order for this to work well, QEMU's timeout value must be shorter
than the guest's own timeout value.
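
As a rough illustration of the pause-and-resume behaviour discussed above,
here is a minimal Python sketch.  It is not code from the patch series:
the class DeadlineTracker, the function qemu_timeout_is_safe, and the 30
second guest timeout are hypothetical names and values chosen for the
example, and a binary heap stands in for the red-black tree described in
the cover letter.  Only the 8 second request deadline comes from the
cover letter.  The sketch models a single drive; the series also waits
for late requests on every other drive with disk-deadlines=on.

```python
import heapq


class DeadlineTracker:
    """Hypothetical model of one drive's request deadlines (not QEMU code)."""

    def __init__(self, timeout_s=8.0):
        self.timeout_s = timeout_s
        self._heap = []        # (expiry, cookie); min-heap in place of an rbtree
        self._pending = {}     # cookie -> expiry for in-flight requests
        self._late = set()     # cookies that missed their deadline
        self.vm_paused = False

    def submit(self, cookie, now):
        """Record a new request and its expiration time."""
        expiry = now + self.timeout_s
        self._pending[cookie] = expiry
        heapq.heappush(self._heap, (expiry, cookie))

    def tick(self, now):
        """Timer callback: mark expired in-flight requests late, pause the VM."""
        while self._heap and self._heap[0][0] <= now:
            expiry, cookie = heapq.heappop(self._heap)
            if self._pending.get(cookie) == expiry:  # still in flight
                self._late.add(cookie)
        if self._late:
            self.vm_paused = True

    def complete(self, cookie):
        """Request finished; resume only once no late request remains."""
        self._pending.pop(cookie, None)
        self._late.discard(cookie)
        if self.vm_paused and not self._late:
            self.vm_paused = False


def qemu_timeout_is_safe(qemu_timeout_s, guest_timeout_s):
    """QEMU must pause the guest before the guest itself gives up on the I/O."""
    return qemu_timeout_s < guest_timeout_s


tracker = DeadlineTracker(timeout_s=8.0)
tracker.submit(cookie=1, now=0.0)
tracker.submit(cookie=2, now=1.0)
tracker.tick(now=8.5)       # request 1 is late, so the guest is paused
paused_after_timeout = tracker.vm_paused
tracker.complete(2)         # request 2 was on time; guest stays paused
still_paused = tracker.vm_paused
tracker.complete(1)         # last late request done; guest resumes
paused_after_completion = tracker.vm_paused

# 30 seconds is an assumed guest-side timeout, used only for illustration.
ordering_ok = qemu_timeout_is_safe(8.0, 30.0)
```

The point of the ordering check is that the guest never observes a
timeout of its own: it is paused first, and after resume the late request
simply appears to have completed.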

Stefan
