From mboxrd@z Thu Jan  1 00:00:00 1970
Received: from eggs.gnu.org ([2001:4830:134:3::10]:43784)
	by lists.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZZFHX-0001mp-Nk
	for qemu-devel@nongnu.org; Tue, 08 Sep 2015 05:33:40 -0400
Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZZFHS-00031U-Pl
	for qemu-devel@nongnu.org; Tue, 08 Sep 2015 05:33:39 -0400
Received: from mail-wi0-x229.google.com ([2a00:1450:400c:c05::229]:34015)
	by eggs.gnu.org with esmtp (Exim 4.71)
	(envelope-from <paolo.bonzini@gmail.com>) id 1ZZFHS-0002xd-Jq
	for qemu-devel@nongnu.org; Tue, 08 Sep 2015 05:33:34 -0400
Received: by wicfx3 with SMTP id fx3so112838043wic.1
	for <qemu-devel@nongnu.org>; Tue, 08 Sep 2015 02:33:12 -0700 (PDT)
Sender: Paolo Bonzini <paolo.bonzini@gmail.com>
References: <1441699228-25767-1-git-send-email-den@openvz.org>
From: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <55EEAB55.2070908@redhat.com>
Date: Tue, 8 Sep 2015 11:33:09 +0200
MIME-Version: 1.0
In-Reply-To: <1441699228-25767-1-git-send-email-den@openvz.org>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH RFC 0/5] disk deadlines
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <http://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
	<mailto:qemu-devel-request@nongnu.org?subject=subscribe>
To: "Denis V. Lunev" <den@openvz.org>
Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org, Stefan Hajnoczi <stefanha@redhat.com>, Raushaniya Maksudova <rmaksudova@virtuozzo.com>


On 08/09/2015 10:00, Denis V. Lunev wrote:
> How does the given solution work?
> 
> If the disk-deadlines option is enabled for a drive, the completion time of
> that drive's requests is monitored. The method is as follows (assume from
> here on that the option is enabled).
> 
> Every drive has its own red-black tree for tracking its requests.
> The expiration time of a request is the key, and the cookie (the id of the
> request) is the corresponding node. Assume that every request has 8 seconds
> to complete. If a request does not complete in time for some reason (a server
> crash or something else), the drive's timer fires and the corresponding
> callback requests that the Virtual Machine (VM) be stopped.
> 
> The VM remains stopped until all requests from the disk that caused the stop
> have completed. Furthermore, if there are other disks with
> 'disk-deadlines=on' whose requests are still waiting to complete, the VM is
> not started: it waits for completion of all "late" requests from all disks.
> 
> Furthermore, all requests that caused the VM to stop (or that simply were not
> completed in time) can be printed using the "info disk-deadlines" qemu monitor
> option as follows:

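To make the quoted mechanism concrete, here is a rough toy model of it.
This is only a sketch under assumptions, not the code from the patch
series: the names Drive, Machine, submit, complete and update are
hypothetical, time is passed in explicitly instead of coming from a real
timer, and a binary heap with lazy deletion stands in for the red-black
tree (both give access to the earliest expiration time).

```python
import heapq
import itertools

DEADLINE_SECONDS = 8.0  # per-request budget taken from the quoted description


class Drive:
    """Toy per-drive tracker of in-flight requests, ordered by expiration time."""

    def __init__(self, name):
        self.name = name
        self._heap = []        # (expire_at, cookie); stands in for the red-black tree
        self._inflight = {}    # cookie -> expire_at for requests not yet completed
        self._cookies = itertools.count(1)

    def submit(self, now):
        """Register a new request and return its cookie (request id)."""
        cookie = next(self._cookies)
        expire_at = now + DEADLINE_SECONDS
        self._inflight[cookie] = expire_at
        heapq.heappush(self._heap, (expire_at, cookie))
        return cookie

    def complete(self, cookie):
        """Mark a request as finished; its heap entry is dropped lazily."""
        self._inflight.pop(cookie, None)

    def earliest_deadline(self):
        """Return the smallest expiration time among in-flight requests, or None."""
        while self._heap and self._heap[0][1] not in self._inflight:
            heapq.heappop(self._heap)  # discard entries of completed requests
        return self._heap[0][0] if self._heap else None

    def has_late_request(self, now):
        earliest = self.earliest_deadline()
        return earliest is not None and earliest <= now


class Machine:
    """Toy VM that is stopped while any drive has a request past its deadline."""

    def __init__(self, drives):
        self.drives = drives
        self.running = True

    def update(self, now):
        """Stand-in for the timer/completion callbacks: stop or resume the VM."""
        self.running = not any(d.has_late_request(now) for d in self.drives)
```

For example, with two drives whose requests were submitted at t=0 and
t=1, the model keeps the VM stopped at t=10 until both late requests
have completed:

    a, b = Drive("a"), Drive("b")
    vm = Machine([a, b])
    r1, r2 = a.submit(now=0.0), b.submit(now=1.0)
    vm.update(now=10.0)                      # vm.running is False
    a.complete(r1); vm.update(now=10.0)      # still False, b is late
    b.complete(r2); vm.update(now=10.0)      # vm.running is True
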
This topic has come up several times in the past.

I agree that the current behavior is not great, but I am not sure that
timeouts are safe.  For example, how is disk-deadlines=on different from
NFS soft mounts?  The NFS man page says

     NB: A so-called "soft" timeout can cause silent data corruption in
     certain cases.  As such, use the soft option only when client
     responsiveness is more important than data integrity.  Using NFS
     over TCP or increasing the value of the retrans option may
     mitigate some of the risks of using the soft option.

Note that it only says "mitigate", not "solve".

Paolo