From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [PATCH] mm: ensure useful progress in
	decrease_reservation
Date: Wed, 26 Feb 2014 12:10:52 +0000
Message-ID: <530DD9CC.3020008@citrix.com>
References: <1393415227-32092-1-git-send-email-wei.liu2@citrix.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
In-Reply-To: <1393415227-32092-1-git-send-email-wei.liu2@citrix.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Wei Liu <wei.liu2@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>, xen-devel@lists.xen.org
List-Id: xen-devel@lists.xenproject.org

On 26/02/14 11:47, Wei Liu wrote:
> During my fun time playing with balloon driver I found that hypervisor's
> preemption check kept decrease_reservation from doing any useful work
> for 32 bit guests, resulting in hanging the guests.
>
> As Andrew suggested, we can force the check to fail for the first
> iteration to ensure progress. We did this in d3a55d7d9 "x86/mm: Ensure
> useful progress in alloc_l2_table()" already.
>
> After this change I cannot see the hang caused by continuation logic
> anymore.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>

Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>

As discussed on IRC, this issue was reliably seen with 32bit HVM guests
only.  The suspicion is that the compat layer is sufficiently long that
there is always something pending by the time decrease_reservation()
is called.
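
To illustrate the suspected failure mode, here is a minimal standalone
sketch.  It is not the Xen code: always_pending() is a made-up stub for
hypercall_preempt_check() on a path where something is always pending,
and one_pass() is a hypothetical model of a single entry into the loop.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stub: models a path on which an event is always pending. */
static bool always_pending(void)
{
    return true;
}

/*
 * Model of one entry into the loop.  Returns how many extents were
 * processed before the pass was preempted.  first_iter_exempt selects
 * the patched check (skip the preemption test on the first iteration).
 */
static unsigned int one_pass(unsigned int nr_done, unsigned int nr_extents,
                             bool first_iter_exempt)
{
    unsigned int i;

    assert(nr_done <= nr_extents);

    for ( i = nr_done; i < nr_extents; i++ )
    {
        if ( always_pending() && !(first_iter_exempt && i == nr_done) )
            break;              /* would set a->preempted and bail out */

        /* the real work on extent i would happen here */
    }

    return i - nr_done;
}
```

Under that assumption the unpatched check never completes an extent
(one_pass(0, 8, false) returns 0, so every continuation restarts at the
same place), while the patched check completes exactly one per entry.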

This highlights that the fix for long-running hypercalls (starting with
XSA-45) is almost as bad as the long-running hypercalls themselves.

In XenServer, we have noticed that toolstack operations for
creating/migrating/destroying domains have started failing with 22 second
softlockup timeouts, meaning that individual batched hypercalls (and
their continuations) are now exceeding 22 seconds of wallclock time.

In this case, 32bit HVM guests are reliably being locked-up by Xen,
meaning that for the duration of the vcpu being scheduled, Xen is
consistently bouncing in and out of non-root mode, and running the
compat layer over the hypercall parameters.

Even with the fix in place, 32bit HVM guests will be decreasing their
reservation by a single page for each bounce in and out of non-root mode
and the compat layer, which is a staggering overhead and substantially
worse than a bit of time-skew.

The only solution I can see is for there to be an absolute minimum
amount of work Xen will do before even considering a continuation, and
for that minimum to be rather higher than it is at the moment.
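
As a rough sketch of what I mean (standalone C, not a proposed patch;
struct memop_sketch, preempt_always_pending() and the min_batch
parameter are all hypothetical names, and the worst case of "always
pending" is assumed):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct memop_args; only the fields used here. */
struct memop_sketch {
    unsigned int nr_extents;
    unsigned int nr_done;
    bool preempted;
};

/* Hypothetical stub: worst case, something is always pending. */
static bool preempt_always_pending(void)
{
    return true;
}

/* One entry into the loop: do at least min_batch extents before yielding. */
static void decrease_sketch(struct memop_sketch *a, unsigned int min_batch)
{
    unsigned int i, start = a->nr_done;

    assert(min_batch >= 1);
    a->preempted = false;

    for ( i = start; i < a->nr_extents; i++ )
    {
        /* Only consider a continuation once the minimum work is done. */
        if ( i - start >= min_batch && preempt_always_pending() )
        {
            a->preempted = true;
            break;
        }

        /* the real work on extent i would happen here */
    }

    a->nr_done = i;
}

/* Count how many entries (hypercall plus continuations) a request needs. */
static unsigned int passes_needed(unsigned int nr_extents,
                                  unsigned int min_batch)
{
    struct memop_sketch a = { .nr_extents = nr_extents };
    unsigned int passes = 0;

    do {
        decrease_sketch(&a, min_batch);
        passes++;
    } while ( a.preempted );

    return passes;
}
```

With min_batch == 1 this behaves like the patch below and needs one
entry per extent; a larger minimum divides the number of bounces
accordingly.  What the right minimum is would need measuring.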

~Andrew

> ---
>  xen/common/memory.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 5a0efd5..9d0d32e 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -268,7 +268,7 @@ static void decrease_reservation(struct memop_args *a)
>  
>      for ( i = a->nr_done; i < a->nr_extents; i++ )
>      {
> -        if ( hypercall_preempt_check() )
> +        if ( hypercall_preempt_check() && i != a->nr_done )
>          {
>              a->preempted = 1;
>              goto out;