From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Wei Liu <wei.liu2@citrix.com>
Cc: Keir Fraser <keir@xen.org>, Jan Beulich <JBeulich@suse.com>,
xen-devel@lists.xen.org
Subject: Re: [PATCH] mm: ensure useful progress in decrease_reservation
Date: Wed, 26 Feb 2014 12:10:52 +0000 [thread overview]
Message-ID: <530DD9CC.3020008@citrix.com> (raw)
In-Reply-To: <1393415227-32092-1-git-send-email-wei.liu2@citrix.com>
On 26/02/14 11:47, Wei Liu wrote:
> During my fun time playing with balloon driver I found that hypervisor's
> preemption check kept decrease_reservation from doing any useful work
> for 32 bit guests, resulting in hanging the guests.
>
> As Andrew suggested, we can force the check to fail for the first
> iteration to ensure progress. We did this in d3a55d7d9 "x86/mm: Ensure
> useful progress in alloc_l2_table()" already.
>
> After this change I cannot see the hang caused by continuation logic
> anymore.
>
> Signed-off-by: Wei Liu <wei.liu2@citrix.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Keir Fraser <keir@xen.org>
Reviewed-by: Andrew Cooper <andrew.cooper3@citrix.com>
As discussed on IRC, this issue was reliably seen with 32bit HVM guests
only. The suspicion is that the compat layer is sufficiently long that
the there is always something pending by the time decrease_reservation()
got called.
This highlights that the fix for long-running hypercalls (starting with
XSA-45) is almost as bad as the long-running hypercalls themselves.
In XenServer, we have noticed that toolstack operations for
creating/migrating/destroying domains have started failing 22 second
softlockup timeouts, meaning that individual batched hypercalls (and
their continuations) are now exceeding 22 seconds of wallclock time.
In this case, 32bit HVM guests are reliably being locked-up by Xen,
meaning that for the duration of the vcpu being scheduled, Xen is
consistently bouncing in and out of non-root mode, and running the
compat layer over the hypercall parameters.
Even with the fix in place, 32bit HVM guests will be decreasing by a
single page for each bounce in and out of non-root mode and compat
layer, which is a staggering overhead and substantially worse than a bit
of time-skew.
The only solution I can see is for there to be an absolute minimum
amount of work Xen will do before even considering a continuation, and
for that minimum to be rather higher than it is at the moment.
~Andrew
> ---
> xen/common/memory.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index 5a0efd5..9d0d32e 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -268,7 +268,7 @@ static void decrease_reservation(struct memop_args *a)
>
> for ( i = a->nr_done; i < a->nr_extents; i++ )
> {
> - if ( hypercall_preempt_check() )
> + if ( hypercall_preempt_check() && i != a->nr_done )
> {
> a->preempted = 1;
> goto out;
next prev parent reply other threads:[~2014-02-26 12:10 UTC|newest]
Thread overview: 3+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-02-26 11:47 [PATCH] mm: ensure useful progress in decrease_reservation Wei Liu
2014-02-26 12:10 ` Andrew Cooper [this message]
2014-02-28 15:09 ` Keir Fraser
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=530DD9CC.3020008@citrix.com \
--to=andrew.cooper3@citrix.com \
--cc=JBeulich@suse.com \
--cc=keir@xen.org \
--cc=wei.liu2@citrix.com \
--cc=xen-devel@lists.xen.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.