linux-kernel.vger.kernel.org archive mirror
From: Juergen Gross <jgross@suse.com>
To: "Luis R. Rodriguez" <mcgrof@do-not-panic.com>,
	david.vrabel@citrix.com, konrad.wilk@oracle.com,
	boris.ostrovsky@oracle.com, xen-devel@lists.xenproject.org
Cc: linux-kernel@vger.kernel.org, x86@kernel.org,
	kvm@vger.kernel.org, "Luis R. Rodriguez" <mcgrof@suse.com>,
	Davidlohr Bueso <dbueso@suse.de>, Joerg Roedel <jroedel@suse.de>,
	Borislav Petkov <bp@suse.de>, Jan Beulich <JBeulich@suse.com>,
	Olaf Hering <ohering@suse.de>
Subject: Re: [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT
Date: Thu, 27 Nov 2014 07:36:31 +0100	[thread overview]
Message-ID: <5476C66F.5040308@suse.com> (raw)
In-Reply-To: <1417040805-15857-1-git-send-email-mcgrof@do-not-panic.com>

On 11/26/2014 11:26 PM, Luis R. Rodriguez wrote:
> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>
> Some folks have reported that some Xen hypercalls take a long time
> to complete when issued from the userspace privcmd ioctl mechanism.
> This can happen, for instance, with hypercalls that have many
> sub-operations, such as those using the multicall feature, whereby
> Xen lets one hypercall batch out a series of other hypercalls on
> the hypervisor. At times such hypercalls can even end up triggering
> the TASK_UNINTERRUPTIBLE hung task check (default 120 seconds).
> This is a non-issue on preemptible kernels, as the kernel may
> deschedule such long running tasks. Xen itself also supports
> preempting multicalls, via what Xen calls continuations (see xen
> commit 42217cbc5b which introduced this [0]). On systems without
> CONFIG_PREEMPT though -- a kernel with voluntary or no preemption --
> a long running hypercall will not be descheduled until the hypercall
> completes and the ioctl returns to user space.
>
> To help with this David had originally implemented support for use
> of preempt_schedule_irq() [1] on non CONFIG_PREEMPT kernels. That
> solution never went upstream though, and upon reviewing it to help
> refactor this I've concluded that usage of preempt_schedule_irq()
> would be a bit abusive of existing APIs -- for a few reasons:
>
> 0) we want to avoid spreading its use on non CONFIG_PREEMPT kernels
>
> 1) we want to consider solutions that might work for other
>     hypervisors with this same problem, and identify whether it is
>     an issue even present on other hypervisors or a self-inflicted
>     architectural issue caused by the use of multicalls
>
> 2) there is no documentation or profiling of the exact hypercalls
>     that were causing these issues, nor do we have any context
>     to help evaluate this any further
>
> I at least checked with kvm folks and it seems hypercall preemption
> is not needed there. We can survey other hypervisors...
>
> If 'something like preemption' is needed then CONFIG_PREEMPT
> should just be enabled and encouraged; it seems we want to
> encourage CONFIG_PREEMPT on xen, especially when multicalls are
> used. In the meantime this patch tries to provide a solution to
> help xen on non CONFIG_PREEMPT kernels.
>
> One option tested and evaluated was to put private hypercalls in
> process context; however, this would introduce complexities such
> as originating hypercalls from different contexts. Current xen
> hypercall callback handlers would need to be changed per
> architecture, for instance, and we'd also incur the cost of
> switching between user / kernel state (this cost is also present
> if preempt_schedule_irq() is used). There may be other issues
> introduced by this strategy as well. The simplest *shared*
> alternative is instead to just explicitly schedule() at the end of
> a private hypercall on non preempt kernels. This forces our private
> hypercall mechanism to be fair only on non CONFIG_PREEMPT kernels,
> at the cost of an extra context switch, but keeps the private
> hypercall context intact.
>
> [0] http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=42217cbc5b3e84b8c145d8cfb62dd5de0134b9e8;hp=3a0b9c57d5c9e82c55dd967c84dd06cb43c49ee9
> [1] http://ftp.suse.com/pub/people/mcgrof/xen-preempt-hypercalls/0001-x86-xen-allow-privcmd-hypercalls-to-be-preempted.patch
>
> Cc: Davidlohr Bueso <dbueso@suse.de>
> Cc: Joerg Roedel <jroedel@suse.de>
> Cc: Borislav Petkov <bp@suse.de>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
> Cc: Jan Beulich <JBeulich@suse.com>
> Cc: Juergen Gross <JGross@suse.com>
> Cc: Olaf Hering <ohering@suse.de>
> Cc: David Vrabel <david.vrabel@citrix.com>
> Signed-off-by: Luis R. Rodriguez <mcgrof@suse.com>
> ---
>   drivers/xen/privcmd.c | 3 +++
>   1 file changed, 3 insertions(+)
>
> diff --git a/drivers/xen/privcmd.c b/drivers/xen/privcmd.c
> index 569a13b..e29edba 100644
> --- a/drivers/xen/privcmd.c
> +++ b/drivers/xen/privcmd.c
> @@ -60,6 +60,9 @@ static long privcmd_ioctl_hypercall(void __user *udata)
>   			   hypercall.arg[0], hypercall.arg[1],
>   			   hypercall.arg[2], hypercall.arg[3],
>   			   hypercall.arg[4]);
> +#ifndef CONFIG_PREEMPT
> +	schedule();
> +#endif
>
>   	return ret;
>   }
>

Sorry, I don't think this will solve anything. You're calling schedule()
right after the long running hypercall just nanoseconds before returning
to the user.

I suppose you were misled by the "int 0x82" in [0]. That is the
hypercall from the kernel into the hypervisor, i.e. inside of
privcmd_call().


Juergen


Thread overview: 26+ messages
2014-11-26 22:26 [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT Luis R. Rodriguez
2014-11-27  6:36 ` Juergen Gross [this message]
2014-11-27 18:36   ` Luis R. Rodriguez
2014-11-27 18:46     ` Luis R. Rodriguez
2014-11-27 18:50     ` [Xen-devel] " Andrew Cooper
2014-11-28  4:49       ` Juergen Gross
2014-12-01 11:01         ` David Vrabel
2014-12-01 13:32           ` Luis R. Rodriguez
2014-12-01 14:42             ` Juergen Gross
2014-12-01 15:50               ` Luis R. Rodriguez
2014-11-28 21:51       ` Luis R. Rodriguez
2014-12-01 11:11     ` David Vrabel
2014-12-01 15:05       ` Luis R. Rodriguez
2014-12-01 15:18         ` David Vrabel
2014-12-01 15:44           ` Luis R. Rodriguez
2014-12-01 15:54             ` David Vrabel
2014-12-01 16:19               ` Luis R. Rodriguez
2014-12-01 17:07                 ` Juergen Gross
2014-12-01 17:52                   ` Luis R. Rodriguez
2014-12-01 18:16                 ` [Xen-devel] " David Vrabel
2014-12-01 22:36                   ` Luis R. Rodriguez
2014-12-02 11:11                     ` David Vrabel
2014-12-03  2:28                       ` Luis R. Rodriguez
2014-12-03  4:37                         ` Juergen Gross
2014-12-03 19:39                           ` Luis R. Rodriguez
2014-12-05 16:20                             ` Luis R. Rodriguez
