Re: [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT

From: Juergen Gross <jgross@suse.com>
To: "Luis R. Rodriguez" <mcgrof@suse.com>,
	David Vrabel <david.vrabel@citrix.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>, Oleg Nesterov <oleg@redhat.com>,
	Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>,
	boris.ostrovsky@oracle.com, xen-devel@lists.xenproject.org,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	x86@kernel.org, kvm@vger.kernel.org,
	Davidlohr Bueso <dbueso@suse.de>, Joerg Roedel <jroedel@suse.de>,
	Borislav Petkov <bp@suse.de>, Jan Beulich <JBeulich@suse.com>,
	Olaf Hering <ohering@suse.de>
Subject: Re: [PATCH] xen: privcmd: schedule() after private hypercall when non CONFIG_PREEMPT
Date: Mon, 01 Dec 2014 18:07:48 +0100
Message-ID: <547CA064.5080106@suse.com>
In-Reply-To: <20141201161905.GH25677@wotan.suse.de>

On 12/01/2014 05:19 PM, Luis R. Rodriguez wrote:
> On Mon, Dec 01, 2014 at 03:54:24PM +0000, David Vrabel wrote:
>> On 01/12/14 15:44, Luis R. Rodriguez wrote:
>>> On Mon, Dec 1, 2014 at 10:18 AM, David Vrabel <david.vrabel@citrix.com> wrote:
>>>> On 01/12/14 15:05, Luis R. Rodriguez wrote:
>>>>> On Mon, Dec 01, 2014 at 11:11:43AM +0000, David Vrabel wrote:
>>>>>> On 27/11/14 18:36, Luis R. Rodriguez wrote:
>>>>>>> On Thu, Nov 27, 2014 at 07:36:31AM +0100, Juergen Gross wrote:
>>>>>>>> On 11/26/2014 11:26 PM, Luis R. Rodriguez wrote:
>>>>>>>>> From: "Luis R. Rodriguez" <mcgrof@suse.com>
>>>>>>>>>
>>>>>>>>> Some folks had reported that some Xen hypercalls take a long time
>>>>>>>>> to complete when issued from the userspace private ioctl mechanism.
>>>>>>>>> This can happen, for instance, with hypercalls that have many
>>>>>>>>> sub-operations, such as hypercalls that use
>>>>>> [...]
>>>>>>>>> --- a/drivers/xen/privcmd.c
>>>>>>>>> +++ b/drivers/xen/privcmd.c
>>>>>>>>> @@ -60,6 +60,9 @@ static long privcmd_ioctl_hypercall(void __user *udata)
>>>>>>>>>                               hypercall.arg[0], hypercall.arg[1],
>>>>>>>>>                               hypercall.arg[2], hypercall.arg[3],
>>>>>>>>>                               hypercall.arg[4]);
>>>>>>>>> +#ifndef CONFIG_PREEMPT
>>>>>>>>> + schedule();
>>>>>>>>> +#endif
>>>>>>
>>>>>> As Juergen points out, this does nothing.  You need to schedule while in
>>>>>> the middle of the hypercall.
>>>>>>
>>>>>> Remember that Xen's hypercall preemption only preempts the hypercall to
>>>>>> run interrupts in the guest.
>>>>>
>>>>> How is it ensured that when the kernel preempts on this code path on
>>>>> a CONFIG_PREEMPT=n kernel, only interrupts in the guest are run?
>>>>
>>>> Sorry, I really didn't describe this very well.
>>>>
>>>> If a hypercall needs a continuation, Xen returns to the guest with the
>>>> IP set to the hypercall instruction, and on the way back to the guest
>>>> Xen may schedule a different VCPU or it will do any upcalls (as per normal).
>>>>
>>>> The guest is free to return from the upcall to the original task
>>>> (continuing the hypercall) or to a different one.
>>>
>>> OK so that addresses what Xen will do when using continuation and
>>> hypercall preemption, my concern here was that using
>>> preempt_schedule_irq() on CONFIG_PREEMPT=n kernels in the middle of a
>>> hypercall on the return from an interrupt (e.g., the timer interrupt)
>>> would still let the kernel preempt to tasks other than those related
>>> to Xen.
>>
>> Um.  Why would that be a problem?  We do want to switch to any task the
>> Linux scheduler thinks is best.
>
> It's safe, but it is technically doing kernel preemption, unless we want
> to adjust the definition of CONFIG_PREEMPT=n to exclude hypercalls. This
> was my original concern with the use of preempt_schedule_irq() to do this.
> I am afraid of setting precedents without being clear about it or getting
> wider review and acceptance.

I wonder whether it would be more acceptable to add (or completely
switch to) another preemption model: PREEMPT_SWITCHABLE. This would be
similar to CONFIG_PREEMPT, but the "normal" value of __preempt_count
would be settable via kernel parameter (default 2):

0: preempt
1: preempt_voluntary
2: preempt_none

The kernel would run with preemption enabled. cond_resched() would
reschedule if __preempt_count <= 1. And in case of long running kernel
activities (like the hypercall case or other stuff requiring schedule()
calls to avoid hangups) we would just set __preempt_count to 0 during
these periods and restore the old value afterwards.
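
To make the idea concrete, here is a minimal user-space sketch of the
proposed semantics. It is only a model under stated assumptions, not
kernel code: preempt_base stands in for the "normal" value of
__preempt_count, and set_preempt_base(), preemptible(),
model_cond_resched(), run_preemptible() and fake_hypercall() are
hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode values, numbered as in the proposal. */
enum preempt_mode {
	PREEMPT_MODE_FULL = 0,		/* preempt */
	PREEMPT_MODE_VOLUNTARY = 1,	/* preempt_voluntary */
	PREEMPT_MODE_NONE = 2,		/* preempt_none */
};

/* Stand-in for the "normal" __preempt_count value; default 2 as proposed. */
static int preempt_base = PREEMPT_MODE_NONE;

/* Counts how often the model decided to reschedule. */
static unsigned int resched_count;

/* Models the kernel parameter that selects the base value. */
static void set_preempt_base(int mode)
{
	assert(mode >= PREEMPT_MODE_FULL && mode <= PREEMPT_MODE_NONE);
	preempt_base = mode;
}

/* Involuntary preemption is possible only while the count is 0. */
static bool preemptible(void)
{
	return preempt_base == 0;
}

/* Model of cond_resched(): reschedule if the count is <= 1. */
static bool model_cond_resched(void)
{
	if (preempt_base <= PREEMPT_MODE_VOLUNTARY) {
		resched_count++;
		return true;
	}
	return false;
}

/*
 * Run a long operation with the count forced to 0, then restore the
 * old value, as suggested for the hypercall case.
 */
static long run_preemptible(long (*op)(void *), void *arg)
{
	int saved = preempt_base;
	long ret;

	preempt_base = PREEMPT_MODE_FULL;
	ret = op(arg);
	preempt_base = saved;
	return ret;
}

/* Placeholder for a long-running hypercall; records what it observed. */
static long fake_hypercall(void *arg)
{
	bool *was_preemptible = arg;

	*was_preemptible = preemptible();
	return 0;
}
```

With the default base of 2, preemptible() is false and
model_cond_resched() does nothing, but a call such as
run_preemptible(fake_hypercall, &seen) runs the operation in a
preemptible window and then puts the old value back.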

This would be a rather intrusive but clean change IMO.

Any thoughts?


Juergen
