xen-devel.lists.xenproject.org archive mirror
From: Boris Ostrovsky <boris.ostrovsky@oracle.com>
To: David Vrabel <david.vrabel@citrix.com>
Cc: xen-devel@lists.xenproject.org,
	Dietmar Hahn <dietmar.hahn@ts.fujitsu.com>,
	Jan Beulich <JBeulich@suse.com>
Subject: Re: POD: soft lockups in dom0 kernel
Date: Fri, 06 Dec 2013 09:50:17 -0500
Message-ID: <52A1E429.1050902@oracle.com>
In-Reply-To: <52A1BC42.5040706@citrix.com>

On 12/06/2013 07:00 AM, David Vrabel wrote:
> On 06/12/13 11:30, Jan Beulich wrote:
>>>>> On 06.12.13 at 12:07, David Vrabel <david.vrabel@citrix.com> wrote:
>>> We do not want to disable the soft lockup detection here as it has found
>>> a bug.  We can't have tasks that are unschedulable for minutes as it
>>> would only take a handful of such tasks to hose the system.
>> My understanding is that the soft lockup detection is what its name
>> says - a mechanism to find cases where the kernel software locked
>> up. Yet that's not the case with long running hypercalls.
> Well ok, it's not a lockup in the kernel but it's still a task that
> cannot be descheduled for minutes of wallclock time.  This is still a
> bug that needs to be fixed.
>
>>> We should put an explicit preemption point in.  This will fix it for the
>>> CONFIG_PREEMPT_VOLUNTARY case which I think is the most common
>>> configuration.  Or perhaps this should even be a cond_resched() call to
>>> fix it for fully non-preemptible as well.
>> How do you imagine to do this? When the hypervisor preempts a
>> hypercall, all the kernel gets to see is that it drops back into the
>> hypercall page, such that the next thing to happen would be
>> re-execution of the hypercall. You can't call anything at that point,
>> all that can get run here are interrupts (i.e. event upcalls). Or do
>> you suggest to call cond_resched() from within
>> __xen_evtchn_do_upcall()?
> I've not looked at how.

KVM has a hook (kvm_check_and_clear_guest_paused()) into the watchdog code
to prevent false positives (for a different reason, though). If we claim
that the soft lockup mechanism exists only to detect Linux kernel problems
and not long-running hypervisor code, then perhaps we can make this hook a
bit more generic.

We would still need to think about what may happen if we are stuck in
the hypervisor for an abnormally long time. Maybe this Xen hook can still
return false when such cases are detected.
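
To make the idea concrete, here is a minimal user-space sketch of what a
more generic hook might look like. It is only a model, not kernel code:
every name in it (cpu_state, hyp_check_and_clear_stall(), is_softlockup())
and both thresholds are assumptions made up for illustration, and it does
not reproduce how the real watchdog or the KVM hook is implemented. The
hook excuses a stall that is explained by time spent in the hypervisor,
but returns false once that time exceeds an "abnormally long" bound, so
the lockup is still reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NSEC_PER_SEC          1000000000ULL
/* Assumed values, chosen only for this sketch. */
#define SOFTLOCKUP_THRESH_NS  (20ULL * NSEC_PER_SEC)   /* watchdog threshold */
#define HYP_STALL_LIMIT_NS    (120ULL * NSEC_PER_SEC)  /* "abnormally long" bound */

/* Hypothetical per-CPU bookkeeping. */
struct cpu_state {
    uint64_t last_touch_ns; /* last time the watchdog saw progress */
    uint64_t hyp_time_ns;   /* time spent in the hypervisor since the last check */
};

/*
 * Hypothetical generic hook: returns true if the stall is explained by
 * hypervisor time and should be excused. The recorded time is cleared on
 * every call. Returns false when no hypervisor time was recorded or when
 * it exceeds HYP_STALL_LIMIT_NS.
 */
static bool hyp_check_and_clear_stall(struct cpu_state *cpu)
{
    assert(cpu != NULL);

    uint64_t spent = cpu->hyp_time_ns;
    cpu->hyp_time_ns = 0;

    if (spent == 0)
        return false;   /* nothing to excuse */
    if (spent > HYP_STALL_LIMIT_NS)
        return false;   /* stuck for too long: still report */
    return true;
}

/* Model of the watchdog check: true means a soft lockup would be reported. */
static bool is_softlockup(struct cpu_state *cpu, uint64_t now_ns)
{
    if (now_ns - cpu->last_touch_ns <= SOFTLOCKUP_THRESH_NS)
        return false;

    if (hyp_check_and_clear_stall(cpu)) {
        cpu->last_touch_ns = now_ns;    /* restart the measurement window */
        return false;
    }
    return true;
}
```

With these assumed thresholds, a 30 second stall that includes 25 seconds
of recorded hypervisor time is excused, while a stall with 300 seconds of
hypervisor time is still reported.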

-boris



>
>> And even if you do - how certain is it that what gets its continuation
>> deferred won't interfere with other things the kernel wants to do
>> (since if you'd be doing it that way, you'd cover all hypercalls at
>> once, not just those coming through privcmd, and hence you could
>> end up with partially completed multicalls or other forms of batching,
>> plus you'd need to deal with possibly active lazy modes).
> I would only do this for hypercalls issued by the privcmd driver.
>
> David

Thread overview: 12+ messages
2013-12-05 13:55 POD: soft lockups in dom0 kernel Dietmar Hahn
2013-12-06 10:00 ` Jan Beulich
2013-12-06 11:07   ` David Vrabel
2013-12-06 11:30     ` Jan Beulich
2013-12-06 12:00       ` David Vrabel
2013-12-06 13:52         ` Dietmar Hahn
2013-12-06 14:58           ` David Vrabel
2013-12-06 14:50         ` Boris Ostrovsky [this message]
2014-01-16 11:10 ` Jan Beulich
2014-01-20 14:39   ` Andrew Cooper
2014-01-20 15:16     ` Jan Beulich
2014-01-29 14:12   ` Dietmar Hahn
