From mboxrd@z Thu Jan  1 00:00:00 1970
From: David Vrabel
Subject: Re: POD: soft lockups in dom0 kernel
Date: Fri, 6 Dec 2013 12:00:02 +0000
Message-ID: <52A1BC42.5040706@citrix.com>
References: <1538524.5AKIkpF9LB@amur> <52A1AE3E020000780010AC8E@nat28.tlf.novell.com> <52A1AFEB.3050308@citrix.com> <52A1C355020000780010AD40@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Received: from mail6.bemta3.messagelabs.com ([195.245.230.39]) by lists.xen.org with esmtp (Exim 4.72) (envelope-from ) id 1Vou4l-0005Od-Lf for xen-devel@lists.xenproject.org; Fri, 06 Dec 2013 12:00:07 +0000
In-Reply-To: <52A1C355020000780010AD40@nat28.tlf.novell.com>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich
Cc: xen-devel@lists.xenproject.org, Boris Ostrovsky, Dietmar Hahn
List-Id: xen-devel@lists.xenproject.org

On 06/12/13 11:30, Jan Beulich wrote:
>>>> On 06.12.13 at 12:07, David Vrabel wrote:
>> We do not want to disable the soft lockup detection here as it has found
>> a bug. We can't have tasks that are unschedulable for minutes as it
>> would only take a handful of such tasks to hose the system.
>
> My understanding is that the soft lockup detection is what its name
> says - a mechanism to find cases where the kernel software locked
> up. Yet that's not the case with long running hypercalls.

Well, OK, it's not a lockup in the kernel, but it is still a task that
cannot be descheduled for minutes of wallclock time. That is still a
bug that needs to be fixed.

>> We should put an explicit preemption point in. This will fix it for the
>> CONFIG_PREEMPT_VOLUNTARY case, which I think is the most common
>> configuration. Or perhaps this should even be a cond_resched() call to
>> fix it for fully non-preemptible kernels as well.
>
> How do you imagine to do this?
> When the hypervisor preempts a hypercall, all the kernel gets to see
> is that it drops back into the hypercall page, such that the next
> thing to happen would be re-execution of the hypercall. You can't
> call anything at that point; all that can get run here are interrupts
> (i.e. event upcalls). Or do you suggest to call cond_resched() from
> within __xen_evtchn_do_upcall()?

I've not yet looked at how this would be done.

> And even if you do - how certain is it that what gets its continuation
> deferred won't interfere with other things the kernel wants to do
> (since if you'd be doing it that way, you'd cover all hypercalls at
> once, not just those coming through privcmd, and hence you could
> end up with partially completed multicalls or other forms of batching,
> plus you'd need to deal with possibly active lazy modes).

I would only do this for hypercalls issued by the privcmd driver.

David