From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Cooper <andrew.cooper3@citrix.com>
Subject: Re: [Patch v5 2/5] x86/hpet: Use singe apic vector
 rather than irq_descs for HPET interrupts
Date: Tue, 26 Nov 2013 18:32:04 +0000
Message-ID: <5294E924.7060903@citrix.com>
References: <20131114155203.GD42238@deinos.phlegethon.org>
	<1384444914-10215-1-git-send-email-andrew.cooper3@citrix.com>
	<528F8A1A0200007800105F6E@nat28.tlf.novell.com>
	<528F84FC.8000001@citrix.com>
	<52930F5B020000780010654D@nat28.tlf.novell.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Return-path: <xen-devel-bounces@lists.xen.org>
Received: from mail6.bemta4.messagelabs.com ([85.158.143.247])
	by lists.xen.org with esmtp (Exim 4.72)
	(envelope-from <Andrew.Cooper3@citrix.com>) id 1VlNQh-0002l1-Tk
	for xen-devel@lists.xenproject.org; Tue, 26 Nov 2013 18:32:12 +0000
In-Reply-To: <52930F5B020000780010654D@nat28.tlf.novell.com>
List-Unsubscribe: <http://lists.xen.org/cgi-bin/mailman/options/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=unsubscribe>
List-Post: <mailto:xen-devel@lists.xen.org>
List-Help: <mailto:xen-devel-request@lists.xen.org?subject=help>
List-Subscribe: <http://lists.xen.org/cgi-bin/mailman/listinfo/xen-devel>,
	<mailto:xen-devel-request@lists.xen.org?subject=subscribe>
Sender: xen-devel-bounces@lists.xen.org
Errors-To: xen-devel-bounces@lists.xen.org
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>
List-Id: xen-devel@lists.xenproject.org

On 25/11/13 07:50, Jan Beulich wrote:
>>>> On 22.11.13 at 17:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 22/11/13 15:45, Jan Beulich wrote:
>>>>>> On 14.11.13 at 17:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> The new logic is as follows:
>>>>  * A single high priority vector is allocated and used on all cpus.
>>> Does this really need to be a high priority one? I'd think we'd be
>>> fine with the lowest priority one we can get, as we only need the
>>> wakeup here if nothing else gets a CPU to wake up.
>> Yes - absolutely.  We cannot have an HPET interrupt lower priority than
>> a guest line level interrupt.
>>
>> Another cpu could be registered with our HPET channel to be woken up,
>> and we need to service them in a timely fashion.
> Which I meanwhile think hints at an issue with the (re)design:
> These wakeups, from an abstract pov, shouldn't be high
> priority interrupts - they're meant to wake a CPU only when
> nothing else would wake them in time. And this could be
> accomplished by transferring ownership of the channel during
> wakeup from the waking CPU to the next one to wake.
>
> Which at once would eliminate the bogus logic selecting a channel
> for a CPU to re-use when no free one is available: It then wouldn't
> really matter which one gets re-used (i.e. could be assigned in e.g.
> a round robin fashion).
>
> The fundamental requirement would be to run the wakeup (in
> particular channel re-assignment) logic not just from the HPET
> interrupt, but inside an exit_idle() construct called from all IRQ
> paths (similar to how Linux does this).
>
> Jan
>

Irrespective of the problem of ownership, the HPET interrupt still needs
to be high priority.  Consider the following scenario:

* A line level interrupt is received on pcpu 0.  It is left outstanding
at the LAPIC.
* A domain is scheduled on pcpu 0, and has an event injected for the
line level interrupt.
* The event handler takes a long time, and during the process, the
domain's vcpu is rescheduled elsewhere.
* pcpu 0 is now completely idle and goes to sleep.

This scenario has pcpu 0 going to sleep with an outstanding line level
irq unacked at the LAPIC, with a low priority HPET interrupt blocked
until the domain has signalled the completion of the event.
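
The blocking described above follows from the LAPIC priority rule. As a
minimal sketch in C (a simplified model, not Xen code): a vector's
priority class is its upper four bits, and a pending vector is delivered
only if its class is strictly higher than the class of the highest
in-service vector. The function names are hypothetical, the task
priority register is assumed to be 0, and the un-EOI'd line level
interrupt is modelled as the in-service vector.

```c
#include <stdbool.h>
#include <stdint.h>

/* Priority class of an interrupt vector: its upper four bits. */
static unsigned int prio_class(uint8_t vector)
{
    return vector >> 4;
}

/*
 * Simplified LAPIC delivery check (assumes TPR == 0): a pending vector
 * can be delivered only if its priority class is strictly above the
 * class of the highest vector still in service (not yet EOI'd).
 */
static bool can_deliver(uint8_t pending, uint8_t in_service)
{
    return prio_class(pending) > prio_class(in_service);
}
```

With illustrative vector numbers (not Xen's actual assignments), a guest
line level interrupt left in service at 0x71 blocks an HPET vector of
0x31, so can_deliver(0x31, 0x71) is false and the sleeping pcpu is not
woken. An HPET vector of 0xf4 is in a higher class, so
can_deliver(0xf4, 0x71) is true.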

There is no safe scenario (given Xen's handling of line level interrupts)
for timer interrupts to be lower priority than the highest possible line
level priority.

~Andrew