Re: [Patch v5 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts

xen-devel.lists.xenproject.org archive mirror
 help / color / mirror / Atom feed

From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>
Subject: Re: [Patch v5 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts
Date: Fri, 22 Nov 2013 16:23:24 +0000	[thread overview]
Message-ID: <528F84FC.8000001@citrix.com> (raw)
In-Reply-To: <528F8A1A0200007800105F6E@nat28.tlf.novell.com>

On 22/11/13 15:45, Jan Beulich wrote:
>>>> On 14.11.13 at 17:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> The new logic is as follows:
>>  * A single high priority vector is allocated and uses on all cpus.
> Does this really need to be a high priority one? I'd think we'd be
> fine with the lowest priority one we can get, as we only need the
> wakeup here if nothing else gets a CPU to wake up.

Yes - absolutely.  We cannot have an HPET interrupt lower priority than
a guest line level interrupt.

Another cpu could be registered with our HPET channel to be worken up,
and we need to service them in a timely fashon.

>
>> +/* Wake up all cpus in the channel mask.  Lock should be held. */
>> +static void hpet_wake_cpus(struct hpet_event_channel *ch)
>>  {
>> -    unsigned int cpu = smp_processor_id();
>> +    cpuidle_wakeup_mwait(ch->cpumask);
>>  
>> -    if ( cpumask_test_and_clear_cpu(cpu, mask) )
>> -        raise_softirq(TIMER_SOFTIRQ);
>> -
>> -    cpuidle_wakeup_mwait(mask);
>> -
>> -    if ( !cpumask_empty(mask) )
>> -       cpumask_raise_softirq(mask, TIMER_SOFTIRQ);
>> +    if ( !cpumask_empty(ch->cpumask) )
>> +       cpumask_raise_softirq(ch->cpumask, TIMER_SOFTIRQ);
> Looks like the cpumask_empty() check isn't really needed here?

cpuidle_wakeup_mwait(mask) will only wake cpus who have set their bit in
cpuidle_mwait_flags.

There might be cpus who have registered for waking up who have not yet
set their bit in cpuidle_mwait_flags.

Out of caution, I did by best to avoid playing with the guts of the
timing code, as it seems quite fragile.

>
>> +/* HPET interrupt entry.  This is set up as a high priority vector. */
>> +static void do_hpet_irq(struct cpu_user_regs *regs)
>>  {
>> -    struct hpet_event_channel *ch = (struct hpet_event_channel *)data;
>> -
>> -    this_cpu(irq_count)--;
>> +    unsigned int cpu = smp_processor_id();
> This is being used just once, and hence rather pointless to have.

Fair point - this was left over from a previous version of the function
which did use cpu twice.

>
>> -    desc->handler = &hpet_msi_type;
>> -    ret = request_irq(ch->msi.irq, hpet_interrupt_handler, "HPET", ch);
>> -    if ( ret >= 0 )
>> -        ret = __hpet_setup_msi_irq(desc);
>> +    ret = hpet_setup_msi(ch);
> Why did you keep this? With that function now being called from
> hpet_broadcast_enter() I don't why you'd need this here (just
> like you're removing it from hpet_broadcast_resume() without
> replacement).

Because I optimised functions in the wrong order to obviously spot
this.  As we leave the MSI masked, this call to setup can be dropped.

I would however prefer not to manually inline hpet_setup_msi() into
hpet_broadcast_enter().  The compiler can do that for me, and it saves
making hpet_broadcast_enter() even more complicated.

>
>> @@ -574,6 +384,7 @@ void __init hpet_broadcast_init(void)
>>          /* Stop HPET legacy interrupts */
>>          cfg &= ~HPET_CFG_LEGACY;
>>          n = num_hpets_used;
>> +        free_channels = (1U << n) - 1;
> This is undefined when n == 32. Since n > 0, I'd suggest
>
>         free_channels = (u32)~0 >> (32 - n);

Ok

>
>> +    ch = hpet_get_free_channel();
>> +
>> +    if ( ch )
>> +    {
>> +        spin_lock(&ch->lock);
>> +
>> +        /* This really should be an MSI channel by this point */
>> +        ASSERT(!(ch->flags & HPET_EVT_LEGACY));
>> +
>> +        hpet_msi_mask(ch);
>> +
>> +        this_cpu(hpet_channel) = ch;
>> +        ch->cpu = cpu;
> I think even if benign, you should set ->cpu before storing into
> hpet_channel.

Ok

>
>> +        cpumask_set_cpu(cpu, ch->cpumask);
>> +
>> +        hpet_setup_msi(ch);
>> +        hpet_program_time(ch, deadline, NOW(), 1);
>> +        hpet_msi_unmask(ch);
>> +
>> +        spin_unlock(&ch->lock);
>> +    }
>> +    else
>> +    {
>> +        /* TODO - this seems very ugly */
> And you nevertheless want it committed this way?

Probably best the comment gets dropped.

>
>> +        s_time_t fallback_deadline = STIME_MAX;
>> +        unsigned int i, fallback_idx = -1;
>> +
>> +        for ( i = 0; i < num_hpets_used; ++i )
>> +        {
>> +            ch = &hpet_events[i];
>> +            spin_lock(&ch->lock);
>> +
>> +            if ( ch->cpu == -1 )
>> +                goto continue_search;
>> +
>> +            /* This channel is going to expire far too early */
>> +            if ( ch->next_event < deadline - MICROSECS(50) )
>> +                goto continue_search;
> So this is going to make us wake early. You remember that all this
> code exists solely for power saving purposes? Iiuc the CPU will
> then basically spin entering an idle state, until its actual wakeup
> time is reached.

What would you suggest instead?  We dont really want to be waking up late.

>
>> +
>> +            /* We can deal with being woken with 50us to spare */
>> +            if ( ch->next_event <= deadline )
>> +                break;
>> +
>> +            /* Otherwise record our best HPET to borrow. */
>> +            if ( ch->next_event <= fallback_deadline )
>> +            {
>> +                fallback_idx = i;
>> +                fallback_deadline = ch->next_event;
>> +            }
>> +
>> +        continue_search:
>> +            spin_unlock(&ch->lock);
>> +            ch = NULL;
>> +        }
>> +
>> +        if ( ch )
>> +        {
>> +            /* Found HPET with an appropriate time.  Request to be woken up */
>> +            cpumask_set_cpu(cpu, ch->cpumask);
>> +            this_cpu(hpet_channel) = ch;
>> +            spin_unlock(&ch->lock);
>> +        }
>> +        else if ( fallback_deadline < STIME_MAX && fallback_deadline != -1 )
>> +        {
>> +            /*
>> +             * Else we want to reprogram the fallback HPET sooner if possible,
>> +             * but with all the spinlocking, it just might have passed.
>> +             */
>> +            ch = &hpet_events[fallback_idx];
>> +
>> +            spin_lock(&ch->lock);
>>  
>> -    if ( !(ch->flags & HPET_EVT_LEGACY) )
>> -        hpet_attach_channel(cpu, ch);
>> +            if ( ch->cpu != -1 && ch->next_event == fallback_deadline )
> Can't this be "<= deadline", being quite a bit more flexible?

This is a test for whether ch is the one I identified as the best inside
the loop.  There should be sufficient flexibility inside the loop.

>
>> +            {
>> +                cpumask_set_cpu(cpu, ch->cpumask);
>> +                hpet_program_time(ch, deadline, NOW(), 1);
>> +            }
>> +            else
>> +                /* All else has failed.  Wander the idle loop again */
>> +                this_cpu(timer_deadline) = NOW() - 1;
>> +
>> +            spin_unlock(&ch->lock);
>> +        }
>> +        else
>> +            /* All HPETs were too soon.  Wander the idle loop again */
>> +            this_cpu(timer_deadline) = NOW() - 1;
> Here and above - what good will this do? Is this just in the
> expectation that the next time round a free channel will likely be
> found? If so, why can't you just go back to the start of the
> function.

Or a different HPET programmed to a different time which is now
compatible with our wakeup timeframe.

We cannot spin in this function, as we have interrupts disabled.

>
> Further - how do you see this going back to the idle loop? The
> way this is called from acpi/cpu_idle.c, you'll end up entering
> C3 anyway, with nothing going to wake you up. By proceeding to
> the end of the function, you even manage to disable the LAPIC
> timer.

Hmm - I will need to revisit this logic then.

>
>> +    /* If we own the channel, detach it */
>> +    if ( ch->cpu == cpu )
>> +    {
>> +        hpet_msi_mask(ch);
>> +        hpet_wake_cpus(ch);
>> +        ch->cpu = -1;
>> +        set_bit(ch->idx, &free_channels);
> Wouldn't there be less cache line bouncing if the "free" flag was just
> one of the channel flags?
>
> Jan

Yes, but then finding a free channel would involve searching each
hpet_channel rather than searching free_channels with ffs().

~Andrew

next prev parent reply	other threads:[~2013-11-22 16:23 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-13 17:59 [RFC v4 0/5] HPET fix interrupt logic Andrew Cooper
2013-11-13 17:59 ` [Patch v4 1/5] x86/hpet: Pre cleanup Andrew Cooper
2013-11-13 17:59 ` [Patch v4 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts Andrew Cooper
2013-11-14 15:52   ` Tim Deegan
2013-11-14 15:56     ` Andrew Cooper
2013-11-14 16:01     ` [Patch v5 " Andrew Cooper
2013-11-22 15:45       ` Jan Beulich
2013-11-22 16:23         ` Andrew Cooper [this message]
2013-11-22 16:49           ` Jan Beulich
2013-11-22 17:38             ` Andrew Cooper
2013-11-25  7:52               ` Jan Beulich
2013-11-25  7:50           ` Jan Beulich
2013-11-26 18:32             ` Andrew Cooper
2013-11-27  8:35               ` Jan Beulich
2013-11-27 22:37                 ` Andrew Cooper
2013-11-28 14:33                   ` Jan Beulich
2013-11-28 15:06                     ` Andrew Cooper
2013-11-13 17:59 ` [Patch v4 3/5] x86/hpet: Post cleanup Andrew Cooper
2013-11-13 17:59 ` [Patch v4 4/5] x86/hpet: Debug and verbose hpet logging Andrew Cooper
2013-11-13 17:59 ` [Patch v4 5/5] x86/hpet: debug keyhandlers Andrew Cooper
  -- strict thread matches above, loose matches on Subject: below --
2014-03-05 15:43 [RFC v5 0/5] HPET fix interrupt logic Andrew Cooper
2014-03-05 15:43 ` [PATCH v5 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts Andrew Cooper
2014-03-06 14:11   ` Tim Deegan
2014-03-06 14:33   ` Jan Beulich
2014-03-06 14:40     ` Andrew Cooper
2014-03-06 15:38       ` Jan Beulich
2014-03-06 16:08   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=528F84FC.8000001@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).