All of lore.kernel.org
 help / color / mirror / Atom feed
From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>
Cc: xen-devel <xen-devel@lists.xenproject.org>, Keir Fraser <keir@xen.org>
Subject: Re: [Patch v5 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts
Date: Fri, 22 Nov 2013 17:38:08 +0000	[thread overview]
Message-ID: <528F9680.30501@citrix.com> (raw)
In-Reply-To: <528F993B0200007800106067@nat28.tlf.novell.com>

On 22/11/13 16:49, Jan Beulich wrote:
>>>> On 22.11.13 at 17:23, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>> On 22/11/13 15:45, Jan Beulich wrote:
>>>>>> On 14.11.13 at 17:01, Andrew Cooper <andrew.cooper3@citrix.com> wrote:
>>>> +/* Wake up all cpus in the channel mask.  Lock should be held. */
>>>> +static void hpet_wake_cpus(struct hpet_event_channel *ch)
>>>>  {
>>>> -    unsigned int cpu = smp_processor_id();
>>>> +    cpuidle_wakeup_mwait(ch->cpumask);
>>>>  
>>>> -    if ( cpumask_test_and_clear_cpu(cpu, mask) )
>>>> -        raise_softirq(TIMER_SOFTIRQ);
>>>> -
>>>> -    cpuidle_wakeup_mwait(mask);
>>>> -
>>>> -    if ( !cpumask_empty(mask) )
>>>> -       cpumask_raise_softirq(mask, TIMER_SOFTIRQ);
>>>> +    if ( !cpumask_empty(ch->cpumask) )
>>>> +       cpumask_raise_softirq(ch->cpumask, TIMER_SOFTIRQ);
>>> Looks like the cpumask_empty() check isn't really needed here?
>> cpuidle_wakeup_mwait(mask) will only wake cpus who have set their bit in
>> cpuidle_mwait_flags.
>>
>> There might be cpus who have registered for waking up who have not yet
>> set their bit in cpuidle_mwait_flags.
>>
>> Out of caution, I did by best to avoid playing with the guts of the
>> timing code, as it seems quite fragile.
> Some misunderstanding? The cpumask_empty() check doesn't
> guard the call to cpuidle_wakeup_mwait(), but the one to
> cpumask_raise_softirq() (which appears to be prepared to be
> called with an empty mask).

cpuidle_wakeup_mwait() clears bits out of mask.  The result is the set
of cpus who were not set in cpuidle_mwait_flags.

>
>> I would however prefer not to manually inline hpet_setup_msi() into
>> hpet_broadcast_enter().  The compiler can do that for me, and it saves
>> making hpet_broadcast_enter() even more complicated.
> Sure, and I wasn't suggesting that.
>
>>>> +        cpumask_set_cpu(cpu, ch->cpumask);
>>>> +
>>>> +        hpet_setup_msi(ch);
>>>> +        hpet_program_time(ch, deadline, NOW(), 1);
>>>> +        hpet_msi_unmask(ch);
>>>> +
>>>> +        spin_unlock(&ch->lock);
>>>> +    }
>>>> +    else
>>>> +    {
>>>> +        /* TODO - this seems very ugly */
>>> And you nevertheless want it committed this way?
>> Probably best the comment gets dropped.
> But is _does_ seem very ugly, so the comment is true.
>
>>>> +        s_time_t fallback_deadline = STIME_MAX;
>>>> +        unsigned int i, fallback_idx = -1;
>>>> +
>>>> +        for ( i = 0; i < num_hpets_used; ++i )
>>>> +        {
>>>> +            ch = &hpet_events[i];
>>>> +            spin_lock(&ch->lock);
>>>> +
>>>> +            if ( ch->cpu == -1 )
>>>> +                goto continue_search;
>>>> +
>>>> +            /* This channel is going to expire far too early */
>>>> +            if ( ch->next_event < deadline - MICROSECS(50) )
>>>> +                goto continue_search;
>>> So this is going to make us wake early. You remember that all this
>>> code exists solely for power saving purposes? Iiuc the CPU will
>>> then basically spin entering an idle state, until its actual wakeup
>>> time is reached.
>> What would you suggest instead?  We dont really want to be waking up late.
> In fact we're always kind of waking up late, due to the processing
> time it takes to come back out of the C state. If someone heavily
> favors responsiveness (i.e. low wakeup latency) over power
> savings, (s)he should disallow the use of deep C states.
>
>>>> +            if ( ch->cpu != -1 && ch->next_event == fallback_deadline )
>>> Can't this be "<= deadline", being quite a bit more flexible?
>> This is a test for whether ch is the one I identified as the best inside
>> the loop.  There should be sufficient flexibility inside the loop.
> The thing is that with the lock dropped, ch->next_event may have
> changed. But it would still suit us if <= deadline.

Very true - I missed that.

>
>>>> +            else
>>>> +                /* All else has failed.  Wander the idle loop again */
>>>> +                this_cpu(timer_deadline) = NOW() - 1;
>>>> +
>>>> +            spin_unlock(&ch->lock);
>>>> +        }
>>>> +        else
>>>> +            /* All HPETs were too soon.  Wander the idle loop again */
>>>> +            this_cpu(timer_deadline) = NOW() - 1;
>>> Here and above - what good will this do? Is this just in the
>>> expectation that the next time round a free channel will likely be
>>> found? If so, why can't you just go back to the start of the
>>> function.
>> Or a different HPET programmed to a different time which is now
>> compatible with our wakeup timeframe.
> Are there cases where the wakeup time gets moved backwards
> (i.e. less far in the future)? I can't seem to immediately think of
> such a case.

The case I was expecting is an HPET about to fire and being relinquished
by its owner.  After an iteration of the idle loop it might be free to
nab, or a different cpu has nabbed it and programmed it for a reasonable
time into the future.

~Andrew

>
>> We cannot spin in this function, as we have interrupts disabled.
> And I was suggesting this only if the failure condition would provide
> a positive sign of there some suitable resource having become
> available.
>
> Jan

  reply	other threads:[~2013-11-22 17:38 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-11-13 17:59 [RFC v4 0/5] HPET fix interrupt logic Andrew Cooper
2013-11-13 17:59 ` [Patch v4 1/5] x86/hpet: Pre cleanup Andrew Cooper
2013-11-13 17:59 ` [Patch v4 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts Andrew Cooper
2013-11-14 15:52   ` Tim Deegan
2013-11-14 15:56     ` Andrew Cooper
2013-11-14 16:01     ` [Patch v5 " Andrew Cooper
2013-11-22 15:45       ` Jan Beulich
2013-11-22 16:23         ` Andrew Cooper
2013-11-22 16:49           ` Jan Beulich
2013-11-22 17:38             ` Andrew Cooper [this message]
2013-11-25  7:52               ` Jan Beulich
2013-11-25  7:50           ` Jan Beulich
2013-11-26 18:32             ` Andrew Cooper
2013-11-27  8:35               ` Jan Beulich
2013-11-27 22:37                 ` Andrew Cooper
2013-11-28 14:33                   ` Jan Beulich
2013-11-28 15:06                     ` Andrew Cooper
2013-11-13 17:59 ` [Patch v4 3/5] x86/hpet: Post cleanup Andrew Cooper
2013-11-13 17:59 ` [Patch v4 4/5] x86/hpet: Debug and verbose hpet logging Andrew Cooper
2013-11-13 17:59 ` [Patch v4 5/5] x86/hpet: debug keyhandlers Andrew Cooper
  -- strict thread matches above, loose matches on Subject: below --
2014-03-05 15:43 [RFC v5 0/5] HPET fix interrupt logic Andrew Cooper
2014-03-05 15:43 ` [PATCH v5 2/5] x86/hpet: Use singe apic vector rather than irq_descs for HPET interrupts Andrew Cooper
2014-03-06 14:11   ` Tim Deegan
2014-03-06 14:33   ` Jan Beulich
2014-03-06 14:40     ` Andrew Cooper
2014-03-06 15:38       ` Jan Beulich
2014-03-06 16:08   ` Jan Beulich

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=528F9680.30501@citrix.com \
    --to=andrew.cooper3@citrix.com \
    --cc=JBeulich@suse.com \
    --cc=keir@xen.org \
    --cc=xen-devel@lists.xenproject.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.