From: Paul Clarke <pc@us.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: paulmck@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: mitigate impact of decrementer reset
Date: Mon, 10 Nov 2014 14:58:04 -0600
Message-ID: <546126DC.6090909@us.ibm.com>
In-Reply-To: <1415614083.5769.18.camel@kernel.crashing.org>

On 11/10/2014 04:08 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2014-10-07 at 14:13 -0500, Paul Clarke wrote:
>> The POWER ISA defines an always-running decrementer which can be used
>> to schedule interrupts after a certain time interval has elapsed.
>> The decrementer counts down at the same frequency as the Time Base,
>> which is 512 MHz. The maximum value of the decrementer is 0x7fffffff.
>> This works out to a maximum interval of about 4.19 seconds.
>>
>> If a larger interval is desired, the kernel will set the decrementer
>> to its maximum value and reset it after it expires (underflows)
>> a sufficient number of times until the desired interval has elapsed.
>>
>> The negative effect of this is that an unwanted latency spike will
>> impact normal processing at most every 4.19 seconds. On an IBM
>> POWER8-based system, this spike was measured at about 25-30
>> microseconds, much of which was basic, opportunistic housekeeping
>> tasks that could otherwise have waited.
>>
>> This patch short-circuits the reset of the decrementer, exiting after
>> the decrementer reset, but before the housekeeping tasks if the only
>> need for the interrupt is simply to reset it. After this patch,
>> the latency spike was measured at about 150 nanoseconds.
>
> Doesn't this break the irq_work stuff ? We trigger it with a set_dec(1);
> and your patch will probably cause it to be skipped...

You're right.

I'm confused by the division between timer_interrupt() and
__timer_interrupt(). The former is called with interrupts disabled (and
enables them), but also calls irq_enter()/irq_exit(). Why are those
calls not in __timer_interrupt()? (If they were, the short-circuit
logic might be a bit easier to put directly in __timer_interrupt(),
which would eliminate any duplicate code.)

It looks like __timer_interrupt() is only called directly by the
broadcast timer IPI handler. (Why is __timer_interrupt() not static?)
Does this path not need irq_enter()/irq_exit()?

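To make the problem concrete, here is a minimal user-space sketch of the short-circuit decision with the irq_work case taken into account. It is only a model under stated assumptions, not a proposed replacement for the patch: classify(), enum timer_action, the irq_work_pending argument, and max_dec_interval_seconds() are hypothetical names invented for illustration, and TB_HZ simply encodes the 512 MHz figure from the patch description. The point is that the early return is only safe when no irq_work was raised via set_dec(1).

```c
#include <stdbool.h>
#include <stdint.h>

/* Values taken from the patch description. */
#define DECREMENTER_MAX 0x7fffffffULL
#define TB_HZ           512000000ULL    /* 512 MHz Time Base */

/* Hypothetical result type: which path the interrupt should take. */
enum timer_action {
	TIMER_REARM_ONLY,   /* plain underflow: reprogram and return early */
	TIMER_FULL_PATH     /* a timer is due, or irq_work was raised */
};

/*
 * Hypothetical model of the short-circuit test.
 *
 * now, next_tb:     Time Base values, as in the patch.
 * irq_work_pending: models irq_work having been raised with set_dec(1);
 *                   this is the case the posted patch skips by mistake.
 * new_dec:          value to program when only re-arming is needed.
 */
static enum timer_action classify(uint64_t now, uint64_t next_tb,
				  bool irq_work_pending, uint32_t *new_dec)
{
	uint64_t delta;

	/* Never short-circuit if irq_work is waiting or the timer is due. */
	if (irq_work_pending || now >= next_tb)
		return TIMER_FULL_PATH;

	/*
	 * Only an underflow. The patch leaves the decrementer at its
	 * maximum when the remaining interval is too large; the model
	 * reports that maximum explicitly instead.
	 */
	delta = next_tb - now;
	*new_dec = (uint32_t)(delta <= DECREMENTER_MAX ? delta : DECREMENTER_MAX);
	return TIMER_REARM_ONLY;
}

/* Longest interval one decrementer load can cover, in seconds. */
static double max_dec_interval_seconds(void)
{
	return (double)DECREMENTER_MAX / (double)TB_HZ;
}
```

With these numbers max_dec_interval_seconds() comes out just under 4.2 seconds, which matches the 4.19 second figure quoted above. How the pending state would actually be tested in timer_interrupt() is left open here.
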
>> Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
>> ---
>> arch/powerpc/kernel/time.c | 13 +++++++++++++
>> 1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>> index 368ab37..962a06b 100644
>> --- a/arch/powerpc/kernel/time.c
>> +++ b/arch/powerpc/kernel/time.c
>> @@ -528,6 +528,7 @@ void timer_interrupt(struct pt_regs * regs)
>>  {
>>          struct pt_regs *old_regs;
>>          u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
>> +        u64 now;
>>
>>          /* Ensure a positive value is written to the decrementer, or else
>>           * some CPUs will continue to take decrementer exceptions.
>> @@ -550,6 +551,18 @@ void timer_interrupt(struct pt_regs * regs)
>>           */
>>          may_hard_irq_enable();
>>
>> +        /* If this is simply the decrementer expiring (underflow) due to
>> +         * the limited size of the decrementer, and not a set timer,
>> +         * reset (if needed) and return
>> +         */
>> +        now = get_tb_or_rtc();
>> +        if (now < *next_tb) {
>> +                now = *next_tb - now;
>> +                if (now <= DECREMENTER_MAX)
>> +                        set_dec((int)now);
>> +                __get_cpu_var(irq_stat).timer_irqs_others++;
>> +                return;
>> +        }
>>
>>  #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
>>          if (atomic_read(&ppc_n_lost_interrupts) != 0)

Regards,
PC