From: Paul Clarke <pc@us.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: paulmck@linux.vnet.ibm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: mitigate impact of decrementer reset
Date: Mon, 10 Nov 2014 14:58:04 -0600
Message-ID: <546126DC.6090909@us.ibm.com>
In-Reply-To: <1415614083.5769.18.camel@kernel.crashing.org>

On 11/10/2014 04:08 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2014-10-07 at 14:13 -0500, Paul Clarke wrote:
>> The POWER ISA defines an always-running decrementer which can be used
>> to schedule interrupts after a certain time interval has elapsed.
>> The decrementer counts down at the same frequency as the Time Base,
>> which is 512 MHz. The maximum value of the decrementer is 0x7fffffff.
>> This works out to a maximum interval of about 4.19 seconds.
>>
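
For reference, the 4.19 second figure follows directly from the two
numbers quoted above. A minimal user-space sketch of the arithmetic,
assuming the 512 MHz Time Base frequency and the 0x7fffffff maximum
stated in the description (the macro TB_FREQ_HZ and the helper name
max_dec_interval_ns are made up for illustration and are not kernel
symbols):

```c
#include <stdint.h>

/* Both constants are taken from the patch description; TB_FREQ_HZ and
 * the helper name below are illustrative, not kernel identifiers. */
#define DECREMENTER_MAX 0x7fffffffULL
#define TB_FREQ_HZ      512000000ULL

/* Longest interval a single decrementer load can cover, in nanoseconds. */
static uint64_t max_dec_interval_ns(void)
{
	/* 0x7fffffff * 1e9 is about 2.1e18, which still fits in 64 bits. */
	return DECREMENTER_MAX * 1000000000ULL / TB_FREQ_HZ;
}
```

With those inputs the helper returns 4194303998 ns, i.e. about 4.19 s.
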
>> If a larger interval is desired, the kernel will set the decrementer
>> to its maximum value and reset it after it expires (underflows)
>> a sufficient number of times until the desired interval has elapsed.
>>
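
The reload behaviour described above can be modelled in user space
roughly as follows. This is only a sketch of the idea, not the kernel
code: next_dec_load and expirations_for are hypothetical helpers, and
the interval is expressed in Time Base ticks.

```c
#include <stdint.h>

#define DECREMENTER_MAX 0x7fffffffULL

/* Value to load into the decrementer when 'remaining' ticks are left:
 * the remainder itself if it fits, otherwise the maximum. */
static uint32_t next_dec_load(uint64_t remaining)
{
	if (remaining > DECREMENTER_MAX)
		return (uint32_t)DECREMENTER_MAX;
	return (uint32_t)remaining;
}

/* Number of decrementer expirations taken before an interval of
 * 'ticks' Time Base ticks has fully elapsed. */
static unsigned int expirations_for(uint64_t ticks)
{
	unsigned int n = 0;

	while (ticks > 0) {
		ticks -= next_dec_load(ticks);
		n++;
	}
	return n;
}
```

Under the 512 MHz assumption, a 10 second interval is 5120000000 ticks
and takes three expirations, the first two of which exist only to
reload the decrementer.
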
>> The negative effect of this is that an unwanted latency spike will
>> impact normal processing at most every 4.19 seconds. On an IBM
>> POWER8-based system, this spike was measured at about 25-30
>> microseconds, much of which was basic, opportunistic housekeeping
>> tasks that could otherwise have waited.
>>
>> This patch short-circuits the reset of the decrementer, exiting after
>> the decrementer reset, but before the housekeeping tasks if the only
>> need for the interrupt is simply to reset it. After this patch,
>> the latency spike was measured at about 150 nanoseconds.
>
> Doesn't this break the irq_work stuff ? We trigger it with a set_dec(1);
> and your patch will probably cause it to be skipped...

You're right.

I'm confused by the division between timer_interrupt() and
__timer_interrupt(). The former is called with interrupts disabled (and
enables them), but also calls irq_enter()/irq_exit(). Why are those
calls not in __timer_interrupt()? (If they were, the short-circuit
logic might be a bit easier to put directly in __timer_interrupt(),
which would eliminate any duplicate code.)

It looks like __timer_interrupt is only called directly by the broadcast
timer IPI handler. (Why is __timer_interrupt not static?) Does this
path not need irq_enter/irq_exit?
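
To make the irq_work concern concrete, here is a small user-space model
of the decision the short-circuit would have to make. It is a sketch
under assumptions, not a proposed patch: the irq_work_pending flag is a
hypothetical stand-in for however the kernel tracks pending irq_work,
and decide(), struct dec_decision and enum dec_action are invented for
illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define DECREMENTER_MAX 0x7fffffffULL

enum dec_action {
	DEC_RUN_HANDLERS,	/* take the full timer interrupt path */
	DEC_REARM_ONLY,		/* reload the decrementer and return early */
};

struct dec_decision {
	enum dec_action action;
	uint64_t reload;	/* decrementer value, valid for DEC_REARM_ONLY */
};

/* Short-circuit only when no timer is due AND no irq_work is pending.
 * 'irq_work_pending' is a hypothetical input for this model. */
static struct dec_decision decide(uint64_t now, uint64_t next_tb,
				  bool irq_work_pending)
{
	struct dec_decision d = { DEC_RUN_HANDLERS, 0 };
	uint64_t remaining;

	if (irq_work_pending || now >= next_tb)
		return d;

	remaining = next_tb - now;
	d.action = DEC_REARM_ONLY;
	d.reload = remaining <= DECREMENTER_MAX ? remaining : DECREMENTER_MAX;
	return d;
}
```

The patch as posted corresponds to calling decide() with
irq_work_pending always false, which is why a set_dec(1) issued for
irq_work could be treated as a plain underflow and skipped.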

>> Signed-off-by: Paul A. Clarke <pc@us.ibm.com>
>> ---
>>  arch/powerpc/kernel/time.c | 13 +++++++++++++
>>  1 file changed, 13 insertions(+)
>>
>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>> index 368ab37..962a06b 100644
>> --- a/arch/powerpc/kernel/time.c
>> +++ b/arch/powerpc/kernel/time.c
>> @@ -528,6 +528,7 @@ void timer_interrupt(struct pt_regs * regs)
>>  {
>>  	struct pt_regs *old_regs;
>>  	u64 *next_tb = &__get_cpu_var(decrementers_next_tb);
>> +	u64 now;
>>
>>  	/* Ensure a positive value is written to the decrementer, or else
>>  	 * some CPUs will continue to take decrementer exceptions.
>> @@ -550,6 +551,18 @@ void timer_interrupt(struct pt_regs * regs)
>>  	 */
>>  	may_hard_irq_enable();
>>
>> +	/* If this is simply the decrementer expiring (underflow) due to
>> +	 * the limited size of the decrementer, and not a set timer,
>> +	 * reset (if needed) and return
>> +	 */
>> +	now = get_tb_or_rtc();
>> +	if (now < *next_tb) {
>> +		now = *next_tb - now;
>> +		if (now <= DECREMENTER_MAX)
>> +			set_dec((int)now);
>> +		__get_cpu_var(irq_stat).timer_irqs_others++;
>> +		return;
>> +	}
>>
>>  #if defined(CONFIG_PPC32) && defined(CONFIG_PPC_PMAC)
>>  	if (atomic_read(&ppc_n_lost_interrupts) != 0)

Regards,
PC