From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: paulmck@linux.vnet.ibm.com, paulus@samba.org,
Anton Blanchard <anton@samba.org>,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: irq work racing with timer interrupt can result in timer interrupt hang
Date: Sun, 11 May 2014 13:45:14 +0530
Message-ID: <536F3192.2050004@linux.vnet.ibm.com>
In-Reply-To: <1399760754.17624.21.camel@pasglop>
On 05/11/2014 03:55 AM, Benjamin Herrenschmidt wrote:
> On Sat, 2014-05-10 at 21:06 +0530, Preeti U Murthy wrote:
>> On 05/10/2014 09:56 AM, Benjamin Herrenschmidt wrote:
>>> On Fri, 2014-05-09 at 15:22 +0530, Preeti U Murthy wrote:
>>>> in __timer_interrupt() outside the _else_ loop? This will ensure that no
>>>> matter what, before exiting timer interrupt handler we check for pending
>>>> irq work.
>>>
>>> We still need to make sure that set_next_event() doesn't move the
>>> dec beyond the next tick if there is a pending timer... maybe we
>>
>> Sorry, but didn't get this. s/if there is pending timer/if there is
>> pending irq work ?
>
> Yes, sorry :-) That's what I meant.
>
>>> can fix it like this:
>>
>> We can call set_next_event() from events like hrtimer_cancel() or
>> hrtimer_forward() as well. In that case we don't come to
>> decrementer_set_next_event() from __timer_interrupt(). Then, if we race
>> with irq work, we *do not do* a set_dec(1) ( I am referring to the patch
>> below ), we might never set the decrementer to fire immediately right?
>>
>> Or does this scenario never arise?
>
> So my proposed patch handles that no ?
>
> With that patch, we do the set_dec(1) in two cases:
>
> - The existing arch_irq_work_raise() which is unchanged
>
> - At the end of __timer_interrupt() if an irq work is still pending
>
> And the patch also makes decrementer_set_next_event() not modify the
> decrementer if an irq work is pending, but *still* adjust next_tb unlike
> what the code does now.
>
> Thus the timer interrupt, when it happens, will re-adjust the dec
> properly using next_tb.
>
> Do we still miss a case ?
I was thinking of something like the below in decrementer_set_next_event().
See the last line in particular:

-	/* Don't adjust the decrementer if some irq work is pending */
-	if (test_irq_work_pending())
-		return 0;
 	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
-	set_dec(evt);

-	/* We may have raced with new irq work */
-	if (test_irq_work_pending())
-		set_dec(1);
+	/* Don't adjust the decrementer if some irq work is pending */
+	if (!test_irq_work_pending())
+		set_dec(evt);
+	else
+		set_dec(1);
		^^^^^ your patch currently does not have this explicit
set_dec(1) here. Will that create a problem? If there is any irq work
pending at this point, will someone set the decrementer to fire
immediately after this point? The current code in
decrementer_set_next_event() calls set_dec(1) explicitly when irq work
is pending.
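
To make the question concrete, here is a minimal user-space sketch of the two
variants. It is not kernel code: struct cpu_model, the model_* functions and
the starting values are hypothetical names and numbers made up for
illustration. It also assumes that the code raising irq work sets the pending
flag first and calls set_dec(1) afterwards; that ordering is an assumption of
the model, not something stated in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; these names are not kernel symbols. */
struct cpu_model {
	unsigned long tb;       /* stands in for get_tb_or_rtc() */
	unsigned long next_tb;  /* stands in for decrementers_next_tb */
	long dec;               /* stands in for the decrementer register */
	bool irq_work_pending;  /* stands in for test_irq_work_pending() */
};

/* Variant from the quoted patch: leave dec alone while irq work is pending. */
static void model_set_next_event_skip(struct cpu_model *c, unsigned long evt)
{
	c->next_tb = c->tb + evt;
	if (!c->irq_work_pending)
		c->dec = (long)evt;
}

/* Variant asked about above: force dec to 1 while irq work is pending. */
static void model_set_next_event_explicit(struct cpu_model *c,
					  unsigned long evt)
{
	c->next_tb = c->tb + evt;
	c->dec = c->irq_work_pending ? 1 : (long)evt;
}

/*
 * Replay one interleaving: the pending flag is set, set_next_event() runs
 * before the raiser has written the decrementer, then the raiser finishes
 * with its own set_dec(1) (assumed ordering, see the text above).
 * *dec_in_window receives dec right after set_next_event(); *dec_final
 * receives dec after the raiser has finished.
 */
static void model_race(bool explicit_variant, unsigned long evt,
		       long *dec_in_window, long *dec_final)
{
	struct cpu_model c = { .tb = 100, .next_tb = 0, .dec = 5000,
			       .irq_work_pending = false };

	c.irq_work_pending = true;	/* raiser: flag set first */

	if (explicit_variant)
		model_set_next_event_explicit(&c, evt);
	else
		model_set_next_event_skip(&c, evt);
	assert(c.next_tb == c.tb + evt);	/* both variants update next_tb */
	*dec_in_window = c.dec;

	c.dec = 1;			/* raiser: set_dec(1) second */
	*dec_final = c.dec;
}
```

Under these assumptions the two variants differ only inside the window: the
"skip" variant leaves the stale value (5000 in the model) until the raiser
writes 1, while the explicit variant writes 1 straight away. Both end with the
modelled decrementer at 1. Whether the real code can be left with the flag set
and no set_dec(1) to follow is exactly what I am asking.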
Regards
Preeti U Murthy
>
> Cheers,
> Ben.
>
>> Regards
>> Preeti U Murthy
>>>
>>> static int decrementer_set_next_event(unsigned long evt,
>>> 				      struct clock_event_device *dev)
>>> {
>>> 	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
>>>
>>> 	/* Don't adjust the decrementer if some irq work is pending */
>>> 	if (!test_irq_work_pending())
>>> 		set_dec(evt);
>>>
>>> 	return 0;
>>> }
>>>
>>> Along with a single occurrence of:
>>>
>>> 	if (test_irq_work_pending())
>>> 		set_dec(1);
>>>
>>> At the end of __timer_interrupt(), outside of the current else {}
>>> case, this should work, don't you think ?
>>>
>>> What about this completely untested patch ?
>>>
>>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>>> index 122a580..ba7e83b 100644
>>> --- a/arch/powerpc/kernel/time.c
>>> +++ b/arch/powerpc/kernel/time.c
>>> @@ -503,12 +503,13 @@ void __timer_interrupt(void)
>>>  		now = *next_tb - now;
>>>  		if (now <= DECREMENTER_MAX)
>>>  			set_dec((int)now);
>>> -		/* We may have raced with new irq work */
>>> -		if (test_irq_work_pending())
>>> -			set_dec(1);
>>>  		__get_cpu_var(irq_stat).timer_irqs_others++;
>>>  	}
>>>
>>> +	/* We may have raced with new irq work */
>>> +	if (test_irq_work_pending())
>>> +		set_dec(1);
>>> +
>>>  #ifdef CONFIG_PPC64
>>>  	/* collect purr register values often, for accurate calculations */
>>>  	if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
>>> @@ -813,15 +814,11 @@ static void __init clocksource_init(void)
>>>  static int decrementer_set_next_event(unsigned long evt,
>>>  				      struct clock_event_device *dev)
>>>  {
>>> -	/* Don't adjust the decrementer if some irq work is pending */
>>> -	if (test_irq_work_pending())
>>> -		return 0;
>>>  	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
>>> -	set_dec(evt);
>>>
>>> -	/* We may have raced with new irq work */
>>> -	if (test_irq_work_pending())
>>> -		set_dec(1);
>>> +	/* Don't adjust the decrementer if some irq work is pending */
>>> +	if (!test_irq_work_pending())
>>> +		set_dec(evt);
>>>
>>>  	return 0;
>>>  }
>>>
>>>
>>>
>>>
>
>
Thread overview: 15+ messages
2014-05-09 7:47 [PATCH] powerpc: irq work racing with timer interrupt can result in timer interrupt hang Anton Blanchard
2014-05-09 9:52 ` Preeti U Murthy
2014-05-10 4:26 ` Benjamin Herrenschmidt
2014-05-10 15:36 ` Preeti U Murthy
2014-05-10 22:25 ` Benjamin Herrenschmidt
2014-05-11 8:15 ` Preeti U Murthy [this message]
2014-05-11 8:37 ` Benjamin Herrenschmidt
2014-05-11 8:43 ` Preeti U Murthy
2014-05-11 9:03 ` Benjamin Herrenschmidt
2014-05-11 9:07 ` Preeti U Murthy
2014-05-09 13:41 ` Paul E. McKenney
2014-05-09 21:50 ` Gabriel Paubert
2014-05-09 22:08 ` Paul E. McKenney
2014-05-10 6:33 ` Paul Mackerras
2014-05-10 16:33 ` Paul E. McKenney