From: Preeti U Murthy <preeti@linux.vnet.ibm.com>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: paulmck@linux.vnet.ibm.com, paulus@samba.org,
	Anton Blanchard <anton@samba.org>,
	linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH] powerpc: irq work racing with timer interrupt can result in timer interrupt hang
Date: Sun, 11 May 2014 13:45:14 +0530
Message-ID: <536F3192.2050004@linux.vnet.ibm.com>
In-Reply-To: <1399760754.17624.21.camel@pasglop>

On 05/11/2014 03:55 AM, Benjamin Herrenschmidt wrote:
> On Sat, 2014-05-10 at 21:06 +0530, Preeti U Murthy wrote:
>> On 05/10/2014 09:56 AM, Benjamin Herrenschmidt wrote:
>>> On Fri, 2014-05-09 at 15:22 +0530, Preeti U Murthy wrote:
>>>> in __timer_interrupt() outside the _else_ loop? This will ensure that no
>>>> matter what, before exiting timer interrupt handler we check for pending
>>>> irq work.
>>>
>>> We still need to make sure that set_next_event() doesn't move the
>>> dec beyond the next tick if there is a pending timer... maybe we
>>
>> Sorry, but didn't get this. s/if there is pending timer/if there is
>> pending irq work ?
> 
> Yes, sorry :-) That's what I meant.
> 
>>> can fix it like this:
>>
>> We can call set_next_event() from events like hrtimer_cancel() or
>> hrtimer_forward() as well. In that case we don't come to
>> decrementer_set_next_event() from __timer_interrupt(). Then, if we race
>> with irq work, we *do not do* a set_dec(1) ( I am referring to the patch
>> below ), we might never set the decrementer to fire immediately right?
>>
>> Or does this scenario never arise?
> 
> So my proposed patch handles that no ?
> 
> With that patch, we do the set_dec(1) in two cases:
> 
>  - The existing arch_irq_work_raise() which is unchanged
> 
>  - At the end of __timer_interrupt() if an irq work is still pending
> 
> And the patch also makes decrementer_set_next_event() not modify the
> decrementer if an irq work is pending, but *still* adjust next_tb unlike
> what the code does now.
> 
> Thus the timer interrupt, when it happens, will re-adjust the dec
> properly using next_tb.
> 
> Do we still miss a case ?

I was thinking of something like the below in decrementer_set_next_event().
See the last line in particular:

 -       /* Don't adjust the decrementer if some irq work is pending */
 -       if (test_irq_work_pending())
 -               return 0;
         __get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
 -       set_dec(evt);

 -       /* We may have raced with new irq work */
 -       if (test_irq_work_pending())
 -               set_dec(1);
 +       /* Don't adjust the decrementer if some irq work is pending */
 +       if (!test_irq_work_pending())
 +               set_dec(evt);
 +       else
 +               set_dec(1);

                  ^^^^^ Your patch currently does not have this explicit
set_dec(1) here. Will that create a problem? If any irq work is pending
at this point, will anything set the decrementer to fire immediately
afterwards? The current code in decrementer_set_next_event() calls
set_dec(1) explicitly when irq work is pending.
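
To make the difference between the two variants concrete, here is a
minimal userspace sketch. It is not kernel code: the model_* globals and
the dec_after_*() helpers are hypothetical stand-ins for the per-CPU
decrementers_next_tb, the decrementer register and
test_irq_work_pending(). It only shows what each variant leaves in the
decrementer; it does not show whether the kernel can actually reach the
state where the two differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model state; the real kernel keeps these per CPU. */
static unsigned long model_tb;		/* current timebase value */
static unsigned long model_next_tb;	/* stands in for decrementers_next_tb */
static long model_dec;			/* stands in for the decrementer */
static bool model_irq_work_pending;	/* stands in for test_irq_work_pending() */

static void model_set_dec(long val)
{
	model_dec = val;
}

/* Variant A: the quoted patch; leaves the decrementer alone if work is pending. */
static int set_next_event_skip(unsigned long evt)
{
	model_next_tb = model_tb + evt;
	if (!model_irq_work_pending)
		model_set_dec((long)evt);
	return 0;
}

/* Variant B: same, plus the explicit else branch that forces set_dec(1). */
static int set_next_event_force(unsigned long evt)
{
	model_next_tb = model_tb + evt;
	if (!model_irq_work_pending)
		model_set_dec((long)evt);
	else
		model_set_dec(1);
	return 0;
}

/* Put the model into a known state before each experiment. */
static void model_reset(long old_dec, bool pending)
{
	model_tb = 1000;
	model_next_tb = 0;
	model_dec = old_dec;
	model_irq_work_pending = pending;
}

/* Decrementer value left behind by variant A. */
static long dec_after_skip(long old_dec, bool pending, unsigned long evt)
{
	assert(evt > 0);	/* model assumption: a positive delta */
	model_reset(old_dec, pending);
	set_next_event_skip(evt);
	return model_dec;
}

/* Decrementer value left behind by variant B. */
static long dec_after_force(long old_dec, bool pending, unsigned long evt)
{
	assert(evt > 0);	/* model assumption: a positive delta */
	model_reset(old_dec, pending);
	set_next_event_force(evt);
	return model_dec;
}
```

In this model the two variants agree whenever the decrementer already
holds 1 at the time irq work is pending, e.g. dec_after_skip(1, true, 500)
and dec_after_force(1, true, 500) both leave 1. They differ only if the
decrementer holds some other value while work is pending: variant A keeps
the old value, variant B overwrites it with 1. Both always update
model_next_tb.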

Regards
Preeti U Murthy
> 
> Cheers,
> Ben.
> 
>> Regards
>> Preeti U Murthy
>>>
>>> static int decrementer_set_next_event(unsigned long evt,
>>> 				      struct clock_event_device *dev)
>>> {
>>> 	__get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
>>>
>>> 	/* Don't adjust the decrementer if some irq work is pending */
>>> 	if (!test_irq_work_pending())
>>> 		set_dec(evt);
>>>
>>> 	return 0;
>>> }
>>>
>>> Along with a single occurrence of:
>>>
>>> 	if (test_irq_work_pending())
>>> 		set_dec(1);
>>>
>>> At the end of __timer_interrupt(), outside of the current else {}
>>> case, this should work, don't you think ?
>>>
>>> What about this completely untested patch ?
>>>
>>> diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
>>> index 122a580..ba7e83b 100644
>>> --- a/arch/powerpc/kernel/time.c
>>> +++ b/arch/powerpc/kernel/time.c
>>> @@ -503,12 +503,13 @@ void __timer_interrupt(void)
>>>                 now = *next_tb - now;
>>>                 if (now <= DECREMENTER_MAX)
>>>                         set_dec((int)now);
>>> -               /* We may have raced with new irq work */
>>> -               if (test_irq_work_pending())
>>> -                       set_dec(1);
>>>                 __get_cpu_var(irq_stat).timer_irqs_others++;
>>>         }
>>>
>>> +       /* We may have raced with new irq work */
>>> +       if (test_irq_work_pending())
>>> +               set_dec(1);
>>> +
>>>  #ifdef CONFIG_PPC64
>>>         /* collect purr register values often, for accurate calculations */
>>>         if (firmware_has_feature(FW_FEATURE_SPLPAR)) {
>>> @@ -813,15 +814,11 @@ static void __init clocksource_init(void)
>>>  static int decrementer_set_next_event(unsigned long evt,
>>>                                       struct clock_event_device *dev)
>>>  {
>>> -       /* Don't adjust the decrementer if some irq work is pending */
>>> -       if (test_irq_work_pending())
>>> -               return 0;
>>>         __get_cpu_var(decrementers_next_tb) = get_tb_or_rtc() + evt;
>>> -       set_dec(evt);
>>>
>>> -       /* We may have raced with new irq work */
>>> -       if (test_irq_work_pending())
>>> -               set_dec(1);
>>> +       /* Don't adjust the decrementer if some irq work is pending */
>>> +       if (!test_irq_work_pending())
>>> +               set_dec(evt);
>>>
>>>         return 0;
>>>  }
>>>
>>>
>>>
>>>
> 
> 


Thread overview: 15+ messages
2014-05-09  7:47 [PATCH] powerpc: irq work racing with timer interrupt can result in timer interrupt hang Anton Blanchard
2014-05-09  9:52 ` Preeti U Murthy
2014-05-10  4:26   ` Benjamin Herrenschmidt
2014-05-10 15:36     ` Preeti U Murthy
2014-05-10 22:25       ` Benjamin Herrenschmidt
2014-05-11  8:15         ` Preeti U Murthy [this message]
2014-05-11  8:37           ` Benjamin Herrenschmidt
2014-05-11  8:43             ` Preeti U Murthy
2014-05-11  9:03               ` Benjamin Herrenschmidt
2014-05-11  9:07                 ` Preeti U Murthy
2014-05-09 13:41 ` Paul E. McKenney
2014-05-09 21:50   ` Gabriel Paubert
2014-05-09 22:08     ` Paul E. McKenney
2014-05-10  6:33       ` Paul Mackerras
2014-05-10 16:33         ` Paul E. McKenney
