public inbox for linux-kernel@vger.kernel.org
* Bug in hrtimer_get_next_event?
@ 2010-03-30 15:58 Gary King
From: Gary King @ 2010-03-30 15:58 UTC
  To: linux-kernel@vger.kernel.org; +Cc: tglx@linutronix.de, Gary King

I am implementing idle state controls (CPU_IDLE) for Tegra SoCs, and one of the idle states is not awakened by the hrtimer interrupt. There is a system-wide high-resolution timer which can be used as a wakeup source, but I need the high-resolution sleep time to configure the alarm.

To fix this, I want to use hrtimer_get_next_event; however, the code that is in the tree only walks the hrtimer bases when hres mode is not active; when hres mode is active, hrtimer_get_next_event always returns KTIME_MAX. Is there any reason for the negative comparison, or is this a bug?

After changing this locally, I encountered one other problem on dynamic-tick systems: get_next_timer_interrupt is called to determine whether or not it is safe to enter nohz mode; however, hrtimer_get_next_event (which is used by get_next_timer_interrupt) will always return <=1 jiffy, since the emulated tick scheduler event will be armed when tick_nohz_stop_sched_tick queries the sleep time. As a result, tick_nohz_stop_sched_tick will never enter nohz mode.  I can think of a couple ways to address this (cancel the tick timer before querying the event and rearm if necessary from either the arch cpu_idle code or nohz_stop_sched_tick; ignore the tick timer in hrtimer_get_next_event); does anyone have a recommendation for a preferred approach?

- Gary
gking@nvidia.com
-----------------------------------------------------------------------------------
This email message is for the sole use of the intended recipient(s) and may contain
confidential information.  Any unauthorized review, use, disclosure or distribution
is prohibited.  If you are not the intended recipient, please contact the sender by
reply email and destroy all copies of the original message.
-----------------------------------------------------------------------------------


* Re: Bug in hrtimer_get_next_event?
@ 2010-03-31 10:43 ` Thomas Gleixner
From: Thomas Gleixner @ 2010-03-31 10:43 UTC
  To: Gary King; +Cc: linux-kernel@vger.kernel.org

Gary,

please configure your mail client to do proper line breaks around 78
chars.

On Tue, 30 Mar 2010, Gary King wrote:

> I am implementing idle state controls (CPU_IDLE) for Tegra SoCs, and
> one of the idle states is not awakened by the hrtimer
> interrupt. There is a system-wide high-resolution timer which can be
> used as a wakeup source, but I need the high-resolution sleep time
> to configure the alarm.
>
> To fix this, I want to use hrtimer_get_next_event; however, the code
> that is in the tree only walks the hrtimer bases when hres mode is
> not active; when hres mode is active, hrtimer_get_next_event always
> returns KTIME_MAX. Is there any reason for the negative comparison,
> or is this a bug?
>
> After changing this locally, I encountered one other problem on
> dynamic-tick systems: get_next_timer_interrupt is called to
> determine whether or not it is safe to enter nohz mode; however,
> hrtimer_get_next_event (which is used by get_next_timer_interrupt)
> will always return <=1 jiffy, since the emulated tick scheduler
> event will be armed when tick_nohz_stop_sched_tick queries the sleep
> time. As a result, tick_nohz_stop_sched_tick will never enter nohz
> mode.  I can think of a couple ways to address this (cancel the tick
> timer before querying the event and rearm if necessary from either
> the arch cpu_idle code or nohz_stop_sched_tick; ignore the tick
> timer in hrtimer_get_next_event); does anyone have a recommendation
> for a preferred approach?

get_next_timer_interrupt() and hrtimer_get_next_event() are working
perfectly fine.

In the !HIGHRES case we get the next pending timer from both the timer
wheel and the hrtimer queue. Note that there is no tick timer in the
hrtimer queue, because the tick is generated periodically from the
NOHZ code.

In the HIGHRES case we replace the periodic tick by a hrtimer. We do
return KTIME_MAX in that case because we know from the clock event
when the next hrtimer is due. So the check works this way:

     query next timer wheel timer
     if that timer is due in the next jiffy, keep going

     if not, cancel the tick timer and rearm it to the next timer
     wheel interrupt. If there is any hrtimer pending _BEFORE_ the
     next timer wheel timer then the clock event is armed to that
     event anyway and not overridden by the modified tick timer.
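
The check described above can be modeled in userspace C roughly as
follows. This is a sketch under stated assumptions, not the kernel's
code: struct model_cpu, model_stop_tick(), MODEL_NO_EVENT and the 10 ms
jiffy (HZ=100) are hypothetical names and values chosen for
illustration only.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model, not kernel code. All times are in nanoseconds. */
#define MODEL_TICK_NS  10000000ULL   /* assumption: HZ=100, 1 jiffy = 10 ms */
#define MODEL_NO_EVENT UINT64_MAX    /* stands in for "no hrtimer pending" */

struct model_cpu {
    uint64_t now;          /* current time */
    uint64_t next_wheel;   /* expiry of the next timer wheel timer */
    uint64_t next_hrtimer; /* earliest non-tick hrtimer, or MODEL_NO_EVENT */
    uint64_t tick_expiry;  /* expiry of the hrtimer that emulates the tick */
};

/*
 * Model of the idle-entry check. Returns the time the clock event
 * device ends up armed for.
 */
static uint64_t model_stop_tick(struct model_cpu *cpu)
{
    assert(cpu->next_wheel >= cpu->now);

    /*
     * If the next timer wheel timer is due within the next jiffy, the
     * tick keeps going unchanged. Otherwise the tick timer is cancelled
     * and rearmed to the next timer wheel expiry.
     */
    if (cpu->next_wheel - cpu->now > MODEL_TICK_NS)
        cpu->tick_expiry = cpu->next_wheel;

    /*
     * The clock event always follows the earliest hrtimer, so a
     * hrtimer pending before the rearmed tick still wins.
     */
    return cpu->next_hrtimer < cpu->tick_expiry ? cpu->next_hrtimer
                                                : cpu->tick_expiry;
}
```

In this model the timer wheel is the only thing that needs to be
queried explicitly. Pending hrtimers are covered by taking the minimum
at the end, which mirrors why hrtimer_get_next_event() has nothing to
add in the HIGHRES case.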

Simply put, do not use hrtimer_get_next_event() and
get_next_timer_interrupt() for your purpose. They work only with the
tick management layer and are not designed for general purpose use.

If you want to know how far away the next timer event is, then use:

   tick_nohz_get_sleep_length()

That's going to tell you when the next timer interrupt will happen
when the system is idle. That works with high res both off and on.
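
As an illustration of how the sleep length could feed a wakeup alarm,
here is a userspace sketch of the conversion step. Everything in it is
an assumption: model_alarm_from_sleep_ns() is a hypothetical helper,
and the wakeup timer is assumed to count microseconds in a 32-bit
register, which may not match the real hardware. In a kernel idle
driver of this era the input would presumably come from something like
ktime_to_ns(tick_nohz_get_sleep_length()); later kernels changed that
function's signature, so check the tree you are working on.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the alarm register is 32 bits wide and counts microseconds. */
#define MODEL_ALARM_MAX_US UINT32_MAX

/*
 * Hypothetical helper: convert an idle sleep length in nanoseconds to
 * an alarm value in microseconds, clamped to the register width.
 */
static uint32_t model_alarm_from_sleep_ns(int64_t sleep_ns)
{
    assert(sleep_ns >= 0);

    /* Round down so the alarm fires slightly early rather than late. */
    uint64_t us = (uint64_t)sleep_ns / 1000;

    return us > MODEL_ALARM_MAX_US ? MODEL_ALARM_MAX_US : (uint32_t)us;
}
```

Clamping means a very long idle period wakes the CPU early once and
then re-enters idle, instead of overflowing the alarm register.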

Thanks,

	tglx



* Re: Bug in hrtimer_get_next_event?
@ 2010-03-31 23:39   ` Gary King
From: Gary King @ 2010-03-31 23:39 UTC
  To: Thomas Gleixner; +Cc: linux-kernel@vger.kernel.org

Thomas,

Thank you for the explanation; the code is clear to me now. As you
suggested, tick_nohz_get_sleep_length returns exactly what I want.

- Gary

