From: shinya.kuribayashi.px@renesas.com (Shinya Kuribayashi)
To: linux-arm-kernel@lists.infradead.org
Subject: hrtimer: one more expiry time overflow check in hrtimer_interrupt
Date: Fri, 14 Jun 2013 15:18:53 +0900 [thread overview]
Message-ID: <51BAB5CD.2050608@renesas.com> (raw)
In-Reply-To: <51B867C5.2050007@redhat.com>
Hello,
On 6/12/2013 9:21 PM, Prarit Bhargava wrote:
> On 06/11/2013 03:41 AM, Shinya Kuribayashi wrote:
>> [ 27.857330] hrtimer: interrupt took 0 ns
>
> ^^^ see below ...
This may be because the frequency of our tick timer source in this
system is very slow, it's 32768 Hz. I think it's not suitable for
the high resolution timer at all, but we had no choice...
And with this patch applied,
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index cdd5607..bc37552 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1371,6 +1371,8 @@ retry:
> tick_program_event(expires_next, 1);
> printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
> ktime_to_ns(delta));
> + printk_once(KERN_WARNING "# now=%lld entry_time=%lld\n", ktime_to_ns(now), ktime_to_ns(entry_time));
> + printk_once(KERN_WARNING "# expires_next=%lld\n", ktime_to_ns(expires_next));
> }
>
> /*
I've got this:
[ 22.857849] hrtimer: interrupt took 0 ns
[ 22.861755] # now=-4294967273343634023 entry_time=-4294967273343634023
[ 22.868286] # expires_next=-4294967273343634023
now == entry_time == expires_next, and delta == 0.
This means:
1) there haven't been no tick interrupts between 'entry_time' and 'now',
2) delta can be zero,
3) expires_next = ktime_add(now, delta) doesn't move expires_next forward.
Ok, it's simply hrtimer is not supposed to work with such a slow timer
source. However, this issue happens on other shmobile ARM systems with
more faster tick timer source (10--13MHz), see below.
>> @@ -1368,6 +1370,8 @@ retry:
>> expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
>> else
>> expires_next = ktime_add(now, delta);
>> + if (expires_next.tv64 < 0)
>> + expires_next.tv64 = KTIME_MAX;
>
> Even with this change you will still see the warning below if delta = 0.
Correct.
>> tick_program_event(expires_next, 1);
>> printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>> ktime_to_ns(delta));
>
> So I'm not sure that this is the correct thing to do.
>
> Is this reproducible on any ARM system? I'll grab an x86_64 box and give it a
> shot there too. Can you dump the values of now, delta, and expires_next when
> the printk_once triggers?
It's 100% reproducible in our shmobile ARM kernels, v3.4 and v3.10-rc2.
And it doesn't reproduce with v2.6.35-based kernels so far.
Here're logs from Cortex-A15 UP/SMP systems. They're having ARM
architected tiemrs (the Generic Timers, running at 10--13MHz) using as
clocksource, clockevents, sched_clock, and read_current_timer (udelay)
sources.
-----------------------------------------------------------------------------
machine: R-Mobile APE6 (Cortex-A15 UP system - it's octo-core A15/A7 SoC, but not yet available in mainline)
kernel : v3.10-rc2 mainline
timer : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 13MHz
-----------------------------------------------------------------------------
|# uname -a
|Linux arm 3.10.0-rc2-00966-gd658f9e #1284 Tue Jun 11 13:36:10 JST 2013 armv7l GNU/Linux
|# cat /proc/timer_list |egrep "Clock Event Device|event_handler"
|Clock Event Device: arch_sys_timer
| event_handler: hrtimer_interrupt
|# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 UTC 2038
|# [ 138.942325] ------------[ cut here ]------------
|[ 138.946910] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x130/0x148()
|[ 138.955315] Modules linked in:
|[ 138.958339] CPU: 0 PID: 0 Comm: swapper Tainted: G W 3.10.0-rc2-00966-gd658f9e #1284
|[...]
|[ 139.131970] ---[ end trace f879bec39a6bf85c ]---
|[ 139.136521] hrtimer: interrupt took 2385 ns
-----------------------------------------------------------------------------
machine: R-Car H2 (quad-core Cortex-A15 SMP system)
kernel : v3.4-based custom Android kernel
timer : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 10MHz
-----------------------------------------------------------------------------
|root at renesas:~# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 GMT 2038
|root at renesas:~# [ 26.458377] ------------[ cut here ]------------
|[ 26.472214] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x3c/0x138()
|[...]
|[ 26.980342] ---[ end trace 5218f4fd6603f493 ]---
|[ 26.994166] hrtimer: interrupt took 1900 ns
|[ 27.006687] # now=-4294967269549995552 entry_time=-4294967269549997452
|[ 27.026240] # expires_next=-4294967269549993652
I also gave a try on my Ubuntu box, but it seems to work without
printk_once() at the bottome of hrtimer_interrupt() printed, which means
that hrtimer_update_base()-and-retries mechanism works as expected.
$ uname -a
Linux RI106878 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
I'd like to see whether this happens on other ARM machines excpet for
shmobile. Could someone help, please?
--
Shinya Kuribayashi
Renesas Electronics
WARNING: multiple messages have this Message-ID (diff)
From: Shinya Kuribayashi <shinya.kuribayashi.px@renesas.com>
To: prarit@redhat.com
Cc: john.stultz@linaro.org, hiroyuki.yokoyama.vx@renesas.com,
takashi.yoshii.zj@renesas.com, linux-kernel@vger.kernel.org,
linux-arm-kernel@lists.infradead.org
Subject: Re: hrtimer: one more expiry time overflow check in hrtimer_interrupt
Date: Fri, 14 Jun 2013 15:18:53 +0900 [thread overview]
Message-ID: <51BAB5CD.2050608@renesas.com> (raw)
In-Reply-To: <51B867C5.2050007@redhat.com>
Hello,
On 6/12/2013 9:21 PM, Prarit Bhargava wrote:
> On 06/11/2013 03:41 AM, Shinya Kuribayashi wrote:
>> [ 27.857330] hrtimer: interrupt took 0 ns
>
> ^^^ see below ...
This may be because the frequency of our tick timer source in this
system is very slow, it's 32768 Hz. I think it's not suitable for
the high resolution timer at all, but we had no choice...
And with this patch applied,
> diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
> index cdd5607..bc37552 100644
> --- a/kernel/hrtimer.c
> +++ b/kernel/hrtimer.c
> @@ -1371,6 +1371,8 @@ retry:
> tick_program_event(expires_next, 1);
> printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
> ktime_to_ns(delta));
> + printk_once(KERN_WARNING "# now=%lld entry_time=%lld\n", ktime_to_ns(now), ktime_to_ns(entry_time));
> + printk_once(KERN_WARNING "# expires_next=%lld\n", ktime_to_ns(expires_next));
> }
>
> /*
I've got this:
[ 22.857849] hrtimer: interrupt took 0 ns
[ 22.861755] # now=-4294967273343634023 entry_time=-4294967273343634023
[ 22.868286] # expires_next=-4294967273343634023
now == entry_time == expires_next, and delta == 0.
This means:
1) there haven't been no tick interrupts between 'entry_time' and 'now',
2) delta can be zero,
3) expires_next = ktime_add(now, delta) doesn't move expires_next forward.
Ok, it's simply hrtimer is not supposed to work with such a slow timer
source. However, this issue happens on other shmobile ARM systems with
more faster tick timer source (10--13MHz), see below.
>> @@ -1368,6 +1370,8 @@ retry:
>> expires_next = ktime_add_ns(now, 100 * NSEC_PER_MSEC);
>> else
>> expires_next = ktime_add(now, delta);
>> + if (expires_next.tv64 < 0)
>> + expires_next.tv64 = KTIME_MAX;
>
> Even with this change you will still see the warning below if delta = 0.
Correct.
>> tick_program_event(expires_next, 1);
>> printk_once(KERN_WARNING "hrtimer: interrupt took %llu ns\n",
>> ktime_to_ns(delta));
>
> So I'm not sure that this is the correct thing to do.
>
> Is this reproducible on any ARM system? I'll grab an x86_64 box and give it a
> shot there too. Can you dump the values of now, delta, and expires_next when
> the printk_once triggers?
It's 100% reproducible in our shmobile ARM kernels, v3.4 and v3.10-rc2.
And it doesn't reproduce with v2.6.35-based kernels so far.
Here're logs from Cortex-A15 UP/SMP systems. They're having ARM
architected tiemrs (the Generic Timers, running at 10--13MHz) using as
clocksource, clockevents, sched_clock, and read_current_timer (udelay)
sources.
-----------------------------------------------------------------------------
machine: R-Mobile APE6 (Cortex-A15 UP system - it's octo-core A15/A7 SoC, but not yet available in mainline)
kernel : v3.10-rc2 mainline
timer : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 13MHz
-----------------------------------------------------------------------------
|# uname -a
|Linux arm 3.10.0-rc2-00966-gd658f9e #1284 Tue Jun 11 13:36:10 JST 2013 armv7l GNU/Linux
|# cat /proc/timer_list |egrep "Clock Event Device|event_handler"
|Clock Event Device: arch_sys_timer
| event_handler: hrtimer_interrupt
|# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 UTC 2038
|# [ 138.942325] ------------[ cut here ]------------
|[ 138.946910] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x130/0x148()
|[ 138.955315] Modules linked in:
|[ 138.958339] CPU: 0 PID: 0 Comm: swapper Tainted: G W 3.10.0-rc2-00966-gd658f9e #1284
|[...]
|[ 139.131970] ---[ end trace f879bec39a6bf85c ]---
|[ 139.136521] hrtimer: interrupt took 2385 ns
-----------------------------------------------------------------------------
machine: R-Car H2 (quad-core Cortex-A15 SMP system)
kernel : v3.4-based custom Android kernel
timer : the Generic Timers (arch/arm/kernel/arch_timer.c) running at 10MHz
-----------------------------------------------------------------------------
|root@renesas:~# date -s "2038-1-19 3:14:00"
|Tue Jan 19 03:14:00 GMT 2038
|root@renesas:~# [ 26.458377] ------------[ cut here ]------------
|[ 26.472214] WARNING: at kernel/time/clockevents.c:209 clockevents_program_event+0x3c/0x138()
|[...]
|[ 26.980342] ---[ end trace 5218f4fd6603f493 ]---
|[ 26.994166] hrtimer: interrupt took 1900 ns
|[ 27.006687] # now=-4294967269549995552 entry_time=-4294967269549997452
|[ 27.026240] # expires_next=-4294967269549993652
I also gave a try on my Ubuntu box, but it seems to work without
printk_once() at the bottome of hrtimer_interrupt() printed, which means
that hrtimer_update_base()-and-retries mechanism works as expected.
$ uname -a
Linux RI106878 3.2.0-45-generic #70-Ubuntu SMP Wed May 29 20:12:06 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
I'd like to see whether this happens on other ARM machines excpet for
shmobile. Could someone help, please?
--
Shinya Kuribayashi
Renesas Electronics
next prev parent reply other threads:[~2013-06-14 6:18 UTC|newest]
Thread overview: 12+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-06-11 7:41 hrtimer: one more expiry time overflow check in hrtimer_interrupt Shinya Kuribayashi
2013-06-11 7:41 ` Shinya Kuribayashi
2013-06-12 12:21 ` Prarit Bhargava
2013-06-12 12:21 ` Prarit Bhargava
2013-06-14 6:18 ` Shinya Kuribayashi [this message]
2013-06-14 6:18 ` Shinya Kuribayashi
2013-06-28 12:22 ` Thomas Gleixner
2013-06-28 12:22 ` Thomas Gleixner
2013-07-05 11:50 ` Shinya Kuribayashi
2013-07-05 11:50 ` Shinya Kuribayashi
2013-07-05 13:19 ` Thomas Gleixner
2013-07-05 13:19 ` Thomas Gleixner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=51BAB5CD.2050608@renesas.com \
--to=shinya.kuribayashi.px@renesas.com \
--cc=linux-arm-kernel@lists.infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.