From: bert schulze <spambemyguest@googlemail.com>
To: linux-rt-users@vger.kernel.org
Subject: 4.14-rt timer issues using PREEMPT_RT_FULL=y and NO_HZ_FULL_ALL=y
Date: Tue, 12 Dec 2017 22:58:18 +0100
Message-ID: <20171212215818.GA18168@a.fritz.box>
Hi folks,
I'm having issues with v4.14-rt1 through v4.14.3-rt5 using
NO_HZ_FULL_ALL=y with PREEMPT_RT_FULL=y and kernel.timer_migration
enabled (which seems to be the default).
Full config used: http://paste.debian.net/hidden/eb51a120/
The kernel either boots fine or locks up during boot (sysrq still
works, and boot continues after anywhere from a few seconds up to
several minutes).
If a hang occurred during boot, dmesg will contain:
root@deb9:~# dmesg | grep hrtimer
[ 1.507207] hrtimer: interrupt took 28740 ns
If the system booted fine (i.e. no "interrupt took #### ns" message),
it behaves as expected as long as timer migration is disabled:
root@deb9:~# echo 0 > /proc/sys/kernel/timer_migration
A simple sleep (or anything else using nanosleep()) is sufficient to
reproduce this.
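For anyone who wants the same check in script form, here is a minimal sketch. It assumes Linux and Python 3; timed_sleep is a hypothetical helper name, and time.sleep() only stands in for sleep(1), since the syscall it ends up using depends on the Python version.

```python
import os
import time

def timed_sleep(cpu, seconds=0.1):
    """Pin this process to `cpu`, sleep, and return the measured wall time."""
    os.sched_setaffinity(0, {cpu})  # Linux-only; 0 means the calling process
    start = time.monotonic()
    time.sleep(seconds)
    return time.monotonic() - start

if __name__ == "__main__":
    # The CPU list is read once, before the first pin narrows the affinity mask.
    for cpu in sorted(os.sched_getaffinity(0)):
        print(f"CPU{cpu}: {timed_sleep(cpu):.3f}s")
```

On a healthy system every line should report a value close to the requested 0.1 s.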
The expected behaviour with kernel.timer_migration = 0:
root@deb9:~# grep LOC: /proc/interrupts
LOC: 91968 801 775 590 Local timer interrupts
root@deb9:~# for cpu in {0..3} ;do time taskset -ac $cpu sleep 0.1 ;done
real 0m0.104s // CPU0 ok
real 0m0.104s // CPU1 ok
real 0m0.104s // CPU2 ok
real 0m0.105s // CPU3 ok
root@deb9:~# grep LOC: /proc/interrupts
LOC: 101069 824 782 599 Local timer interrupts
Roughly 10 seconds passed and the housekeeping CPU shows ~10000 timer
interrupts (which matches up with CONFIG_HZ=1000).
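The per-CPU difference between the two LOC: samples above can be computed with a short sketch like the following. The helper names loc_counts and loc_delta are made up for illustration; the input lines are copied from the output above.

```python
def loc_counts(line):
    """Return the per-CPU counters from a 'LOC:' line of /proc/interrupts."""
    fields = line.split()
    if fields[0] != "LOC:":
        raise ValueError("not a LOC: line")
    counts = []
    for field in fields[1:]:
        if not field.isdigit():
            break  # the trailing description ("Local timer interrupts") starts here
        counts.append(int(field))
    return counts

def loc_delta(before, after):
    """Per-CPU increase in local timer interrupts between two samples."""
    return [b - a for a, b in zip(loc_counts(before), loc_counts(after))]

before = "LOC: 91968 801 775 590 Local timer interrupts"
after = "LOC: 101069 824 782 599 Local timer interrupts"
print(loc_delta(before, after))  # [9101, 23, 7, 9]
```

CPU0 advanced by 9101 ticks, i.e. about 9.1 seconds at CONFIG_HZ=1000, while the adaptive-tick CPUs barely ticked at all.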
Doing the same with kernel.timer_migration = 1:
root@deb9:~# for cpu in {0..3} ;do time taskset -ac $cpu sleep 0.1 ;done
real 0m0.104s // CPU0 ok
[ 125.282455] hrtimer: interrupt took 2230 ns <--
real 0m28.023s // CPU1 not ok
real 0m9.129s // CPU2 not ok
real 0m10.000s // CPU3 not ok
The "hrtimer: interrupt took #### ns" message appeared, any sleeps on
the adaptive-tick CPUs are completely off, and …
root@deb9:~# grep LOC: /proc/interrupts
LOC: 12544410 874 828 638 Local timer interrupts
… timer interrupts on the housekeeping CPU advanced by ~12400000 after
roughly 60 seconds, even though the system has only been up for
2 minutes.
root@deb9:~# uptime
21:37:14 up 2 min, 1 user, load average: 0.17, 0.15, 0.06
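To put that jump in perspective, here is a rough rate calculation. It is only a sketch: it uses the CPU0 counter from the earlier timer_migration = 0 sample (101069) as the baseline and the "roughly 60 seconds" estimate as the elapsed time; tick_rate is a hypothetical helper name.

```python
HZ = 1000  # CONFIG_HZ as stated in the report

def tick_rate(count_before, count_after, elapsed_s):
    """Average timer interrupts per second between two counter samples."""
    return (count_after - count_before) / elapsed_s

# CPU0 LOC counters: previous sample vs. the sample taken after the bad sleeps.
rate = tick_rate(101069, 12544410, 60)  # 60 s is an estimate, not a measurement
print(f"{rate:.0f} interrupts/s, {rate / HZ:.0f}x CONFIG_HZ")
# prints "207389 interrupts/s, 207x CONFIG_HZ"
```

Even allowing for a generous error in the elapsed time, the housekeeping CPU is taking timer interrupts at a couple of hundred times the expected rate.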
To rule out my hardware, I've reproduced this on i7-6700, i7-3517u
and i7-2xxxHQ machines as well as in QEMU.
Everything is back to normal when passing "nohz_full=" to the kernel
to disable adaptive-tick CPUs.
I've also tested v4.13.13-rt5 and the WIP.timers branch of tip.git;
both work as expected.
Thanks,
Bert