* kernel freeze when running docker
@ 2018-08-20 9:35 Vignesh Raman
[not found] ` <20180821012731.GB8562@jcartwri.amer.corp.natinst.com>
0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-20 9:35 UTC (permalink / raw)
To: linux-rt-users
Hi,
I'm seeing a kernel freeze (and systemd reboots after a few seconds)
when Docker is running Kubernetes on the Debian 4.9 RT kernel
(4.9.0-6-rt-amd64).
I also tested with a 4.14-rt kernel and the issue was still seen.
kdump does not capture a coredump in this case, so I'm unable to
check the crash logs.
The system crash logs are not available in /sys/fs/pstore either.
When a manual crash is triggered with sysrq, kdump captures the crash
logs. But this does not always work.
Sometimes the system does not reboot when a manual crash (sysrq) is
triggered, and a freeze happens instead.
Maybe the capture kernel itself hangs while taking the coredump.
When a manual crash is triggered with sysrq, the crash logs are
available in /sys/fs/pstore.
I'm able to reproduce a similar freeze/reset issue with the steps
mentioned in https://github.com/moby/moby/issues/19758:
run "for f in $(seq 1 1000);do docker run -it --rm ubuntu echo $f;
done" concurrently in 3 terminals.
But I'm unable to confirm whether the two issues are related, because
no crash logs are available.
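A minimal sketch of the reproduction above as a single script, for anyone who prefers not to open three terminals. The names DOCKER, run_loop and run_concurrent are illustrative and not from the report. One deliberate difference: the "-it" flags are dropped, on the assumption that background jobs have no TTY to attach, so this may not exercise exactly the same code path as the interactive loop.

```shell
#!/bin/sh
# Sketch only: start the reproduction loop in several background jobs at once.
# DOCKER, run_loop and run_concurrent are illustrative names (assumptions).
DOCKER="${DOCKER:-docker}"

# Run "docker run --rm ubuntu echo N" for N = 1..count.
# "-it" from the original command is omitted because background jobs have no TTY.
run_loop() {
    count="$1"
    for f in $(seq 1 "$count"); do
        $DOCKER run --rm ubuntu echo "$f"
    done
}

# Start njobs copies of the loop in parallel and wait for all of them.
run_concurrent() {
    njobs="$1"
    count="$2"
    i=0
    while [ "$i" -lt "$njobs" ]; do
        run_loop "$count" &
        i=$((i + 1))
    done
    wait
}

# Example: three parallel loops of 1000 iterations each.
# run_concurrent 3 1000
```

Setting DOCKER=echo prints the commands instead of running them, which is a cheap way to check the loop before using it on the affected machine.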
I enabled CPUAccounting/MemoryAccounting for docker.service, but the
system freeze was still seen:
sudo systemctl set-property docker.service MemoryAccounting=yes
sudo systemctl set-property docker.service CPUAccounting=yes
I was thinking of enabling CONFIG_RT_GROUP_SCHED (allocate real CPU
bandwidth to task groups), but this option is disabled when
PREEMPT_RT_FULL is enabled.
I'm unable to find the root cause of the freeze/crash because crash
logs are not available.
Is there any other way to capture the crash dump? Any pointers will be
helpful. Thanks.
Regards,
Vignesh
^ permalink raw reply [flat|nested] 5+ messages in thread
[parent not found: <20180821012731.GB8562@jcartwri.amer.corp.natinst.com>]
[parent not found: <CAH3OF50F26ywGHZ2Rfa7v-SeoZ6KccSPt+1uNq65eBNTqom=-A@mail.gmail.com>]
* Re: kernel freeze when running docker
[not found] ` <CAH3OF50F26ywGHZ2Rfa7v-SeoZ6KccSPt+1uNq65eBNTqom=-A@mail.gmail.com>
@ 2018-08-21 13:42 ` Julia Cartwright
2018-08-22 8:08 ` Vignesh Raman
0 siblings, 1 reply; 5+ messages in thread
From: Julia Cartwright @ 2018-08-21 13:42 UTC (permalink / raw)
To: Vignesh Raman; +Cc: linux-rt-users
Hello Vignesh-
On Tue, Aug 21, 2018 at 07:03:20PM +0530, Vignesh Raman wrote:
> Hi Julia,
>
>
> On Tue, Aug 21, 2018 at 6:57 AM Julia Cartwright <julia@ni.com> wrote:
> > Serial console is really the best way to go, especially if the higher
> > level tools (crash, pstore, etc.) aren't functioning properly. With the
> > hopes that you can get an oops or panic message...
>
> Added 'debug systemd.journald.forward_to_console=1 pause_on_oops=120' to
> kernel boot parameters. When the freeze/reset happens with the docker test
> case, the kernel oops/panic is not seen in the serial console.
That's annoying. When you are in this state can you trigger a dump via
the magic sysrq sequence over serial?
Have you tried enabling the various lockup detectors? There is a
software-driven one, and an NMI watchdog based one. Enabling both
at the same time should be fine.
Julia
^ permalink raw reply [flat|nested] 5+ messages in thread
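As a rough illustration of the lockup-detector suggestion above, the sketch below prints the runtime state of both detectors. It assumes the kernel exposes the usual watchdog sysctls (watchdog, soft_watchdog, nmi_watchdog, watchdog_thresh) under /proc/sys/kernel; show_detectors is a made-up helper name, and its directory argument exists only so the function can be pointed at a test directory.

```shell
#!/bin/sh
# Sketch only: report the runtime state of the kernel lockup detectors.
# show_detectors is an illustrative helper name (assumption, not from the thread).
show_detectors() {
    dir="${1:-/proc/sys/kernel}"
    for name in watchdog soft_watchdog nmi_watchdog watchdog_thresh; do
        if [ -r "$dir/$name" ]; then
            # 1 means enabled for the three on/off knobs;
            # watchdog_thresh is a threshold in seconds.
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        else
            printf '%s=unavailable\n' "$name"
        fi
    done
}

show_detectors
```

For the sysrq question: on a PC-style serial console the sequence is a BREAK followed by the command key within a few seconds, rather than Alt+SysRq on an attached keyboard.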
* Re: kernel freeze when running docker
2018-08-21 13:42 ` Julia Cartwright
@ 2018-08-22 8:08 ` Vignesh Raman
[not found] ` <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>
0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-22 8:08 UTC (permalink / raw)
To: Julia Cartwright; +Cc: linux-rt-users
Hi Julia,
On Tuesday 21 August 2018 07:12 PM, Julia Cartwright wrote:
> That's annoying. When you are in this state can you trigger a dump via
> the magic sysrq sequence over serial?
No. I'm not able to trigger a crash using sysrq (Alt+SysRq+c).
The keyboard Caps Lock LED does not respond when the Caps Lock key is
pressed. Does this indicate a soft lockup or a hard lockup?
> Have you tried enabling the various lockup detectors? There is a
> software-driven one, and an NMI watchdog based one. Enabling both
> at the same time should be fine.
The config options below are enabled:
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
# Debug Lockups and Hangs
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_HAVE_NMI=y
CONFIG_HAVE_PERF_EVENTS_NMI=y
The behaviour of CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is enabled at runtime using:
echo 1 > /proc/sys/kernel/softlockup_panic
echo 1 > /proc/sys/kernel/hardlockup_panic
In addition, the parameters below were set:
echo 1 > /proc/sys/kernel/hung_task_panic
echo 1 > /proc/sys/kernel/panic
echo 1 > /proc/sys/kernel/panic_on_io_nmi
echo 1 > /proc/sys/kernel/panic_on_oops
echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
echo 1 > /proc/sys/kernel/unknown_nmi_panic
echo 1 > /proc/sys/vm/panic_on_oom
echo 1 > /proc/sys/kernel/sysrq
Thanks.
Regards,
Vignesh
^ permalink raw reply [flat|nested] 5+ messages in thread
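The panic-related writes above could be applied in one pass with something like the sketch below. It covers only a subset of the knobs from the message; set_panic_knobs is a hypothetical helper name, and the root argument (default /proc/sys) exists only so the function can be tried on a scratch directory first. Knobs that are missing on a given kernel are skipped instead of aborting the run.

```shell
#!/bin/sh
# Sketch only: enable a subset of the panic-on-lockup knobs in one pass.
# set_panic_knobs is an illustrative helper name (assumption, not from the thread).
set_panic_knobs() {
    root="${1:-/proc/sys}"
    for knob in kernel/softlockup_panic kernel/hardlockup_panic \
                kernel/hung_task_panic kernel/panic_on_oops; do
        if [ -w "$root/$knob" ]; then
            echo 1 > "$root/$knob"
            echo "set $knob=1"
        else
            # The file is absent when the matching detector is not built in.
            echo "skip $knob (not present or not writable)"
        fi
    done
}

# Run as root on the affected machine:
# set_panic_knobs
```

Note that /proc/sys/kernel/panic is not an on/off switch but the number of seconds the kernel waits before rebooting after a panic, so a value of 1 leaves very little time to read anything off the console.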
[parent not found: <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>]
* Re: kernel freeze when running docker
[not found] ` <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>
@ 2018-08-24 6:41 ` Vignesh Raman
2018-08-27 13:43 ` Vignesh Raman
0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-24 6:41 UTC (permalink / raw)
To: Julia Cartwright; +Cc: linux-rt-users
Hi Julia,
On Wednesday 22 August 2018 10:12 PM, Julia Cartwright wrote:
> I never explicitly asked...have you tried a similar test on a non-RT
> kernel? Does it exhibit the same behavior? The linked moby github
> issue would seem to indicate this isn't RT specific...
Yes, I tried the tests on a non-RT kernel and the issue was not
reproduced in 14 hours of testing with Docker running Kubernetes. I also
ran the test case mentioned in the moby github issue and the kernel
freeze was not seen. Config file is https://paste.debian.net/1039104/
root@debian:~# uname -a
Linux debian 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02) x86_64 GNU/Linux
> Can you send your full config? If I get some time to try it, I'd like
> to see if I can reproduce your issue.
Thanks. Config file is https://paste.debian.net/1039103/ (output of
'grep "=[y|m]" config-4.14.59-rt37').
This issue was also reproduced with the 4.9.0-6-rt-amd64 kernel.
root@debian:~# uname -a
Linux debian 4.14.59-rt37 #1 SMP PREEMPT RT Fri Aug 24 00:31:29 UTC 2018 x86_64 GNU/Linux
Below are the steps to reproduce the issue.
Install docker using:
curl https://raw.githubusercontent.com/rancher/install-docker/master/17.03.2.sh | /bin/bash
Run
"for f in $(seq 1 10000);do docker run -it --rm ubuntu echo $f; done"
concurrently in 3 terminals.
The freeze/lockup occurs sporadically.
The other test case is to run Docker with Kubernetes.
Regards,
Vignesh
^ permalink raw reply [flat|nested] 5+ messages in thread
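A small aside on the config filter quoted above: "=[y|m]" is a bracket expression, so it matches "=y", "=m" and also "=|", which is harmless here but looser than intended. A sketch of a stricter filter, narrowed to the lockup-detector options discussed in this thread, might look like this (lockup_options is an illustrative helper name, not something from the thread):

```shell
#!/bin/sh
# Sketch only: list the lockup-detector options enabled in a kernel config file.
# lockup_options is an illustrative helper name (assumption).
lockup_options() {
    config="$1"
    # -E enables alternation; (y|m)$ matches only built-in or modular options,
    # so "# ... is not set" lines and numeric values are excluded.
    grep -E '^CONFIG_[A-Z0-9_]*LOCKUP[A-Z0-9_]*=(y|m)$' "$config"
}

# Example, using the config file name from the message above:
# lockup_options config-4.14.59-rt37
```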
* Re: kernel freeze when running docker
2018-08-24 6:41 ` Vignesh Raman
@ 2018-08-27 13:43 ` Vignesh Raman
0 siblings, 0 replies; 5+ messages in thread
From: Vignesh Raman @ 2018-08-27 13:43 UTC (permalink / raw)
To: Julia Cartwright; +Cc: linux-rt-users
Hi Julia,
I enabled the Linux Kernel Dump Test Tool (CONFIG_LKDTM=y) and triggered
a soft and a hard lockup using:
echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
With this, the kernel panic/oops message is seen in the serial console
2 out of 5 times (for both the soft and the hard lockup case).
In /sys/fs/pstore the kernel panic messages are not captured at all.
If I try with
echo PANIC > /sys/kernel/debug/provoke-crash/DIRECT
the kernel panic/oops message is always dumped to the serial console and
the crash logs are always captured in /sys/fs/pstore.
It looks like, with the RT kernel, the dump messages are sometimes not
captured during a soft/hard lockup.
I'm not sure whether this is related to the issue I'm seeing (kernel
panic logs not seen on serial with the docker test case).
I have not performed these tests on a non-RT kernel.
Thanks.
Regards,
Vignesh
On Friday 24 August 2018 12:11 PM, Vignesh Raman wrote:
> Hi Julia,
>
> On Wednesday 22 August 2018 10:12 PM, Julia Cartwright wrote:
>> I never explicitly asked...have you tried a similar test on a non-RT
>> kernel? Does it exhibit the same behavior? The linked moby github
>> issue would seem to indicate this isn't RT specific...
>
> Yes I tried the tests on non-RT kernel and the issue was not reproduced
> with 14 hours of testing with docker running kubernetes. I also tested
> with the test case mentioned in moby github issue and the kernel freeze
> was not seen. Config file is https://paste.debian.net/1039104/
>
> root@debian:~# uname -a
> Linux debian 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02)
> x86_64 GNU/Linux
>
>> Can you send your full config? If I get some time to try it, I'd like
>> to see if I can reproduce your issue.
>
> Thanks. Config file is https://paste.debian.net/1039103/ (output of
> 'grep "=[y|m]" config-4.14.59-rt37')
> This issue was reproduced with 4.9.0-6-rt-amd64 kernel also.
>
> root@debian:~# uname -a
> Linux debian 4.14.59-rt37 #1 SMP PREEMPT RT Fri Aug 24 00:31:29 UTC 2018
> x86_64 GNU/Linux
>
> Below are the steps to reproduce the issue,
> Install docker using,
> curl
> https://raw.githubusercontent.com/rancher/install-docker/master/17.03.2.sh
> | /bin/bash
>
> Run
> "for f in $(seq 1 10000);do docker run -it --rm ubuntu echo $f; done"
> concurrently in 3 terminals.
>
> The freeze/lockup occurs sporadically.
>
> The other test case is to run docker with kubernetes.
>
> Regards,
> Vignesh
>
^ permalink raw reply [flat|nested] 5+ messages in thread
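To repeat the LKDTM experiment a few times and compare how often each crash type leaves a trace, a small wrapper along the lines below could help. It is only a sketch: trigger_lkdtm and list_pstore are hypothetical helper names, the default paths are the ones named in the message, and the optional path arguments exist so the functions can be tried against scratch files before touching a real machine.

```shell
#!/bin/sh
# Sketch only: trigger one LKDTM crash type and list what pstore holds afterwards.
# trigger_lkdtm and list_pstore are illustrative helper names (assumptions).
trigger_lkdtm() {
    crash="$1"
    direct="${2:-/sys/kernel/debug/provoke-crash/DIRECT}"
    # Only the three crash types used in this thread are accepted.
    case "$crash" in
        PANIC|SOFTLOCKUP|HARDLOCKUP) ;;
        *) echo "unsupported crash type: $crash" >&2; return 1 ;;
    esac
    if [ ! -w "$direct" ]; then
        echo "cannot write $direct (is CONFIG_LKDTM enabled and debugfs mounted?)" >&2
        return 1
    fi
    # On a real system this write crashes or locks up the machine.
    echo "$crash" > "$direct"
}

# After the reboot, show whatever pstore captured.
list_pstore() {
    ls -l "${1:-/sys/fs/pstore}"
}

# Example (as root; the machine will go down):
# trigger_lkdtm SOFTLOCKUP
```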
end of thread, other threads:[~2018-08-27 17:30 UTC | newest]
Thread overview: 5+ messages
2018-08-20 9:35 kernel freeze when running docker Vignesh Raman
[not found] ` <20180821012731.GB8562@jcartwri.amer.corp.natinst.com>
[not found] ` <CAH3OF50F26ywGHZ2Rfa7v-SeoZ6KccSPt+1uNq65eBNTqom=-A@mail.gmail.com>
2018-08-21 13:42 ` Julia Cartwright
2018-08-22 8:08 ` Vignesh Raman
[not found] ` <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>
2018-08-24 6:41 ` Vignesh Raman
2018-08-27 13:43 ` Vignesh Raman