linux-rt-users.vger.kernel.org archive mirror
* kernel freeze when running docker
@ 2018-08-20  9:35 Vignesh Raman
       [not found] ` <20180821012731.GB8562@jcartwri.amer.corp.natinst.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-20  9:35 UTC (permalink / raw)
  To: linux-rt-users

Hi,

I'm seeing a kernel freeze (and systemd reboots after a few seconds)
when Docker is running Kubernetes on the Debian 4.9 RT kernel
(4.9.0-6-rt-amd64).
I also tested with a 4.14-rt kernel and the issue was still seen.
kdump is not capturing the coredump in this case, so I'm unable to
check the crash logs.
The system crash logs are not available in /sys/fs/pstore either.

When a manual crash is triggered with sysrq, kdump captures the crash
logs. But this does not always work.
Sometimes the system does not reboot when a manual crash (sysrq) is
triggered, and it freezes instead.
Maybe the capture kernel itself hangs while taking the coredump.

When a manual crash is triggered with sysrq, the crash logs are available
in /sys/fs/pstore.

I'm able to reproduce a similar freeze/reset issue with the steps
mentioned here https://github.com/moby/moby/issues/19758.
Run "for f in $(seq 1 1000);do docker run -it --rm ubuntu echo $f;
done" concurrently in 3 terminals.
But I'm unable to confirm whether the two issues are related, because
no crash logs are available.
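
The three-terminal reproduction can also be scripted so that it is easier to repeat. The following is only a minimal sketch, assuming a POSIX shell: `run_workers` is a hypothetical helper name, and `-it` is dropped in the example because background jobs have no TTY, which may or may not affect how readily the freeze reproduces.

```shell
#!/bin/sh
# run_workers WORKERS ITERATIONS CMD [ARGS...]
# Hypothetical helper: starts WORKERS background loops, each running
# "CMD ARGS... <n>" for n = 1..ITERATIONS, then waits for all of them.
run_workers() {
    workers=$1
    iters=$2
    shift 2
    w=1
    while [ "$w" -le "$workers" ]; do
        (
            i=1
            while [ "$i" -le "$iters" ]; do
                "$@" "$i"
                i=$((i + 1))
            done
        ) &
        w=$((w + 1))
    done
    wait
}

# Example (assumes docker is installed and the ubuntu image is available):
#   run_workers 3 1000 docker run --rm ubuntu echo
```

Running the loops from a single script also makes it simple to change the number of parallel workers when trying to narrow down how much concurrency is needed.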

I enabled CPUAccounting/MemoryAccounting for docker.service, but the
system freeze was still seen:
sudo systemctl set-property docker.service MemoryAccounting=yes
sudo systemctl set-property docker.service CPUAccounting=yes

I was thinking of enabling CONFIG_RT_GROUP_SCHED (allocate real CPU
bandwidth to task groups). But this option is unavailable when
PREEMPT_RT_FULL is enabled.
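
Whether a given option is set in an installed kernel can be checked against its config file. A small sketch, where `kconfig_get` is a hypothetical helper name and the `/boot/config-$(uname -r)` path is the location Debian uses for the installed config:

```shell
# kconfig_get FILE SYMBOL
# Prints the value of SYMBOL from a kernel config file ("y", "m", a number
# or a string), or "n" if the symbol is absent or listed as "is not set".
kconfig_get() {
    file=$1
    sym=$2
    # Anchoring on "SYMBOL=" avoids matching longer names with the same prefix.
    val=$(sed -n "s/^${sym}=//p" "$file" | head -n 1)
    if [ -n "$val" ]; then
        printf '%s\n' "$val"
    else
        printf 'n\n'
    fi
}

# Example:
#   kconfig_get "/boot/config-$(uname -r)" CONFIG_RT_GROUP_SCHED
#   kconfig_get "/boot/config-$(uname -r)" CONFIG_PREEMPT_RT_FULL
```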

I'm unable to root-cause the freeze/crash because crash logs are not
available.
Is there any other way to capture the crash dump? Any pointers would be
helpful. Thanks.

Regards,
Vignesh


* Re: kernel freeze when running docker
       [not found]   ` <CAH3OF50F26ywGHZ2Rfa7v-SeoZ6KccSPt+1uNq65eBNTqom=-A@mail.gmail.com>
@ 2018-08-21 13:42     ` Julia Cartwright
  2018-08-22  8:08       ` Vignesh Raman
  0 siblings, 1 reply; 5+ messages in thread
From: Julia Cartwright @ 2018-08-21 13:42 UTC (permalink / raw)
  To: Vignesh Raman; +Cc: linux-rt-users

Hello Vignesh-

On Tue, Aug 21, 2018 at 07:03:20PM +0530, Vignesh Raman wrote:
> Hi Julia,
> 
> > On Tue, Aug 21, 2018 at 6:57 AM Julia Cartwright <julia@ni.com> wrote:
> > Serial console is really the best way to go, especially if the higher
> > level tools (crash, pstore, etc.) aren't functioning properly.  With the
> > hopes that you can get an oops or panic message...
> 
> Added 'debug systemd.journald.forward_to_console=1 pause_on_oops=120' to
> kernel boot parameters. When the freeze/reset happens with the docker test
> case, the kernel oops/panic is not seen in the serial console.

That's annoying.  When you are in this state can you trigger a dump via
the magic sysrq sequence over serial?

Have you tried enabling the various lockup detectors?  There is a
software-driven one, and an NMI watchdog based one.  Enabling both
at the same time should be fine.
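
For reference, the runtime state of both detectors can be read from sysctl files. This is a sketch rather than a definitive check: `show_watchdog` is a hypothetical helper name, and which of the listed files exist depends on the kernel version and on the lockup detector config options.

```shell
# show_watchdog [SYSCTL_DIR]
# Prints the current value of the lockup-detector sysctls found in
# SYSCTL_DIR (default /proc/sys/kernel), or "unavailable" for any that
# this kernel does not expose.
show_watchdog() {
    dir=${1:-/proc/sys/kernel}
    for knob in watchdog soft_watchdog nmi_watchdog watchdog_thresh; do
        if [ -r "$dir/$knob" ]; then
            printf '%s=%s\n' "$knob" "$(cat "$dir/$knob")"
        else
            printf '%s=unavailable\n' "$knob"
        fi
    done
}

# Example:
#   show_watchdog
```

Here `soft_watchdog` covers the software-driven detector and `nmi_watchdog` the NMI-based one; a value of 1 means the detector is enabled.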

   Julia


* Re: kernel freeze when running docker
  2018-08-21 13:42     ` Julia Cartwright
@ 2018-08-22  8:08       ` Vignesh Raman
       [not found]         ` <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>
  0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-22  8:08 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

Hi Julia,

On Tuesday 21 August 2018 07:12 PM, Julia Cartwright wrote:
> That's annoying.  When you are in this state can you trigger a dump via
> the magic sysrq sequence over serial?

No. I'm not able to trigger a crash using sysrq (Alt+SysRq+c). The
keyboard's Caps Lock LED does not respond when the Caps Lock key is pressed.

Does this indicate a soft lockup or a hard lockup?

> Have you tried enabling the various lockup detectors?  There is a
> software-driven one, and an NMI watchdog based one.  Enabling both
> at the same time should be fine.

The following config options are enabled:
CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
# Debug Lockups and Hangs
CONFIG_LOCKUP_DETECTOR=y
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
# CONFIG_BOOTPARAM_HARDLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=0
# CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0
CONFIG_HAVE_NMI=y
CONFIG_HAVE_PERF_EVENTS_NMI=y

The runtime equivalents of CONFIG_BOOTPARAM_HARDLOCKUP_PANIC and
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC are enabled using:
echo 1 > /proc/sys/kernel/softlockup_panic
echo 1 > /proc/sys/kernel/hardlockup_panic

In addition, the following parameters were set:
echo 1 > /proc/sys/kernel/hung_task_panic
echo 1 > /proc/sys/kernel/panic
echo 1 > /proc/sys/kernel/panic_on_io_nmi
echo 1 > /proc/sys/kernel/panic_on_oops
echo 1 > /proc/sys/kernel/panic_on_unrecovered_nmi
echo 1 > /proc/sys/kernel/unknown_nmi_panic
echo 1 > /proc/sys/vm/panic_on_oom
echo 1 > /proc/sys/kernel/sysrq
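
Since a write to one of these files can fail without being noticed (for example when the file does not exist on a given kernel), it can help to read each value back. A minimal sketch, with `set_knobs` as a hypothetical helper name:

```shell
# set_knobs ROOT KNOB...
# Writes 1 to each ROOT/KNOB and reads it back. Prints one status line per
# knob and returns non-zero if any knob is missing or did not take the value.
set_knobs() {
    root=$1
    shift
    rc=0
    for knob in "$@"; do
        path="$root/$knob"
        if [ -w "$path" ] && echo 1 > "$path" \
            && [ "$(cat "$path")" = "1" ]; then
            printf 'ok      %s\n' "$knob"
        else
            printf 'FAILED  %s\n' "$knob"
            rc=1
        fi
    done
    return "$rc"
}

# Example (as root):
#   set_knobs /proc/sys kernel/softlockup_panic kernel/hardlockup_panic \
#       kernel/hung_task_panic kernel/panic_on_oops
```

Two of the values in the list are not simple on/off switches: kernel/panic is a timeout, so 1 means "reboot one second after a panic", and kernel/sysrq set to 1 enables all SysRq functions. Values written under /proc/sys do not persist across a reboot.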

Thanks.

Regards,
Vignesh


* Re: kernel freeze when running docker
       [not found]         ` <20180822164235.GR1123@jcartwri.amer.corp.natinst.com>
@ 2018-08-24  6:41           ` Vignesh Raman
  2018-08-27 13:43             ` Vignesh Raman
  0 siblings, 1 reply; 5+ messages in thread
From: Vignesh Raman @ 2018-08-24  6:41 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

Hi Julia,

On Wednesday 22 August 2018 10:12 PM, Julia Cartwright wrote:
> I never explicitly asked...have you tried a similar test on a non-RT
> kernel?  Does it exhibit the same behavior?  The linked moby github
> issue would seem to indicate this isn't RT specific...

Yes, I tried the tests on a non-RT kernel and the issue was not reproduced
in 14 hours of testing with Docker running Kubernetes. I also ran the
test case mentioned in the moby GitHub issue and the kernel freeze
was not seen. Config file is https://paste.debian.net/1039104/

root@debian:~# uname -a
Linux debian 4.9.0-6-amd64 #1 SMP Debian 4.9.82-1+deb9u3 (2018-03-02)
x86_64 GNU/Linux

> Can you send your full config?  If I get some time to try it, I'd like
> to see if I can reproduce your issue.

Thanks. Config file is https://paste.debian.net/1039103/ (output of
'grep "=[y|m]" config-4.14.59-rt37')
This issue was also reproduced with the 4.9.0-6-rt-amd64 kernel.

root@debian:~# uname -a
Linux debian 4.14.59-rt37 #1 SMP PREEMPT RT Fri Aug 24 00:31:29 UTC 2018
x86_64 GNU/Linux

Below are the steps to reproduce the issue.
Install Docker using:
curl https://raw.githubusercontent.com/rancher/install-docker/master/17.03.2.sh | /bin/bash

Run
"for f in $(seq 1 10000);do docker run -it --rm ubuntu echo $f; done"
concurrently in 3 terminals.

The freeze/lockup occurs sporadically.

The other test case is to run docker with kubernetes.

Regards,
Vignesh


* Re: kernel freeze when running docker
  2018-08-24  6:41           ` Vignesh Raman
@ 2018-08-27 13:43             ` Vignesh Raman
  0 siblings, 0 replies; 5+ messages in thread
From: Vignesh Raman @ 2018-08-27 13:43 UTC (permalink / raw)
  To: Julia Cartwright; +Cc: linux-rt-users

Hi Julia,

I enabled the Linux Kernel Dump Test Tool (CONFIG_LKDTM=y) and triggered
a soft and a hard lockup using:
echo SOFTLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

With this, the kernel panic/oops message is seen on the serial console 2
out of 5 times (in both the soft and hard lockup cases). In /sys/fs/pstore
the kernel panic messages are not captured at all.

If I try with,
echo PANIC > /sys/kernel/debug/provoke-crash/DIRECT

the kernel panic/oops message is always dumped on the serial console and
the crash logs are always captured in /sys/fs/pstore.
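
To repeat these runs consistently, the trigger and the post-reboot pstore check can be wrapped in two small functions. This is a sketch under stated assumptions: `lkdtm_trigger` and `pstore_records` are hypothetical helper names, and the list of accepted crash types is deliberately limited to the three used in this message (LKDTM offers others).

```shell
# lkdtm_trigger TYPE [TRIGGER_FILE]
# Writes TYPE to the LKDTM trigger file (default: the debugfs DIRECT file).
# Only the crash types used in this thread are accepted.
lkdtm_trigger() {
    crash=$1
    file=${2:-/sys/kernel/debug/provoke-crash/DIRECT}
    case $crash in
        PANIC|SOFTLOCKUP|HARDLOCKUP) ;;
        *) echo "unsupported crash type: $crash" >&2; return 2 ;;
    esac
    if [ ! -w "$file" ]; then
        echo "cannot write $file (is debugfs mounted?)" >&2
        return 1
    fi
    sync    # flush dirty data before the machine goes down
    printf '%s\n' "$crash" > "$file"
}

# pstore_records [DIR]
# After the reboot, lists whatever records pstore kept (nothing if empty).
pstore_records() {
    dir=${1:-/sys/fs/pstore}
    ls -1 "$dir" 2>/dev/null
}

# Example (as root; this will crash the machine):
#   lkdtm_trigger SOFTLOCKUP
#   ... after reboot ...
#   pstore_records
```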

It looks like, with the RT kernel, the dump messages are sometimes not
captured during a soft/hard lockup. I'm not sure if this is related to
the issue I'm seeing (kernel panic logs not seen on serial with the
docker test case).

I have not performed these tests on a non-RT kernel.

Thanks.

Regards,
Vignesh


