From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: lkp@lists.01.org
Subject: Re: [printk] 34578dc67f: EIP is at vprintk_emit+0x1ea/0x600
Date: Thu, 25 Feb 2016 14:10:05 +0900 [thread overview]
Message-ID: <20160225051005.GA884@swordfish> (raw)
In-Reply-To: <20160224151242.GA2254@swordfish>
[-- Attachment #1: Type: text/plain, Size: 7560 bytes --]
Hello,
On (02/25/16 00:12), Sergey Senozhatsky wrote:
[..]
> > > I want to be sure that the patch in printk() did not introduce
> > > a deadlock that is visible only under a high printk load.
>
> I'll do more tests, certainly.
No behavioral change between linux-next and linux-next modulo printk patches was observed.
next-20160224
git revert 8a30bf8ff5e07a589ad43363b06ea361ad723a0e 2822460a0ea97656ae71b2b4ddac087f79c5876a b6751672d3ac1d3d876fd5ea7ec579956eee4d6b
set PREEMPT_NONE
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
add PANIC of softlockup and panic on hung_task
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y <<
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=11
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y <<
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
CONFIG_WQ_WATCHDOG=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=23
modify watchdog timeout (so it detects lockup sooner) and add
printk() flood -- the code does not even disable preemption,
just busy looping, we have PREEMPT_NONE kernel.
git diff
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 46055db..8759dbb 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1410,6 +1410,21 @@ static int __init zram_init(void)
{
int ret;
+ /*unsigned long flags;*/
+ u64 s, e;
+
+ /*preempt_disable();*/
+ s = local_clock() >> 31UL;
+ while (1) {
+ e = local_clock() >> 31UL;
+ if (e - s > 10)
+ goto out;
+ pr_err(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
+ }
+out:
+ /*preempt_enable();*/
+
+
ret = class_register(&zram_control_class);
if (ret) {
pr_err("Unable to register zram-control class\n");
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b3ace6e..8655364 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -54,7 +54,7 @@ static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
int __read_mostly nmi_watchdog_enabled;
int __read_mostly soft_watchdog_enabled;
int __read_mostly watchdog_user_enabled;
-int __read_mostly watchdog_thresh = 10;
+int __read_mostly watchdog_thresh = 1;
#ifdef CONFIG_SMP
int __read_mostly sysctl_softlockup_all_cpu_backtrace;
---
The test (unpatched linux-next)
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 3s! [modprobe:418]
Modules linked in: ...
irq event stamp: 40679
hardirqs last enabled at (40679): [<ffffffff8108b771>] vprintk_emit+0x3f0/0x499
hardirqs last disabled at (40678): [<ffffffff8108b3d4>] vprintk_emit+0x53/0x499
softirqs last enabled at (35360): [<ffffffff810475e9>] __do_softirq+0x2d8
softirqs last disabled at (35357): [<ffffffff810478ce>] irq_exit+0x41/0x95
CPU: 2 PID: 418 Comm: modprobe Tainted: G O 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
task: ffff88013290ce00 ti: ffff880131f98000 task.ti: ffff880131f98000
RIP: 0010:[<ffffffff8108b2d8>] [<ffffffff8108b2d8>] console_unlock+0x3ad/0x456
RSP: 0018:ffff880131f9bbc8 EFLAGS: 00000246
RAX: ffff88013290ce00 RBX: ffffffff82891fdc RCX: 00000000ffffffff
RDX: ffffffff812c99a8 RSI: 0000000000000001 RDI: ffffffff8108b2d5
RBP: ffff880131f9bc08 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffff00ff R11: 000000000000042b R12: 0000000000000000
R13: 0000000000000246 R14: 0000000000000003 R15: 0000000000000001
S: 00007fe6ed4fd700(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6eb905000 CR3: 00000000bb7fc000 CR4: 00000000000006e0
Stack:
0000000000000000 0000000000000029 000000000001a128 00000000ffffffff
0000000000000003 0000000000000028 ffffffff82877602 0000000000000028
ffff880131f9bc60 ffffffff8108b7cb 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 89 05 11 cc 80 01 4c 89 35 0e cc 80 01 88 d0 83 e2 fe 83 e0 1e 88 53 0f 0f b6 c0 89 05 f2 cb 80 01 eb 9e e8 9e 5a ff ff 41 55 9d <e9> ed fe ff ff 48 83 3d 3b cd 80 01 00 c7 05 3d cd 80 01 00 00
Kernel panic - not syncing: softlockup: hung tasks
CPU: 2 PID: 418 Comm: modprobe Tainted: G O L 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
0000000000000000 ffff880137d03d68 ffffffff8123334d ffffffff817ad7c7
ffff880137d03df8 ffff880137d03de8 ffffffff810f124e ffffffff00000008
ffff880137d03df8 ffff880137d03d90 0000000000000000 00000000ffffffff
Call Trace:
<IRQ> [<ffffffff8123334d>] dump_stack+0x67/0x90
[<ffffffff810f124e>] panic+0xd1/0x212
[<ffffffff8107d1c1>] ? down_trylock+0x2d/0x37
[<ffffffff810c12c5>] watchdog_timer_fn+0x18e/0x1b0
[<ffffffff8109cee9>] __hrtimer_run_queues+0x1ee/0x418
[<ffffffff810c1137>] ? __touch_watchdog+0x1c/0x1c
[<ffffffff8109d2af>] hrtimer_interrupt+0xa8/0x192
[<ffffffff810354cd>] hpet_interrupt_handler+0x2a/0x31
[<ffffffff8108ccd8>] handle_irq_event_percpu+0xd2/0x2e4
[<ffffffff8108cf22>] handle_irq_event+0x38/0x56
[<ffffffff8108fc63>] handle_edge_irq+0xd9/0xfb
[<ffffffff810174ad>] handle_irq+0x1c/0x20
[<ffffffff81510795>] do_IRQ+0x85/0x1
[<ffffffff8150ee86>] common_interrupt+0x86/0x86
<EOI> [<ffffffff812c99a8>] ? vt_console_print+0x2f9/0x308
[<ffffffff8108b2d5>] ? console_unlock+0x3aa/0x456
[<ffffffff8108b2d8>] ? console_unlock+0x3ad/0x456
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16c
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
so in this test printk()->console_unlock() was performed on the same CPU.
but in general those can be different CPUs. one busy spinning doing printk(),
the other busy spinning doing console_unlock()->call_console_drivers().
-ss
WARNING: multiple messages have this Message-ID (diff)
From: Sergey Senozhatsky <sergey.senozhatsky.work-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>
To: Petr Mladek <pmladek-IBi9RG/b67k@public.gmane.org>
Cc: Sergey Senozhatsky
<sergey.senozhatsky-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Sergey Senozhatsky
<sergey.senozhatsky.work-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
kernel test robot
<ying.huang-VuQAYsv1563Yd54FQh9/CA@public.gmane.org>,
lkp-JC7UmRfGjtg@public.gmane.org,
Rob Herring <robh+dt-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Frank Rowand
<frowand.list-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org>,
Grant Likely
<grant.likely-QSEj5FYQhm4dnm+yROfE0A@public.gmane.org>,
devicetree-u79uwXL29TY76Z2rM5mHXA@public.gmane.org,
LKML <linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org>,
Andrew Morton
<akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>,
Calvin Owens <calvinowens-b10kYP2dOMg@public.gmane.org>,
Dave Jones
<davej-rdkfGonbjUTCLXcRTR1eJlpr/1R2p/CL@public.gmane.org>,
Kyle McMartin <kyle-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Tejun Heo <tj-DgEjT+Ai2ygdnm+yROfE0A@public.gmane.org>,
Jan Kara <jack-IBi9RG/b67k@public.gmane.org>,
Stephen Rothwell <sfr-3FnU+UHB4dNDw9hX6IcOSA@public.gmane.org>
Subject: Re: [lkp] [printk] 34578dc67f: EIP is at vprintk_emit+0x1ea/0x600
Date: Thu, 25 Feb 2016 14:10:05 +0900 [thread overview]
Message-ID: <20160225051005.GA884@swordfish> (raw)
In-Reply-To: <20160224151242.GA2254@swordfish>
Hello,
On (02/25/16 00:12), Sergey Senozhatsky wrote:
[..]
> > > I want to be sure that the patch in printk() did not introduce
> > > a deadlock that is visible only under a high printk load.
>
> I'll do more tests, certainly.
No behavioral change between linux-next and linux-next modulo printk patches was observed.
next-20160224
git revert 8a30bf8ff5e07a589ad43363b06ea361ad723a0e 2822460a0ea97656ae71b2b4ddac087f79c5876a b6751672d3ac1d3d876fd5ea7ec579956eee4d6b
set PREEMPT_NONE
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
add PANIC of softlockup and panic on hung_task
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y <<
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=11
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y <<
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
CONFIG_WQ_WATCHDOG=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=23
modify watchdog timeout (so it detects lockup sooner) and add
printk() flood -- the code does not even disable preemption,
just busy looping, we have PREEMPT_NONE kernel.
git diff
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 46055db..8759dbb 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1410,6 +1410,21 @@ static int __init zram_init(void)
{
int ret;
+ /*unsigned long flags;*/
+ u64 s, e;
+
+ /*preempt_disable();*/
+ s = local_clock() >> 31UL;
+ while (1) {
+ e = local_clock() >> 31UL;
+ if (e - s > 10)
+ goto out;
+ pr_err(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
+ }
+out:
+ /*preempt_enable();*/
+
+
ret = class_register(&zram_control_class);
if (ret) {
pr_err("Unable to register zram-control class\n");
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b3ace6e..8655364 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -54,7 +54,7 @@ static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
int __read_mostly nmi_watchdog_enabled;
int __read_mostly soft_watchdog_enabled;
int __read_mostly watchdog_user_enabled;
-int __read_mostly watchdog_thresh = 10;
+int __read_mostly watchdog_thresh = 1;
#ifdef CONFIG_SMP
int __read_mostly sysctl_softlockup_all_cpu_backtrace;
---
The test (unpatched linux-next)
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 3s! [modprobe:418]
Modules linked in: ...
irq event stamp: 40679
hardirqs last enabled at (40679): [<ffffffff8108b771>] vprintk_emit+0x3f0/0x499
hardirqs last disabled at (40678): [<ffffffff8108b3d4>] vprintk_emit+0x53/0x499
softirqs last enabled at (35360): [<ffffffff810475e9>] __do_softirq+0x2d8
softirqs last disabled at (35357): [<ffffffff810478ce>] irq_exit+0x41/0x95
CPU: 2 PID: 418 Comm: modprobe Tainted: G O 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
task: ffff88013290ce00 ti: ffff880131f98000 task.ti: ffff880131f98000
RIP: 0010:[<ffffffff8108b2d8>] [<ffffffff8108b2d8>] console_unlock+0x3ad/0x456
RSP: 0018:ffff880131f9bbc8 EFLAGS: 00000246
RAX: ffff88013290ce00 RBX: ffffffff82891fdc RCX: 00000000ffffffff
RDX: ffffffff812c99a8 RSI: 0000000000000001 RDI: ffffffff8108b2d5
RBP: ffff880131f9bc08 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffff00ff R11: 000000000000042b R12: 0000000000000000
R13: 0000000000000246 R14: 0000000000000003 R15: 0000000000000001
S: 00007fe6ed4fd700(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6eb905000 CR3: 00000000bb7fc000 CR4: 00000000000006e0
Stack:
0000000000000000 0000000000000029 000000000001a128 00000000ffffffff
0000000000000003 0000000000000028 ffffffff82877602 0000000000000028
ffff880131f9bc60 ffffffff8108b7cb 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 89 05 11 cc 80 01 4c 89 35 0e cc 80 01 88 d0 83 e2 fe 83 e0 1e 88 53 0f 0f b6 c0 89 05 f2 cb 80 01 eb 9e e8 9e 5a ff ff 41 55 9d <e9> ed fe ff ff 48 83 3d 3b cd 80 01 00 c7 05 3d cd 80 01 00 00
Kernel panic - not syncing: softlockup: hung tasks
CPU: 2 PID: 418 Comm: modprobe Tainted: G O L 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
0000000000000000 ffff880137d03d68 ffffffff8123334d ffffffff817ad7c7
ffff880137d03df8 ffff880137d03de8 ffffffff810f124e ffffffff00000008
ffff880137d03df8 ffff880137d03d90 0000000000000000 00000000ffffffff
Call Trace:
<IRQ> [<ffffffff8123334d>] dump_stack+0x67/0x90
[<ffffffff810f124e>] panic+0xd1/0x212
[<ffffffff8107d1c1>] ? down_trylock+0x2d/0x37
[<ffffffff810c12c5>] watchdog_timer_fn+0x18e/0x1b0
[<ffffffff8109cee9>] __hrtimer_run_queues+0x1ee/0x418
[<ffffffff810c1137>] ? __touch_watchdog+0x1c/0x1c
[<ffffffff8109d2af>] hrtimer_interrupt+0xa8/0x192
[<ffffffff810354cd>] hpet_interrupt_handler+0x2a/0x31
[<ffffffff8108ccd8>] handle_irq_event_percpu+0xd2/0x2e4
[<ffffffff8108cf22>] handle_irq_event+0x38/0x56
[<ffffffff8108fc63>] handle_edge_irq+0xd9/0xfb
[<ffffffff810174ad>] handle_irq+0x1c/0x20
[<ffffffff81510795>] do_IRQ+0x85/0x1
[<ffffffff8150ee86>] common_interrupt+0x86/0x86
<EOI> [<ffffffff812c99a8>] ? vt_console_print+0x2f9/0x308
[<ffffffff8108b2d5>] ? console_unlock+0x3aa/0x456
[<ffffffff8108b2d8>] ? console_unlock+0x3ad/0x456
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16c
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
so in this test printk()->console_unlock() was performed on the same CPU.
but in general those can be different CPUs. one busy spinning doing printk(),
the other busy spinning doing console_unlock()->call_console_drivers().
-ss
--
To unsubscribe from this list: send the line "unsubscribe devicetree" in
the body of a message to majordomo-u79uwXL29TY76Z2rM5mHXA@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
WARNING: multiple messages have this Message-ID (diff)
From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>,
kernel test robot <ying.huang@linux.intel.com>,
lkp@01.org, Rob Herring <robh+dt@kernel.org>,
Frank Rowand <frowand.list@gmail.com>,
Grant Likely <grant.likely@linaro.org>,
devicetree@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
Calvin Owens <calvinowens@fb.com>,
Dave Jones <davej@codemonkey.org.uk>,
Kyle McMartin <kyle@kernel.org>, Tejun Heo <tj@kernel.org>,
Jan Kara <jack@suse.com>, Stephen Rothwell <sfr@canb.auug.org.au>
Subject: Re: [lkp] [printk] 34578dc67f: EIP is at vprintk_emit+0x1ea/0x600
Date: Thu, 25 Feb 2016 14:10:05 +0900 [thread overview]
Message-ID: <20160225051005.GA884@swordfish> (raw)
In-Reply-To: <20160224151242.GA2254@swordfish>
Hello,
On (02/25/16 00:12), Sergey Senozhatsky wrote:
[..]
> > > I want to be sure that the patch in printk() did not introduce
> > > a deadlock that is visible only under a high printk load.
>
> I'll do more tests, certainly.
No behavioral change between linux-next and linux-next modulo printk patches was observed.
next-20160224
git revert 8a30bf8ff5e07a589ad43363b06ea361ad723a0e 2822460a0ea97656ae71b2b4ddac087f79c5876a b6751672d3ac1d3d876fd5ea7ec579956eee4d6b
set PREEMPT_NONE
CONFIG_PREEMPT_NONE=y
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT is not set
CONFIG_PREEMPT_COUNT=y
add PANIC of softlockup and panic on hung_task
CONFIG_LOCKUP_DETECTOR=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y <<
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1
CONFIG_DETECT_HUNG_TASK=y
CONFIG_DEFAULT_HUNG_TASK_TIMEOUT=11
CONFIG_BOOTPARAM_HUNG_TASK_PANIC=y <<
CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=1
CONFIG_WQ_WATCHDOG=y
CONFIG_PANIC_ON_OOPS=y
CONFIG_PANIC_ON_OOPS_VALUE=1
CONFIG_PANIC_TIMEOUT=23
modify watchdog timeout (so it detects lockup sooner) and add
printk() flood -- the code does not even disable preemption,
just busy looping, we have PREEMPT_NONE kernel.
git diff
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index 46055db..8759dbb 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1410,6 +1410,21 @@ static int __init zram_init(void)
{
int ret;
+ /*unsigned long flags;*/
+ u64 s, e;
+
+ /*preempt_disable();*/
+ s = local_clock() >> 31UL;
+ while (1) {
+ e = local_clock() >> 31UL;
+ if (e - s > 10)
+ goto out;
+ pr_err(">>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>\n");
+ }
+out:
+ /*preempt_enable();*/
+
+
ret = class_register(&zram_control_class);
if (ret) {
pr_err("Unable to register zram-control class\n");
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b3ace6e..8655364 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -54,7 +54,7 @@ static unsigned long __read_mostly watchdog_enabled = SOFT_WATCHDOG_ENABLED;
int __read_mostly nmi_watchdog_enabled;
int __read_mostly soft_watchdog_enabled;
int __read_mostly watchdog_user_enabled;
-int __read_mostly watchdog_thresh = 10;
+int __read_mostly watchdog_thresh = 1;
#ifdef CONFIG_SMP
int __read_mostly sysctl_softlockup_all_cpu_backtrace;
---
The test (unpatched linux-next)
NMI watchdog: BUG: soft lockup - CPU#2 stuck for 3s! [modprobe:418]
Modules linked in: ...
irq event stamp: 40679
hardirqs last enabled at (40679): [<ffffffff8108b771>] vprintk_emit+0x3f0/0x499
hardirqs last disabled at (40678): [<ffffffff8108b3d4>] vprintk_emit+0x53/0x499
softirqs last enabled at (35360): [<ffffffff810475e9>] __do_softirq+0x2d8
softirqs last disabled at (35357): [<ffffffff810478ce>] irq_exit+0x41/0x95
CPU: 2 PID: 418 Comm: modprobe Tainted: G O 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
task: ffff88013290ce00 ti: ffff880131f98000 task.ti: ffff880131f98000
RIP: 0010:[<ffffffff8108b2d8>] [<ffffffff8108b2d8>] console_unlock+0x3ad/0x456
RSP: 0018:ffff880131f9bbc8 EFLAGS: 00000246
RAX: ffff88013290ce00 RBX: ffffffff82891fdc RCX: 00000000ffffffff
RDX: ffffffff812c99a8 RSI: 0000000000000001 RDI: ffffffff8108b2d5
RBP: ffff880131f9bc08 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffff00ff R11: 000000000000042b R12: 0000000000000000
R13: 0000000000000246 R14: 0000000000000003 R15: 0000000000000001
S: 00007fe6ed4fd700(0000) GS:ffff880137d00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fe6eb905000 CR3: 00000000bb7fc000 CR4: 00000000000006e0
Stack:
0000000000000000 0000000000000029 000000000001a128 00000000ffffffff
0000000000000003 0000000000000028 ffffffff82877602 0000000000000028
ffff880131f9bc60 ffffffff8108b7cb 0000000000000000 0000000000000000
Call Trace:
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
Code: 89 05 11 cc 80 01 4c 89 35 0e cc 80 01 88 d0 83 e2 fe 83 e0 1e 88 53 0f 0f b6 c0 89 05 f2 cb 80 01 eb 9e e8 9e 5a ff ff 41 55 9d <e9> ed fe ff ff 48 83 3d 3b cd 80 01 00 c7 05 3d cd 80 01 00 00
Kernel panic - not syncing: softlockup: hung tasks
CPU: 2 PID: 418 Comm: modprobe Tainted: G O L 4.5.0-rc5-next-20160224-dbg-00012-gfedb791-dirty #187
Hardware name: ...
0000000000000000 ffff880137d03d68 ffffffff8123334d ffffffff817ad7c7
ffff880137d03df8 ffff880137d03de8 ffffffff810f124e ffffffff00000008
ffff880137d03df8 ffff880137d03d90 0000000000000000 00000000ffffffff
Call Trace:
<IRQ> [<ffffffff8123334d>] dump_stack+0x67/0x90
[<ffffffff810f124e>] panic+0xd1/0x212
[<ffffffff8107d1c1>] ? down_trylock+0x2d/0x37
[<ffffffff810c12c5>] watchdog_timer_fn+0x18e/0x1b0
[<ffffffff8109cee9>] __hrtimer_run_queues+0x1ee/0x418
[<ffffffff810c1137>] ? __touch_watchdog+0x1c/0x1c
[<ffffffff8109d2af>] hrtimer_interrupt+0xa8/0x192
[<ffffffff810354cd>] hpet_interrupt_handler+0x2a/0x31
[<ffffffff8108ccd8>] handle_irq_event_percpu+0xd2/0x2e4
[<ffffffff8108cf22>] handle_irq_event+0x38/0x56
[<ffffffff8108fc63>] handle_edge_irq+0xd9/0xfb
[<ffffffff810174ad>] handle_irq+0x1c/0x20
[<ffffffff81510795>] do_IRQ+0x85/0x1
[<ffffffff8150ee86>] common_interrupt+0x86/0x86
<EOI> [<ffffffff812c99a8>] ? vt_console_print+0x2f9/0x308
[<ffffffff8108b2d5>] ? console_unlock+0x3aa/0x456
[<ffffffff8108b2d8>] ? console_unlock+0x3ad/0x456
[<ffffffff8108b7cb>] vprintk_emit+0x44a/0x499
[<ffffffffa01f0000>] ? 0xffffffffa01f0000
[<ffffffff8108b931>] vprintk_default+0x1d/0x1f
[<ffffffff810f1b21>] printk+0x48/0x50
[<ffffffffa01f0032>] zram_init+0x32/0x1000 [zram]
[<ffffffff81000402>] do_one_initcall+0xe7/0x16c
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff81093619>] ? rcu_read_lock_sched_held+0x61/0x69
[<ffffffff811330fa>] ? kmem_cache_alloc_trace+0x138/0x1d0
[<ffffffff810f1d5f>] ? do_init_module+0x27/0x1d9
[<ffffffff810f1d97>] do_init_module+0x5f/0x1d9
[<ffffffff810b2b2c>] load_module+0x19da/0x1eb8
[<ffffffff810830c4>] ? lock_acquire+0x10f/0x1a3
[<ffffffff810b3109>] SyS_init_module+0xff/0x125
[<ffffffff810b3109>] ? SyS_init_module+0xff/0x125
[<ffffffff8150e525>] entry_SYSCALL_64_fastpath+0x18/0xa8
so in this test printk()->console_unlock() was performed on the same CPU.
but in general those can be different CPUs. one busy spinning doing printk(),
the other busy spinning doing console_unlock()->call_console_drivers().
-ss
next prev parent reply other threads:[~2016-02-25 5:10 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20160223131538.GA2048@swordfish>
2016-02-23 15:53 ` [printk] 34578dc67f: EIP is at vprintk_emit+0x1ea/0x600 Sergey Senozhatsky
2016-02-23 15:53 ` [lkp] " Sergey Senozhatsky
2016-02-24 1:19 ` Sergey Senozhatsky
2016-02-24 1:19 ` [lkp] " Sergey Senozhatsky
2016-02-24 1:19 ` Sergey Senozhatsky
2016-02-24 11:46 ` Petr Mladek
2016-02-24 11:46 ` [lkp] " Petr Mladek
2016-02-24 11:46 ` Petr Mladek
2016-02-24 12:50 ` Sergey Senozhatsky
2016-02-24 12:50 ` [lkp] " Sergey Senozhatsky
2016-02-24 12:50 ` Sergey Senozhatsky
2016-02-24 15:12 ` Sergey Senozhatsky
2016-02-24 15:12 ` [lkp] " Sergey Senozhatsky
2016-02-25 5:10 ` Sergey Senozhatsky [this message]
2016-02-25 5:10 ` Sergey Senozhatsky
2016-02-25 5:10 ` Sergey Senozhatsky
2016-03-02 10:30 ` Petr Mladek
2016-03-02 10:30 ` [lkp] " Petr Mladek
2016-03-02 10:41 ` Sergey Senozhatsky
2016-03-02 10:41 ` [lkp] " Sergey Senozhatsky
2016-02-23 0:55 kernel test robot
2016-02-23 0:55 ` [lkp] " kernel test robot
2016-02-23 16:55 ` Petr Mladek
2016-02-23 16:55 ` [lkp] " Petr Mladek
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160225051005.GA884@swordfish \
--to=sergey.senozhatsky.work@gmail.com \
--cc=lkp@lists.01.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.