From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>
Cc: rostedt@goodmis.org, linux-kernel@vger.kernel.org,
akpm@linux-foundation.org, linux-mm@kvack.org,
xiyou.wangcong@gmail.com, dave.hansen@intel.com,
hannes@cmpxchg.org, mgorman@suse.de, mhocko@kernel.org,
pmladek@suse.com, sergey.senozhatsky@gmail.com, vbabka@suse.cz
Subject: Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes
Date: Tue, 7 Nov 2017 10:40:15 +0900
Message-ID: <20171107014015.GA1822@jagdpanzerIV>
In-Reply-To: <201711062106.ADI34320.JFtOFFHOOQVLSM@I-love.SAKURA.ne.jp>
On (11/06/17 21:06), Tetsuo Handa wrote:
> I tried your patch with warn_alloc() torture. It did not cause lockups.
> But I felt that possibility of failing to flush last second messages (such
> as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?
do I understand it correctly that there are "lost messages"?
sysrq-b does an immediate emergency reboot. "normally" it's not expected
to flush any pending logbuf messages because it's an emergency-reboot...
but in fact it does. and this is why sysrq-b is not 100% reliable:
    __handle_sysrq()
    {
        pr_info("SysRq : ");
        op_p = __sysrq_get_key_op(key);
        pr_cont("%s\n", op_p->action_msg);
        op_p->handler(key);
        pr_cont("\n");
    }
those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
badly the system is screwed. if pr_info() deadlocks, then we never
go to op_p->handler(key)->emergency_restart(). even if you suppress
printing of info loglevel messages, pr_info() still goes to
console_unlock() and prints [console_seq, log_next_seq] messages,
if there are any.
there is, however, a subtle behaviour change, I think.
previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
would flush logbuf messages. now we have that "break out of console_unlock()
loop even though there are pending messages, there is another CPU doing
printk()". so sysrq-b, instead of looping in console_unlock(), goes directly
to emergency_restart(). without the change it would have continued looping
in console_unlock() and would have called emergency_restart() only when
"console_seq == log_next_seq".
now... the "subtle" part here is that we had that thing:
    - *IF* __handle_sysrq() grabs the console_sem then it will not
      return from console_unlock() until logbuf is empty. so
      concurrent printk() messages won't get lost.

what we have now is:

    - if there are concurrent printk() then __handle_sysrq() does not
      fully flush the logbuf *even* if it grabbed the console_sem.
> ---------- vmcore-dmesg start ----------
> [ 169.016198] postgres cpuset=
> [ 169.032544] filemap_fault+0x311/0x790
> [ 169.047745] /
> [ 169.047780] mems_allowed=0
> [ 169.050577] ? xfs_ilock+0x126/0x1a0 [xfs]
> [ 169.062769] mems_allowed=0
> [ 169.065754] ? down_read_nested+0x3a/0x60
> [ 169.065783] ? xfs_ilock+0x126/0x1a0 [xfs]
> [ 189.700206] sysrq: SysRq :
> [ 189.700639] __xfs_filemap_fault.isra.19+0x3f/0xe0 [xfs]
> [ 189.700799] xfs_filemap_fault+0xb/0x10 [xfs]
> [ 189.703981] Trigger a crash
> [ 189.707032] __do_fault+0x19/0xa0
> [ 189.710008] BUG: unable to handle kernel
> [ 189.713387] __handle_mm_fault+0xbb3/0xda0
> [ 189.716473] NULL pointer dereference
> [ 189.719674] handle_mm_fault+0x14f/0x300
> [ 189.722969] at (null)
> [ 189.722974] IP: sysrq_handle_crash+0x3b/0x70
> [ 189.726156] ? handle_mm_fault+0x39/0x300
> [ 189.729537] PGD 1170dc067
> [ 189.732841] __do_page_fault+0x23e/0x4f0
> [ 189.735876] P4D 1170dc067
> [ 189.739171] do_page_fault+0x30/0x80
> [ 189.742323] PUD 1170dd067
> [ 189.745437] page_fault+0x22/0x30
> [ 189.748329] PMD 0
> [ 189.751106] RIP: 0033:0x650390
> [ 189.756583] RSP: 002b:00007fffef6b1568 EFLAGS: 00010246
> [ 189.759574] Oops: 0002 [#1] SMP DEBUG_PAGEALLOC
> [ 189.762607] RAX: 0000000000000000 RBX: 00007fffef6b1594 RCX: 00007fae949caa20
> [ 189.765665] Modules linked in:
> [ 189.768423] RDX: 0000000000000008 RSI: 0000000000000000 RDI: 0000000000000000
> [ 189.768425] RBP: 00007fffef6b1590 R08: 0000000000000002 R09: 0000000000000010
> [ 189.771478] ip6t_rpfilter
> [ 189.774297] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
> [ 189.777016] ipt_REJECT
> [ 189.779366] R13: 0000000000000000 R14: 00007fae969787e0 R15: 0000000000000004
> [ 189.782114] nf_reject_ipv4
> [ 189.784839] CPU: 7 PID: 6959 Comm: sleep Not tainted 4.14.0-rc8+ #302
> [ 189.785113] Mem-Info:
> ---------- vmcore-dmesg end ----------
hm... wondering if this is a regression.
-ss