From: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com>
To: rostedt@goodmis.org
Cc: Tejun Heo <tj@kernel.org>,
Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>,
linux-kernel@vger.kernel.org, akpm@linux-foundation.org,
linux-mm@kvack.org, xiyou.wangcong@gmail.com,
dave.hansen@intel.com, hannes@cmpxchg.org, mgorman@suse.de,
mhocko@kernel.org, pmladek@suse.com,
sergey.senozhatsky@gmail.com, vbabka@suse.cz
Subject: Re: [PATCH v3] printk: Add console owner and waiter logic to load balance console writes
Date: Wed, 8 Nov 2017 14:19:55 +0900 [thread overview]
Message-ID: <20171108051955.GA468@jagdpanzerIV> (raw)
In-Reply-To: <20171107014015.GA1822@jagdpanzerIV>
(Ccing Tejun)
On (11/07/17 10:40), Sergey Senozhatsky wrote:
> On (11/06/17 21:06), Tetsuo Handa wrote:
> > I tried your patch with warn_alloc() torture. It did not cause lockups.
> > But I felt that possibility of failing to flush last second messages (such
> > as SysRq-c or SysRq-b) to consoles has increased. Is this psychological?
>
> do I understand it correctly that there are "lost messages"?
>
> sysrq-b does an immediate emergency reboot. "normally" it's not expected
> to flush any pending logbuf messages because it's an emergency-reboot...
> but in fact it does. and this is why sysrq-b is not 100% reliable:
>
> __handle_sysrq()
> {
> pr_info("SysRq : ");
>
> op_p = __sysrq_get_key_op(key);
> pr_cont("%s\n", op_p->action_msg);
>
> op_p->handler(key);
>
> pr_cont("\n");
> }
>
> those pr_info()/pr_cont() calls can spoil sysrq-b, depending on how
> badly the system is screwed. if pr_info() deadlocks, then we never
> go to op_p->handler(key)->emergency_restart(). even if you suppress
> printing of info loglevel messages, pr_info() still goes to
> console_unlock() and prints [console_seq, log_next_seq] messages,
> if there any.
>
> there is, however, a subtle behaviour change, I think.
>
> previously, in some cases [?], pr_info("SysRq : ") from __handle_sysrq()
> would flush logbuf messages. now we have that "break out of console_unlock()
> loop even though there are pending messages, there is another CPU doing
> printk()". so sysrb-b instead of looping in console_unlock() goes directly
> to emergency_restart(). without the change it would have continued looping
> in console_unlock() and would have called emergency_restart() only when
> "console_seq == log_next_seq".
>
> now... the "subtle" part here is that we had that thing:
> - *IF* __handle_sysrq() grabs the console_sem then it will not
> return from console_unlock() until logbuf is empty. so
> concurrent printk() messages won't get lost.
>
> what we have now is:
> - if there are concurrent printk() then __handle_sysrq() does not
> fully flush the logbuf *even* if it grabbed the console_sem.
the change goes further. I did express some of my concerns during the KS,
I'll just bring them to the list.
we now always shift printing from a save - scheduleable - context to
a potentially unsafe one - atomic. by example:
CPU0 CPU1~CPU10 CPU11
console_lock()
printk();
console_unlock() IRQ
set console_owner printk()
sees console_owner
set console_waiter
sees console_waiter
break
console_unlock()
^^^^ lockup [?]
so we are forcibly moving console_unlock() from safe CPU0 to unsafe CPU11.
previously we would continue printing from a schedulable context.
another case. bare with me.
suppose that call_console_drivers() is slower than printk() -> log_store(),
which is often the case.
now assume the following:
CPU0 CPU1
IRQ IRQ
printk() printk()
printk() printk()
printk() printk()
which probably could have been handled something like this:
CPU0 CPU1
IRQ IRQ
printk() printk()
log_store()
log_store()
console_unlock()
call_console_drivers()
printk()
log_store()
goto again;
call_console_drivers()
printk()
log_store()
goto again;
call_console_drivers()
printk()
log_store()
console_unlock()
call_console_drivers()
printk()
log_store()
console_unlock()
call_console_drivers()
so CPU0 printed all the messages.
CPU1 simply did 3 * log_store()
// + spent some cycles on logbuf_lock spin_lock
// + console_sem trylock
but now every CPU will do call_console_drivers() + busy loop.
CPU0 CPU1
IRQ IRQ
printk() printk()
log_store()
log_store()
console_unlock()
set console_owner
sees console_owner
sets console_waiter
spin
call_console_drivers()
sees console_waiter
break
printk()
log_store()
console_unlock()
set console_owner
sees console_owner
sets console_waiter
spin
call_console_drivers()
sees console_waiter
break
printk()
log_store()
console_unlock()
set console_owner
sees console_owner
sets console_waiter
spin
call_console_drivers()
sees console_waiter
break
printk()
log_store()
console_unlock()
set console_owner
sees console_owner
sets console_waiter
spin
.... and so on
which not only brings the cost of call_console_drivers() from
CPU's own printk(), but it also brings the cost [busy spin] of
call_console_drivers() happening on _another_ CPU. am I wrong?
-ss
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
next prev parent reply other threads:[~2017-11-08 5:20 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-11-02 17:45 [PATCH v3] printk: Add console owner and waiter logic to load balance console writes Steven Rostedt
2017-11-02 22:16 ` Vlastimil Babka
2017-11-03 3:15 ` Steven Rostedt
2017-11-04 3:13 ` Sergey Senozhatsky
2017-11-03 4:09 ` John Hubbard
2017-11-03 11:21 ` Steven Rostedt
2017-11-03 11:54 ` Steven Rostedt
2017-11-03 11:54 ` Steven Rostedt
2017-11-03 21:46 ` John Hubbard
2017-11-04 3:34 ` John Hubbard
2017-11-04 8:32 ` [PATCH v3] printk: Add console owner and waiter logic to loadbalance " Tetsuo Handa
2017-11-04 8:43 ` Tetsuo Handa
2017-11-06 12:06 ` [PATCH v3] printk: Add console owner and waiter logic to load balance " Tetsuo Handa
2017-11-07 1:40 ` Sergey Senozhatsky
2017-11-07 11:05 ` [PATCH v3] printk: Add console owner and waiter logic to loadbalance " Tetsuo Handa
2017-11-08 5:19 ` Sergey Senozhatsky [this message]
2017-11-08 14:29 ` [PATCH v3] printk: Add console owner and waiter logic to load balance " Steven Rostedt
2017-11-09 0:56 ` Sergey Senozhatsky
2017-11-09 3:29 ` Steven Rostedt
2017-11-09 4:45 ` Sergey Senozhatsky
2017-11-09 5:06 ` Steven Rostedt
2017-11-09 5:33 ` Sergey Senozhatsky
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20171108051955.GA468@jagdpanzerIV \
--to=sergey.senozhatsky.work@gmail.com \
--cc=akpm@linux-foundation.org \
--cc=dave.hansen@intel.com \
--cc=hannes@cmpxchg.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mgorman@suse.de \
--cc=mhocko@kernel.org \
--cc=penguin-kernel@I-love.SAKURA.ne.jp \
--cc=pmladek@suse.com \
--cc=rostedt@goodmis.org \
--cc=sergey.senozhatsky@gmail.com \
--cc=tj@kernel.org \
--cc=vbabka@suse.cz \
--cc=xiyou.wangcong@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).